THE GENETIC SEQUENCE BINARY FACTOR GROUPING ROUTINES

The relation between the time of creation and the time of the life of the creation is nothing
but the borderline between the human truth about the creation and the nature of the creation ...





The genetic sequence binary factor grouping is a set of routines that processes V3 CEL scan data and produces catalogue(s) of genetic sequence data for genetic analysis , generating large catalogues of genetic sequence families of curves . (Above and below : partial display of groups of curves extracted from lung cancer datasets - lung normal .)

Above :
(1) permutations count - single permutation group ,
(2) 4 cell line scans (19 scans) , single permutation group
(3) scan file values merit factor curve , single permutation group permutations chart , single permutation chart


Genes were referenced by the merits of the scan measured values number series ( M.J.E. Golay (1902-1989) , from The Merit Factor Problem by Jedwab at http://www.math.sfu.ca/~jed/Research/merit.html ) .

Chart from CL2002042639AACEL_merits_x32x16x8x4x2x1_b16d_41_.xls column A contains number scale , sorted column B contains numbers merits as computed from CL2002042639AA.CEL .
Merit factor values were computed with b16d_41 from CL2002042639AA.CEL .

Binary factor grouping merits : (example below computed by principia routine binary_factor_number_merit_scale_routine) and permutation chart from CL2001032255AA.CEL

* Upper charts computed by binary_factor_number_merit_scale_routine


Binary factor grouping merits computed with binary_factor_number_merit_scale_routine (upper charts top to bottom) :
(1) text file example (computer log)
(2) wav file (voice recording)
(3) single probe scan .CEL file
(4) merit values chart
(5) scan CEL file mean column values chart


Merit factor curve (left) vs binary factor merit factor curve (right) - binary factor number merit scale routine (1) vs binary factor number merit scale routine (2)


Binary factor merit factor curve logn = 0 region , Y-axis absolute measured values




ADENO_1 dataset 1173 genes merited values chart(s) , 4 of 19 , as produced by binary_factor_number_merit_scale_routine___ and rightmost resulting chart and related Blast searches and
Mean square values chart from CL2001032002AA.CEL , as computed by b16d_43 , 1588 values ,


merit factor values from 6.1 - 6.6 , list of gi(s) : 34815 , 285942 , 457784 , 460085 , 663009 , 1377762 , 2662150 , 3036839 , 3043565 , 3046899 , 3165456 , 3327037 , 3413799 , 3766196 , 4371373 , 5262535 , 5743356 and Blast result Figure (.) and taxonomy


Multicolumn values from linear binary factor intervals produced by binary factor number scale routine


CEL file header :
DatHeader=[0..46124] CL2002042636AA:CLS=4733 RWS=4733 XIN=3 YIN=3 VE=17 2.0 04/26/02 14:38:50 HG-U133A.1sq 6
Algorithm=Percentile
AlgorithmParameters=Percentile:75;CellMargin:2;OutlierHigh:1.500;OutlierLow:1.004
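
The numeric fields of the DatHeader line above can be read with a simple key search. The following is a minimal sketch, not taken from the routines themselves: the helper name header_int_field is hypothetical, and the reading of CLS and RWS as column and row counts is an assumption about the header layout.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the integer that follows `key`
   (for example "CLS=") in `header`, or -1 when the key is absent. */
long header_int_field(const char *header, const char *key)
{
    const char *p = strstr(header, key);
    if (p == NULL)
        return -1;
    /* strtol stops at the first character that is not part of the number */
    return strtol(p + strlen(key), NULL, 10);
}
```

Called with the DatHeader text and the key "CLS=" it returns 4733 for the header shown above.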
Left chart selected interval -0.29 to +0.29 and right chart - mean square measured values having :
(1) 41.079720 , 42.067210 , 43.054707 , 44.042200 , 45.029694
(2) 58.064606 , 60.039590 , 61.027084 , 62.014576 , 63.002070
(3) 76.036980 , 77.024475 , 78.011970

Sequence (1) : gi(s) 464183 , 5362869 , 5856829 , 8922743 Figure (.) , Figure (..) and taxonomy

Number permutations grouping generated chart using num_c_perm_2_cel from CL2001032612AA.CEL (below)
(1) Resulting table (sorted) columns A and B and (yellow) frequency line
(2) Peak frequency group list and real (intensity) values chart
(3) Intensity values and linearity chart
(4) Comparison of charts from lung cancer


- Lung normal results (7 rows) and their intensity values from 17 probes
- Example procedure for a single CEL file (CL2001031609AA.CEL from lung cancer (small cels))
- A single CEL file (N01_normal.CEL from prostate normal)
- Lung adenocarcinoma ADENO_1

Following results computed with b16d_43
Gene annotations were queried by accession numbers by http://genecards.weizmann.ac.il/cgi-bin/geneannot/GA_search.pl
NCBI gi numbers were queried by gene annotations by http://www.ncbi.nlm.nih.gov/nuccore?term=SMAD4 using an automated PHP script url_get.txt .
Factor grouping permutation chart derived by a31_t.cpp and chart values linear cross-section used for number range(s) selection .

- Charts :
()Title : Lung adenocarcinoma ADENO_1 chart in adeno1_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_2 chart in adeno2_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_3 chart in adeno3_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_4 chart in adeno4_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_5 chart in adeno5_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_6 chart in adeno6_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_7 chart in adeno7_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_8 chart in adeno8_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_9 chart in adeno9_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_10 chart in adeno10_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_1-10 chart in adeno_1-10_20110516 (txt) .
()Title : Prostate tumor Prostate Tumor Sample CEL files T01-T30 and T31-T60 chart in pc_20110517 (txt) .
()Title : Haining Lab : Human Naive and Memory T cell Subsets chart 90 probes in tmap_1-90_20110329 (txt) .
()Title : Gene expression-based chemical genomics identifies rapamycin as a modulator of MCL-1 and glucocorticoid resistance in leukemia 29 probes results chart in armstrong_part_20110721_1 (txt) .
()Title : Gene Expression-Based Classification and Outcome Prediction of Central Nervous System Embryonal Tumors PT_SCANS_1 chart in pt1_20110305 (txt) .

Blast (www.ncbi.nih.gov) searches produce long genomic regions and numerous unmapped sequences .
Analysis of fasta coded genomic regions may become easier with searches based on single sequence permutations .
Determining typical sequences in a long genomic region results in numerous SNPs .
num_c_perm_2_fasta locates long permuting sequences in a fasta coded genomic region .
Two runs (eg: num_c_perm_2_fasta LOC100653050.txt c > a.txt and num_c_perm_2_fasta LOC100653050.txt d > b.txt) output permutation index arrays and fasta enumerated groups at the top of the output list .
The permutation index array appearing at the end of the output list should be inserted into a document calc utility to produce a four column line chart .
(Below) Line chart describing permutation groups : red and blue lines from sorted columns A and B , yellow line frequency count .


The permutation index array appearing at the beginning of the output list may be searched for composite index sequences .
Their line numbers reference the corresponding fasta sequences in the line enumerated fasta array .
Although they have the same index sequence numbers , they reference permuted fasta sequences .
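
The idea of sequences that agree up to a permutation of the letters can be illustrated with a small check. This is a sketch under the assumption that "permuted" means a one-to-one renaming of the alphabet letters, as in the isomorphic repeats of the Gusev et al. abstract quoted below; the function name is hypothetical and the code is not taken from num_c_perm_2_fasta.

```c
#include <string.h>

/* Return 1 when b can be obtained from a by a one-to-one renaming of
   letters (a permutation of the alphabet), 0 otherwise. */
int is_permuted_copy(const char *a, const char *b)
{
    unsigned char fwd[256] = {0};  /* letter of a -> letter of b */
    unsigned char rev[256] = {0};  /* letter of b -> letter of a */
    size_t n = strlen(a);

    if (strlen(b) != n)
        return 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char x = (unsigned char)a[i];
        unsigned char y = (unsigned char)b[i];
        if (fwd[x] == 0 && rev[y] == 0) {
            /* first time both letters are seen: record the pairing */
            fwd[x] = y;
            rev[y] = x;
        } else if (fwd[x] != y || rev[y] != x) {
            /* the pairing contradicts an earlier one */
            return 0;
        }
    }
    return 1;
}
```

For example AACG and TTGA are accepted (A to T, C to G, G to A), while AACG and TTGG are rejected because C and G would both map to G.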






17 CCTGGGGCTC
18 CTGGTCCCAGGT
19 GTTTCTAAGAAG
20 CCATCACTCTCA
21 GTGCAGCCGGGT

Further analysis would require Blast (www.ncbi.nih.gov) searches having these fasta sequences as input .
Above mRNA data was published by the Cancer Program Datasets title at www.broadinstitute.org .
Subsequent computations were motivated by the article (published by www.ncbi.nih.gov) :

References

On the complexity measures of genetic sequences.

Gusev VD, Nemytikova LA, Chuzhanova NA.
Institute of Mathematics, Siberian Branch of Russian Academy of Science, Novosibirsk, Russia. gusev@math.nsc.ru

MOTIVATION: It is well known that the regulatory regions of genomes are highly repetitive. They are rich in direct, symmetric and complemented repeats, and there is no doubt about the functional significance of these repeats. Among known measures of complexity, the Ziv-Lempel complexity measure reflects most adequately repeats occurring in the text. But this measure does not take into account isomorphic repeats. By isomorphic repeats we mean fragments that are identical (or symmetric)
modulo some permutation of the alphabet letters(.).
RESULTS: In this paper, two complexity measures of symbolic sequences are proposed that generalize the Ziv-Lempel complexity measure by taking into account any isomorphic repeats in the text (rather than just direct repeats as in Ziv-Lempel). The first of them, the complexity vector, is designed for small alphabets such as the alphabet of nucleotides. The second is based on a search for the longest isomorphic fragment in the history of sequence synthesis and can be used for alphabets of
arbitrary cardinality(..). These measures have been used for recognition of structural regularities in DNA sequences. Some interesting structures related to the regulatory region of the human growth hormone are reported.

(.) and (..) were marked here

Routine sources are included in genetic_sequence_binary_factor_grouping.zip ( at 20.12.2010 ) in the following execution order : genetic_sequence_binary_factor_grouping.txt

These routines use a permuted number index order based on a merit factor attributed number scale .
The merit factor attributed number scale is computed using scan numbers as 4x4bit binary factor indexes in a 16bit integer array .
The exponents of array sub group counts and corresponding factor sums in relation to actual number series factor bounds produce the merit factor number attributes .
All computed data and corresponding sequences depend on multiplier values .
Multiplier values determine and abstract ranges and contents of the permutation groups .
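
One possible reading of "4x4bit binary factor indexes" is that a 16-bit scan number is treated as four 4-bit fields. The sketch below shows only that split; it is an assumption about the data layout, the name split_nibbles is hypothetical, and the merit computation itself is not reproduced here.

```c
#include <stdint.h>

/* Split a 16-bit value into four 4-bit fields, most significant first. */
void split_nibbles(uint16_t value, unsigned out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned)(value >> (12 - 4 * i)) & 0xFu;
}
```

The value 21337 (binary 0101001101011001) splits into the fields 5, 3, 5 and 9.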


(*1) an implementation of compression .
(*2) a simple entropy measurement 1 and 2.
(*3) see text entropy sequence in enthropy_text_sequence.txt .
(*4) arbitrary lengths of text sequences in (1) , (2) , (3) and in (4) , (5) and permuting sums composed of 1 , 10 and 30 in one byte numbers (-4~+4) generator code
(*5) similar text sequence repeats matching as in (1) and in (2).
(*6) permutation groups of text sequences in (1) and (2)
(*7) 16-bit parity patterns in (1) and text sequence patterns in (2)
(*8) Text patterns , whole text count for letters and variances , from (1) and
text search by variance (.) templates (where '_' stands for any single letter) eg. T__G__CAAG_ and T__T__CAAG__ and G__T__CAAG__ and T__C__CAAG__ in (2)
and conjunction count of 8 in (3)
(*9) Uniform length text patterns , self-comparing , binary count , from (1)
(*10) Text sequencing , from (1) , (2) and (3)
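
The template search in (*8), where '_' stands for any single letter, can be sketched as a plain scan over the sequence. The function name find_template is hypothetical and this is an illustration of the matching rule only, not the author's routine.

```c
#include <string.h>

/* Return the index of the first position where `tmpl` matches `seq`,
   with '_' in the template matching any single letter; -1 if none. */
long find_template(const char *seq, const char *tmpl)
{
    size_t n = strlen(seq), m = strlen(tmpl);

    if (m == 0 || m > n)
        return -1;
    for (size_t i = 0; i + m <= n; i++) {
        size_t j = 0;
        /* advance while the template letter is a wildcard or equal */
        while (j < m && (tmpl[j] == '_' || tmpl[j] == seq[i + j]))
            j++;
        if (j == m)
            return (long)i;
    }
    return -1;
}
```

Searching TACAAAAAAATTAGCTGGGCATGGTGG with the template _GGCATGG_ finds a match, while the template T__T__CAAG__ finds none in that sequence.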

(eg) Sequence resulting from a string search having __GT__C____T and AAT_____CG__ and A____TT____G and __GT____TT__ from
gi|568802162|ref|NT_187285.1| Homo sapiens chromosome 19 genomic scaffold, GRCh38 Primary Assembly CEN19_5 and resulting Blast taxonomy
Ref : Key-string Algorithm - Novel approach to computational analysis of repetitive sequences in human centromeric DNA
Ref : Curr Genomics. Apr 2007; 8(2): 93-111.
Consensus Higher Order Repeats and Frequency of String Distributions in Human Genome


- Sequence(s) from (table) and resulting Blast taxonomy

- Sequences 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 .

- Sequence from gi|354792485|gb|JN956986.1| Mus musculus targeted KO-first, conditional ready, lacZ-tagged mutant allele Pfkfb2:tm1a(KOMP)Wtsi; transgenic
taxonomy
Ref : Nucleic Acids Res. 2013 January; 41(D1): D666-D675.
The Standard European Vector Architecture (SEVA): a coherent platform for the analysis and deployment of complex prokaryotic phenotypes

- Sequence from gb|AC007262.4|:95491-95650 Homo sapiens chromosome 14 clone containing gene for thyroid stimulating hormone receptor, partial CDS, complete sequence and Blast taxonomy

ATGAAAATTTATTCTTGGCTGGGTGCAGTGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCCAAGGCA
GGTGGATTACTTGAGGTCAGGAGTTTGAGACCAGCCTGGTCAACATGGTGAAACCCCGTCTCTACTGAAA
AAACAAAAATTAGCTGGGTG

Ref : BMC Genomics. 2009; 10: 269.
Definition, conservation and epigenetics of housekeeping and tissue-enriched genes


- Sequence derived from gi|21212887|emb|AL034410.9| Human DNA sequence from clone RP4-774G10 on chromosome Xp11.23-11.3 and Blast taxonomy

- Sequence derived from gi|21212887|emb|AL034410.9| Human DNA sequence from clone RP4-774G10 on chromosome Xp11.23-11.3

ATCGTCTGAAGCCTTCTTCTCTCAATTCGTCAAAGTCATTCTCCGTCCAGCTTTGTTCCGTTGCTGGTGA
ACAGGGACATTTAAGTCTGCAGAGGTTTCTGCTGCCTTTTGTTTGGCTGTGCCCTCCCTGCAGAGGTGGA
CTCAGACTGCTGTGCTAGCAATGAGCGAGGCTCCGTGGACATAGGACCCTCCGAGCCAGGCACGGGATAT
TTGAACCCGGTACCTCAGTTGGAAATGCAGAAATCAGCCATCTTCTGCATCGCTCATGCTGGGAGCTGTA
TCCATGGTAACAGCTTCCTCCTCTTTATCGACAGTACTTGTCTGTGACAATTTGTTTTCTATTTCCAAGG
ATTTTAGTATTTTGGACATAATAGTGAGGTTCACGCCCATGGCCAGGTTGCAGTCACAGCGGTAAGTGTC
AAAGCCCTCAGACGGCAGGGTGAACTGCAGCAAGGAGACGTGGACGAATCCATGCTCTGCAGGTTCACGC
CGCTCGAGCTGATGTCCCAGCAGGCCTCGTTGATGAGGTCCTTGAGTGCCTCCAACACCTTCTTCAGGAT
GGAGCCCTGGACCAGGGGCACCTCGAACATGGTGGCAGAGTGGCAACAACGCGGCTACAGGCGGGCGGAA
AGTAGGAAAGTCTAGCCAGTTTTGGCTTCACGAGCCTCAGAGCAAGCAGGCGAATGCCATAGGGAGAGGC
AAGACACTGTGGCCAGTCCTCAAGGATCTAGAACCAGAAATACCATTTGACCTGCCAGTCCGATTACGGG
AAAACATGGCTTATATTGCTACTTTTTGCTTGCGAAGTTTTAAACATTATCCATTGATTTCCCAAAACGG
ATTCTTAAATCATTTCATCCATGACTTCTGCGCTATTTTCTTTGTTCTCTGCATCTCTTTGTCTTTCCTT
Figure (.) , taxonomy , Figure (..) , Figure (...) , taxonomy and taxonomy
Ref : 2013 Sep;12(9):691-8. doi: 10.1016/j.dnarep.2013.05.001. Epub 2013 May 31.
Role of PCNA and TLS polymerases in D-loop extension during homologous recombination in humans
Ref : 2012 Aug;11(10):771-5. doi: 10.1016/j.autrev.2012.02.012. Epub 2012 Feb 22.
The clinical significance of autoantibodies to the proliferating cell nuclear antigen (PCNA)


- Sequence derived from gi|224515582:1-2300536 Homo sapiens chromosome 6 genomic contig, GRCh37.p13 alternate locus group ALT_REF_LOCI_7
taxonomy , Figure (.) and taxonomy

- Sequence derived from gi|224515582:1-2300536 Homo sapiens chromosome 6 genomic contig, GRCh37.p13 alternate locus group ALT_REF_LOCI_7 taxonomy , Figure (.)

- Sequence derived from gi|37574721|ref|NT_077812.2| Homo sapiens chromosome 19 genomic contig, GRCh37.p13 Primary Assembly taxonomy , Figure (.)
Ref : Proc Natl Acad Sci U S A. 2013 Jun 11;110(24):9868-72. doi: 10.1073/pnas.1307864110. Epub 2013 May 22.
General mechanism for modulating immunoglobulin effector function

Ref : Proc Natl Acad Sci U S A. 1992 September 1; 89(17): 8356-8360
Sequence and expression of a membrane-associated C-type lectin that exhibits CD4-independent binding of human immunodeficiency virus envelope glycoprotein gp120


- Sequence derived from gi|37574721|ref|NT_077812.2| Homo sapiens chromosome 19 genomic contig, GRCh37.p13 Primary Assembly taxonomy , Figure (.) and protein taxonomy

- Sequence (string template _ATTT_GG_) and resulting Blast taxonomy , Figure (.) .

- Sequence : TACAAAAAAATTAGCTGGGCATGGTGG and resulting Blast taxonomy
- Sequence (string template _GGCATGG_) having TACAAAAAAATTAGCTGGGCATGGTGG and resulting Blast taxonomy

- Sequence (string templates _ATTCTCCTGCCTCAG____CC__GT) and resulting Blast taxonomy

- Sequence : TTTTATCTACTTTTGGTCTTTGATGATGGTGATGTACAGATGGGTTTTTGGTGTGGATGTCCTTTCTGTT and resulting Blast taxonomy

- Sequence(s) from (table) and resulting Blast taxonomy , Figure (.)
Ref : PLoS Genet. Jan 2014; 10(1): e1004139.
Large Inverted Duplications in the Human Genome Form via a Fold-Back Mechanism


- Sequence derived from gi|528475381|ref|NW_004929326.1| Homo sapiens chromosome 6 genomic scaffold, alternate assembly CHM1_1.1, whole genome shotgun sequence and Blast taxonomy


- Sequence derived from gi|528475381|ref|NW_004929326.1| Homo sapiens chromosome 6 genomic scaffold, alternate assembly CHM1_1.1, whole genome shotgun sequence , (having _GAGGTGGGTGGATCAC_ ) and Blast taxonomy
Ref : Genes & Dev. 2009. 23: 1714-1736
The pathophysiology of mitochondrial disease as modeled in the mouse





To Borce Dzinleski




These routines were written by Dzinleski Jasenko jasenko@unet.com.mk , who is the author of C/C++ based routines for encryption/decryption, large numbers operations, the 123SQL database engine and the simplified mariaBasic interpreter , which are ongoing projects . This project is self-financed and any contributions are welcome .
This site is the result of years of support from Borce and Dusica Dzinleski and is devoted to them and especially to my daughter Maria Dzinleska and the faith from Nada Popstefanova . The author is currently seeking a developer's job and this is his cv.

References(1)
GINN AND COMPANY
A Xerox Company
Waltham Massachusetts - Toronto - London

TOPICS IN ALGEBRA

I.N.HERSTEIN


Lemma 2.22 Every permutation is the product of its cycles .
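
The lemma can be illustrated by listing the cycle lengths of a permutation. This is a minimal sketch; the function name cycle_lengths and the 0-based array representation are choices made here for illustration.

```c
#include <stdlib.h>

/* perm[0..n-1] is a permutation of 0..n-1. Store the length of each
   cycle in lengths[] (room for n entries) and return the number of
   cycles, or -1 when memory cannot be allocated. */
int cycle_lengths(const int *perm, int n, int *lengths)
{
    int count = 0;
    char *seen;

    if (n <= 0)
        return 0;
    seen = calloc((size_t)n, 1);
    if (seen == NULL)
        return -1;
    for (int start = 0; start < n; start++) {
        int len = 0;
        if (seen[start])
            continue;
        /* follow start -> perm[start] -> ... until the cycle closes */
        for (int i = start; !seen[i]; i = perm[i]) {
            seen[i] = 1;
            len++;
        }
        lengths[count++] = len;
    }
    free(seen);
    return count;
}
```

The permutation 0->1, 1->2, 2->0, 3->4, 4->3 is the product of one 3-cycle and one 2-cycle, so the function reports two cycles with lengths 3 and 2.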


References(2)
THE MERIT FACTOR PROBLEM

PETER BORWEIN, RON FERGUSON, AND JOSHUA KNAUER

Abstract

The merit factor problem is of considerable practical interest to communications engineers and theoretical interest to number theorists. For binary sequences, although it is generally believed that the merit factor is bounded, it still has not been completely established that the number of even length Barker sequences, each with merit factor N, is bounded. In this paper, we present an overview of the problem and results of quite extensive searches we have conducted in lengths up to slightly beyond 200.

The merit factor, F, of the sequence relates energy in the sidelobes to energy in the main lobe, F = N^2 / (2E)
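
As an illustration of the formula, the merit factor of a +1/-1 sequence can be computed directly, taking E as the sum of the squared aperiodic autocorrelations at shifts 1 to N-1. This is a sketch of the textbook definition for binary sequences; it is not the merit computation applied to CEL values by the routines above, and the function name is hypothetical.

```c
#include <stddef.h>

/* Merit factor F = N^2 / (2E) of a +1/-1 sequence s[0..n-1], where E is
   the sum of squared aperiodic autocorrelations at shifts 1..n-1.
   Returns 0.0 when E is zero (sequences shorter than 2). */
double merit_factor(const int *s, size_t n)
{
    double energy = 0.0;

    for (size_t k = 1; k < n; k++) {
        long c = 0;  /* autocorrelation at shift k */
        for (size_t i = 0; i + k < n; i++)
            c += (long)s[i] * s[i + k];
        energy += (double)c * (double)c;
    }
    if (energy == 0.0)
        return 0.0;
    return (double)n * (double)n / (2.0 * energy);
}
```

For the length 13 Barker sequence + + + + + - - + + - + - + the sidelobe energy is E = 6, which gives F = 169/12.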

References(3)
Discrete Applied Mathematics 155 (2007) 831-839
Binary templates for comma-free DNA codes from Oliver D. King , Philippe Gaborit :

Abstract
Arita and Kobayashi proposed a method for constructing comma-free DNA codes using binary templates, and showed that the
separation d of any such binary template of length n satisfies d<=n/2. Kobayashi, Kondo and Arita later produced an infinite family
of binary templates with d>=11n/30. Here we demonstrate the existence of an infinite family of binary templates with
d>=n/2 - (18n log_e n)^(1/2). We also give an explicit construction for an infinite family of binary templates with d>=n/2 - 19n^(1/2) log_e n.

2006 Elsevier B.V. All rights reserved.

References(4)
In DNA Sequence Design Using Template Omsha,Ltd.2002 , By Masanori Arita and Satoshi Kobayashi , published by SpringerLink (15 February 2002) , and - This paper has been selected to receive the New Generation Computing Award for Distinguished Papers - the authors state :

Abstract
Sequence design is a crucial problem in information-based biotechnology such as DNA-based computation . We introduce a simple strategy named template method that systematically generates a set of sequences of length l such that any of its members will have approximately 1/3 mismatches with other sequences , their complements , and the overlaps of their concatenations .

References(5)
Proc. of The Fifth Int. Workshop on Frontiers in Evolutionary Algorithms (FEA 2003) under JCIS 2003 Cary, NC, USA, September 26-30, 2003
Reliable Cost Predictions for Finding Optimal Solutions to
LABS Problem: Evolutionary and Alternative Algorithms

Franc Brglez1, Xiao Yu Li1, Matthias F. Stallmann1 and Burkhard Militzer2
1Computer Science Department, NC State University, Raleigh, NC 27695, USA
2Lawrence Livermore National Laboratory, Livermore, CA 94550, USA


Abstract

The low-autocorrelation binary sequence (LABS) problem represents a major challenge to all search algorithms, with the evolutionary algorithms claiming the best results so far. However, the termination criteria for these types of stochastic algorithms are not well-defined and no claims have been made about optimality. Our approach to find the optima of the LABS problem is based on (1) experiments with problem sizes for which optimal solutions are known, (2) an asymptotic analysis of statistics generated by such experiments, (3) reliable predictions of the cost required to find optimal solutions for larger problem sizes. The proposed methodology provides a well-defined termination criterion for evolutionary and alternative search algorithms alike.



IMPLEMENTATION

  • Bit Parity Compression
  • This is a tiny portable compression based on data 24-bit word bit parity distances .

    Bit parity complements code where n=1-65536 , following bit-parity table was produced by 16-bit permutations of 12 48 60 192 204 240 252 768 780 816 828 960 972 1008 1020 and 3 .



    16-bit parity table

    21337 0101001101011001
    21341 0101001101011101
    23369 0101101101001001
    28952 0111000100011000
    29021 0111000101011101
    29465 0111001100011001
    29517 0111001101001101
    29960 0111010100001000
    30045 0111010101011101
    30477 0111011100001101
    31517 0111101100011101
    31561 0111101101001001
    32012 0111110100001100
    32029 0111110100011101
    32092 0111110101011100


    2-byte coding based on this table in 16_table_coding_min_max__ and practical WAV scrambling in wavscr_16_16b_min_max_.
    The 240 entries 16-bit parities table and any file 2-byte data vs 240 16-bit parity table threshold bit entries generator code .


    Illustration of and a generator code for data coding using bit parity table.
    Below 16bit , 8KHz WAV file transformation into a white noise WAV using 240 bit parity table , upper chart transformed values , lower chart transformed values merits ranging from -1.95496 to 6.35886 .


    Merits chart from simple min-max in 8KHz WAV file transformation into a white noise WAV using 240 bit parity table merits ranging from -1.66785 to 10.39834 .


    The 128 entries 16-bit parities table and
    the 131 entries 24-bit parities table
    followed by a binary distances permutation chart , 2 and 4 bit complementary binary distances : 0 3 48 51 .


    Below 1152 24bit table coding (partial) chart .



    24-bit parity table

    16676387 0 111111100111011000100011
    16676386 1 111111100111011000100010
    16675364 2 111111100111001000100100
    16675360 3 111111100111001000100000
    16413730 4 111110100111010000100010
    16412705 5 111110100111000000100001
    16412704 6 111110100111000000100000
    16397315 7 111110100011010000000011
    16397314 8 111110100011010000000010
    16396289 9 111110100011000000000001
    16396288 10 111110100011000000000000
    16156144 11 111101101000010111110000
    15625775 12 111011100110111000101111
    15625770 13 111011100110111000101010
    15624749 14 111011100110101000101101
    15624744 15 111011100110101000101000
    15362088 16 111010100110100000101000
    15346699 17 111010100010110000001011
    12473890 18 101111100101011000100010
    12472865 19 101111100101001000100001
    12472864 20 101111100101001000100000
    12456453 21 101111100001001000000101
    12456452 22 101111100001001000000100
    12210213 23 101110100101000000100101
    12194822 24 101110100001010000000110
    12194818 25 101110100001010000000010
    12193793 26 101110100001000000000001
    12193792 27 101110100001000000000000
    11806144 28 101101000010010111000000
    11422248 29 101011100100101000101000
    11406922 30 101011100000111001001010
    11405832 31 101011100000101000001000
    11160618 32 101010100100110000101010
    11143180 33 101010100000100000001100
    11143177 34 101010100000100000001001

    Below 24bit , 8KHz WAV file transformation into a white noise WAV using 24-bit bit parity table , upper chart transformed values , lower chart transformed values merits ranging from -2.65405 to 4.57071 .




    07.10.2012 VRM 1.0.0 Download File bit_parity_compression_73_.zip
    14.10.2012 VRM 1.1.0 Download File bit_parity_compression_73__.zip
    18.02.2013 VRM 1.1.1 Download File bit_parity_compression_73_111.zip
    05.05.2013 VRM 1.1.1 Download File bit_parity_compression_73_111_.zip

    18.02.2013 VRM 1.7.5 Download files packaging utility mls_175.zip
    05.05.2013 VRM 1.7.7 Download files packaging utility mls_177.zip



  • RIFF(WAV) COMPRESSION (principia example)
  • This is a binary compression implementation on PDA recording device output file (eg 16 bit, 8000 Hz, 128 kbps Wav). This functional example performs lossless wav compression on a PDA recording file up to 350 sec.

  • BYTE ORDER COMPRESSION
  • The mdiff and boc routines use byte order compression by indexing (re)occurring 2-byte order pairs . The routine processes each data byte and the next data byte through this C/C++ code ((((b8e&<data byte>)>>1)^((b8e&<next data byte>)>>1))<<1)|((b8o&<data byte>)^(b8o&<next data byte>)) where b8e is an 8-bit even bit mask and b8o is an 8-bit odd bit mask . Thus a 2-byte order (sequence) is truncated into a single byte field . Two resultant byte fields , one from byte 1 and byte 2 and one from byte 2 and byte 3 , form a single 16-bit dictionary entry . The compressed data is thus the index number order of the (re)occurring entries . Index entries are bit truncated when written to the compressed data output file . The compression dictionary produced while compressing a typical ASCII data file , eg source code or HTML code , is relatively small and average compression gains in such files are good .
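
The byte pair expression can be written as a small function. The text does not give the mask values, so the sketch below assumes b8e = 0xAA and b8o = 0x55; the names pair_code and dict_entry are hypothetical, and the way two fields are joined into a 16-bit entry is one reading of the description.

```c
#include <stdint.h>

/* Assumed mask values: the text only calls them even and odd bit masks. */
#define B8E 0xAAu  /* bits 1, 3, 5, 7 */
#define B8O 0x55u  /* bits 0, 2, 4, 6 */

/* The expression from the text applied to a byte and its successor. */
uint8_t pair_code(uint8_t a, uint8_t b)
{
    return (uint8_t)(((((B8E & a) >> 1) ^ ((B8E & b) >> 1)) << 1)
                     | ((B8O & a) ^ (B8O & b)));
}

/* One reading of the 16-bit dictionary entry: the field from
   (byte 1, byte 2) in the high byte, the field from (byte 2, byte 3)
   in the low byte. */
uint16_t dict_entry(const uint8_t p[3])
{
    return (uint16_t)((pair_code(p[0], p[1]) << 8) | pair_code(p[1], p[2]));
}
```

With these two assumed masks, which together cover all eight bits, the expression yields the same value as a plain XOR of the two bytes; other mask choices would behave differently.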

    12.09.2008 VRM 1.1.0 Download File e1173.zip

  • 6-bit BINARY COMPRESSION
  • This is a fast compression routine based on a 4-byte data word translation via a 6-bit bit parity table . The 6-bit table entries were computed by omitting the last two bits (having decimal values 1 and 2) of a single byte (0 to 255) , giving 64 index entry combination(s) per data byte . The truncated bits from a 4-byte data word together have only 256 combination(s) . The four 6-bit (truncated) dictionary entries are (re)indexed and written to a compressed data output file in a truncated index number order having (binary) length(s) from 7 to 18 bits compared to the original data 32 bit (word) length . Useful for compressing large document data files (binary and text data documents) . Compression gain in ASCII data files is evident due to the short binary length of the index number entries .
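
The split of a 4-byte word into 6-bit entries can be sketched as follows. This shows only the bit truncation step, assuming the omitted bits are the two lowest bits of each byte; the dictionary indexing and output coding are not reproduced, and the names split6 and join6 are hypothetical.

```c
#include <stdint.h>

/* Split a 32-bit word into four 6-bit indexes (the upper six bits of
   each byte) and one 8-bit value holding the four dropped 2-bit tails. */
void split6(uint32_t word, uint8_t idx[4], uint8_t *tails)
{
    *tails = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t byte = (uint8_t)(word >> (24 - 8 * i));
        idx[i] = (uint8_t)(byte >> 2);                    /* 0..63 */
        *tails = (uint8_t)((*tails << 2) | (byte & 3u));  /* 256 combinations */
    }
}

/* Inverse of split6: rebuild the 32-bit word. */
uint32_t join6(const uint8_t idx[4], uint8_t tails)
{
    uint32_t word = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t tail = (uint32_t)(tails >> (6 - 2 * i)) & 3u;
        word = (word << 8) | ((uint32_t)idx[i] << 2) | tail;
    }
    return word;
}
```

The four tails occupy 8 bits in total, which matches the 256 combinations mentioned in the description.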


    30.03.2010 VRM 3.5.1 Download File mar6_351_1.zip
    08.10.2012 VRM 3.6.0 Download File mar6_360_1.zip

  • BINARY COMPRESSION 79
  • This is a binary compression based on 2-byte data word right (low) bit truncation dependent on the max of dictionary (re)occurrences for each data buffer . Thus such (truncated) dictionary entries represent most of the input data buffer . Dictionary entries are written to the compressed data file in a left truncating manner with the leftmost significant bit omitted , having 7 to 16 bits in length . This method performs fast and efficient data storage . This routine demonstrates the principle used in the methods listed below .

    14.08.2010 VRM 3.1.3 Download File mar79_313_1.zip
    18.08.2010 VRM 4.0.1 Download File mar79_401_1.zip
    20.08.2010 VRM 4.1.1 Download File mar79_411_1.zip
    06.02.2011 VRM 5.0.1 Download File mar79_501_2.zip
    16.02.2011 VRM 5.1.0 Download File mar79_510_2.zip
    12.06.2011 VRM 5.3.0 Download File mar79_530_1.zip
    29.06.2011 VRM 5.3.1 Download File mar79_531_1.zip
    15.06.2011 VRM 5.4.0 Download File mar79_540_1.zip
    25.06.2011 VRM 5.4.1 Download File mar79_541_1.zip
    20.12.2012 VRM 5.4.3 Download File mar79_543_1.zip
    20.12.2012 VRM 5.3.3 Download File mar79_533_1.zip

    mls_17 is a file packager useful in conjunction with mar79_510_1
    To create a filesystems subtree archive : mls_17 -c (source) drive:\path...\
    To restore an mls_A11.mar archive : mls_17 -r (target) drive:\path...\

    03.06.2012 VRM 1.5.0 Download File mls_15.zip
    31.07.2012 VRM 1.7.0 Download File mls_17.zip
    20.12.2012 VRM 1.7.3 Download files packaging utility mls_173.zip
    18.02.2013 VRM 1.7.5 Download files packaging utility mls_175.zip

  • BINARY TEXT COMPRESSION
  • This is a fast and efficient compression example that executes fast input data indexing and dictionary occurrence search based on binary 4x4-bit long data samples . Indexed sequences are checked vs variable data length buffer .
    Thus this compression method gains speed concerning strict 4x4(16) - bit long dictionary patterns . This routine is subject of further development .

    04.09.2007 VRM 1.3.3 Download File mar9.zip

  • BINARY COMPRESSION ROUTINE
  • Binary compression methods are widely used in communications, data storage and numeric analysis. Exploring the complexity of genetic numeric sequences employs such methods. Some of them are presented on this site together with command-line Win32 implementation(s) that demonstrate compression of large ASCII data files and binary files and that are also used , slightly modified , in numeric data sequence analysis. This binary compression method is based on a numeric sequence generated by the following binary formula in C/C++ syntax: #define op_7(x,y)(((x+y)^y)|(((x&y)!=0)?(x&y)/y:0)) . This numeric sequence represents all numbers from 0-255 (8-bit) for 0-127 (7-bit) arguments in an x-y matrix manner . When always x=y and x:0-127 it results in all 8-bit odd numbers . When applied on a 2-byte data sequence it results in an index 14 or fewer bits long . Combined with one 1-bit subtract indicator it allows compression . Using these arguments as dictionary entries coded by hi/lo/length indicators , whose recurring indexes are stored instead of the input data , gives an average 30% compression in large ASCII text files . This numeric sequence formula was generated by another routine written for the purpose of exploring numeric sequence generation . This is a Win32 command-line compression tool based on binary compression . This example demonstrates the speed and efficiency of this static compression method for large ASCII files .
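
The op_7 formula can be explored with a function form of the macro. The sketch below only enumerates the diagonal x = y; the helper diagonal_distinct is hypothetical and the dictionary coding built on the formula is not shown.

```c
/* Function form of the op_7 macro from the text. The division is only
   evaluated when (x & y) != 0, which implies y != 0. */
unsigned op_7(unsigned x, unsigned y)
{
    return ((x + y) ^ y) | (((x & y) != 0) ? (x & y) / y : 0u);
}

/* Count the distinct values of op_7(x, x) for x = 1..127 and report
   through *all_odd whether every one of them is odd. */
int diagonal_distinct(int *all_odd)
{
    unsigned char seen[256] = {0};
    int distinct = 0;

    *all_odd = 1;
    for (unsigned x = 1; x <= 127; x++) {
        unsigned v = op_7(x, x);  /* at most 255 for x <= 127 */
        if ((v & 1u) == 0)
            *all_odd = 0;
        if (!seen[v]) {
            seen[v] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

On the diagonal the first values are op_7(1,1) = 3, op_7(2,2) = 7 and op_7(3,3) = 5. The argument x = 0 is a special case, since op_7(0,0) evaluates to 0.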

    04.09.2007 VRM 1.3.3 Download File mar.zip

  • BINARY COMPRESSION 77
  • This is a binary compression based on 2-byte long data binary shifting concatenation (bit density increase) into dictionary entries that are left truncated (common in ASCII data text files) . Compression gain depends on data redundancy in an inverse meaning . The larger the entropy , the larger the compression gain .

    04.06.2008 VRM 1.1.0 Download File mar77.zip

  • BINARY FACTOR GROUPING COMPRESSION ROUTINE
  • This compression example uses binary pattern indexing by 2-byte sequence bit truncation from 16-12 bits in order to gain max of dictionary occurrence . This compression method is a compression gain vs unoptimized compression speed compromise .
    This example states the correctness of the genetic text complexity display routine since its dictionary covers most of the numeric sequences occurrence . Yet this compression example is subject of further development .

    21.09.2007 VRM 1.4.0 Download File mar73.zip

  • ASCII TEXT FILE FAST SORT/INDEXING Routine
  • This is a fast sorting/indexing example that builds a file position sorting tree as a result of n-depth text file line byte sorting . The sorted sequence tree may expand to further depth levels ; this routine uses default depth 6. It exhibits fast sorting of a text file up to the size of 100K lines/rows .
    E.g.: C:\msort3 -f "War and Peace NT.txt"
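
The effect of sorting lines by their first bytes up to a fixed depth can be sketched with the standard qsort function. This is a simplified stand-in and not the file position sorting tree used by msort3: it sorts an in-memory array of line pointers and compares only the first 6 bytes. The names SORT_DEPTH and sort_lines_by_prefix are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define SORT_DEPTH 6  /* default depth named in the text */

/* Compare two lines by at most SORT_DEPTH leading bytes. */
static int cmp_prefix(const void *pa, const void *pb)
{
    const char *a = *(const char *const *)pa;
    const char *b = *(const char *const *)pb;
    return strncmp(a, b, SORT_DEPTH);
}

/* Sort an array of NUL-terminated lines by their leading bytes. */
void sort_lines_by_prefix(const char **lines, size_t count)
{
    qsort(lines, count, sizeof lines[0], cmp_prefix);
}
```

Lines that share the same first 6 bytes compare as equal here, and qsort does not promise to keep their original order; a deeper comparison would be needed to separate them.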

    30.10.2007 VRM 1.3.1 Download File msort3.zip

    21.01.2014 VRM 1.4.0 Download File msort3_140.zip

    07.01.2014 VRM 1.3.1 Download File msort4_131.zip

  • MCALC Simple Calc routine
  • This is a simple CALC screen routine .
    The -d2 or -d4 or -d6 command line switch stands for number of decimal places . Keyboard input exit char is q and reset char is c .

    09.10.2009 VRM 1.0.1 Download File mcalc.zip

  • MDUMP ASCII Text File Sequence Redundance Dump
  • This is an ASCII text file dump method that finds text data sequences inside an ASCII text file . Useful for creating text file data sentence definition(s) .

    18.09.2009 VRM 2.0.1 Download File mdump3.zip

  • MDADD STRING DATE ADD
  • This is a string date add routine that adds an increment in days, months and (or) years to a start date . The command line requires the start date string and the increment string together with a date formatting string , eg mdadd 20081008 00010000 YYYYMMDD (for adding 1 year) .
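
A date addition of this kind can be sketched with the standard mktime function, which normalizes out-of-range month and day values. The function name date_add and the numeric YYYYMMDD interface are assumptions made for illustration; mdadd itself may handle month ends differently.

```c
#include <time.h>

/* Add years, months and days to a YYYYMMDD date and return the result
   as YYYYMMDD, or -1 on failure. Overflowing months and days are
   normalized by mktime(). */
long date_add(long yyyymmdd, int years, int months, int days)
{
    struct tm t = {0};

    t.tm_year = (int)(yyyymmdd / 10000) - 1900 + years;
    t.tm_mon = (int)(yyyymmdd / 100 % 100) - 1 + months;
    t.tm_mday = (int)(yyyymmdd % 100) + days;
    t.tm_hour = 12;   /* midday, so a DST shift cannot change the date */
    t.tm_isdst = -1;  /* let the library decide about DST */
    if (mktime(&t) == (time_t)-1)
        return -1;
    return (t.tm_year + 1900L) * 10000 + (t.tm_mon + 1) * 100 + t.tm_mday;
}
```

With the increment string 00010000 read as 1 year, 0 months and 0 days, the start date 20081008 becomes 20091008.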

    16.11.2008 VRM 1.1.1 Download File mdadd.zip

  • MDDIFF STRING DATE DIFFERENCE
  • This is a string date difference routine that computes the difference between two date strings in days, months and years . The command line requires two date string inputs together with a date formatting string , eg mddiff 19591117 20081008 YYYYMMDD .
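
The day count between two dates can be sketched by converting each date to a serial day number in the proleptic Gregorian calendar. Only the difference in days is shown; the split into months and years is left out, and the function names are hypothetical rather than taken from mddiff.

```c
/* Days since 1970-01-01 for a Gregorian calendar date. */
static long days_from_civil(long y, int m, int d)
{
    long era, yoe, doy, doe;

    y -= (m <= 2);  /* count the year from March so leap day comes last */
    era = (y >= 0 ? y : y - 399) / 400;
    yoe = y - era * 400;                                   /* 0..399 */
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;  /* 0..365 */
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Difference in days between two YYYYMMDD dates (to minus from). */
long date_diff_days(long from, long to)
{
    long a = days_from_civil(from / 10000, (int)(from / 100 % 100), (int)(from % 100));
    long b = days_from_civil(to / 10000, (int)(to / 100 % 100), (int)(to % 100));
    return b - a;
}
```

Because both dates go through the same serial day conversion, leap years are handled without any special cases in the subtraction.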

    08.10.2008 VRM 1.1.0 Download File mddiff.zip

  • MDIFF FILE COMPARE
  • This is a file compare routine based on a bit parity comparison of 2-byte sequences. Hashing is built on repeated use of this C/C++ expression: ((((b8e&<data byte>)>>1)^((b8e&<next data byte>)>>1))<<1)|((b8o&<data byte>)^(b8o&<next data byte>)) where b8e is an 8-bit even bit mask and b8o is an 8-bit odd bit mask; the two files are then compared by their byte sequence group counts (file 1 vs file 2). The command line requires the file 1 name and the file 2 name and produces a fast comparison result message. The -d (detail) switch displays all differing lines. Useful for file comparison and change tracking in document and source code files.
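
    The expression above can be tried out with the sketch below. The mask values are assumptions (the text does not give them): b8e is taken as 0xAA (bits 1, 3, 5, 7) and b8o as 0x55 (bits 0, 2, 4, 6). The helper names and the histogram comparison are hypothetical illustrations of a "group count compare", not the mdiff implementation.

```c
#include <assert.h>
#include <stddef.h>

#define B8E 0xAAu   /* assumed "even" mask: bits 1, 3, 5, 7 */
#define B8O 0x55u   /* assumed "odd" mask: bits 0, 2, 4, 6 */

/* The 2-byte expression from the text, applied to a byte and its successor. */
unsigned pair_hash(unsigned char a, unsigned char b)
{
    return ((((B8E & a) >> 1) ^ ((B8E & b) >> 1)) << 1)
         | ((B8O & a) ^ (B8O & b));
}

/* Count how often each pair-hash value occurs over adjacent bytes of buf. */
void pair_hash_counts(const unsigned char *buf, size_t len,
                      unsigned long counts[256])
{
    size_t i;

    assert(buf != NULL && counts != NULL);
    for (i = 0; i < 256; ++i) counts[i] = 0;
    for (i = 0; i + 1 < len; ++i) ++counts[pair_hash(buf[i], buf[i + 1])];
}

/* Return 1 if two count tables differ in any group, 0 if they are equal. */
int counts_differ(const unsigned long a[256], const unsigned long b[256])
{
    size_t i;

    for (i = 0; i < 256; ++i)
        if (a[i] != b[i]) return 1;
    return 0;
}
```

    Under the assumed complementary masks the expression evaluates to the same value as a XOR of the two bytes, e.g. pair_hash(0x41, 0x42) is 3; with other mask values the result would differ.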

    26.05.2010 VRM 3.0.1 Download File mdiff_301_1.zip

  • BIT PARITY BYTE ORDER FILE CHECKSUM
  • This is a file fingerprint routine that outputs a file of checksum number(s). Hashing is built on repeated use of this C/C++ expression: ((((b8e&<data byte>)>>1)^((b8e&<next data byte>)>>1))<<1)|((b8o&<data byte>)^(b8o&<next data byte>)) where b8e is an 8-bit even bit mask and b8o is an 8-bit odd bit mask. The sequenced number threshold count was computed by comparing the original byte (bit parity) result with the generated result. Number points are computed inside a 1024-byte file buffer and then stored (by XOR) in a 512-number sequence. The numbers in the output fingerprint file reflect data order and integrity, e.g. when compared with the checksum(s) of a copy of the same file (from a restore or a data transfer). The command line requires the input filename and the checksum output filename. Useful for building checksum list(s) of important data archive(s).

    25.03.2009 VRM 1.3.0 Download File bp_boc3.zip

    Bit parity tools at Brothersoft
    http://www.brothersoft.com/bit-parity-tools-169821.html

    09.01.2013 VRM 5.2.0
    Download msearch , mgrep , mdiff , boc , mar79 and mls in bp_tools_520_1.zip
    10.05.2013 VRM 5.3.1
    Download msearch , mgrep , mdiff , boc , mar79 and mls in bp_tools_531_1.zip


    THE RANDOM KEYS DISTRIBUTION ENCRYPTION ROUTINE

    This is a strong encryption method based on a random number distribution driven by 4 number keys. The 4 (5-digit) number keys provide strong encryption protection because message hashing is driven by random number generation in which the entered keys are used as random seeds. Each user-chosen key is randomized, and the message hash for each is produced with a different randomizing method. Execution requires the following command line switches:

    E.g. to encrypt: r71 -a 11111 -b 22222 -c 33333 -d 44444 -e <filename to encrypt>
    E.g. to decrypt: r71 -a 11111 -b 22222 -c 33333 -d 44444 -f <filename to decrypt>

    where the numbers following the -a, -b, -c and -d switches are the user-chosen 5-digit encryption number keys.

    1.1(min)...
    minmv=999;
    for(l=0;l<rsi;++l)
    {

    if(n==0||l==0){n=rs[l][1];continue;} /* no open run (or first row): start a run here */
    if(n==rs[l][1]||n+1==rs[l][1]||n-1==rs[l][1]){n=rs[l][1];}else{


    if(minmv>rs[l][2]){minmv=rs[l][2];minl=l;}
    n=0;

    }

    }
    if(df){printf(" %d",rs[minl][0]%outm);}
    htable[hti_dmin][0]=rs[minl][0]%outm;++hti_dmin;
    ...

    1.2(max)...
    maxmv=0;
    for(l=0;l<rsi;++l)
    {

    if(n==0||l==0){n=rs[l][1];continue;} /* no open run (or first row): start a run here */
    if(n==rs[l][1]||n+1==rs[l][1]||n-1==rs[l][1]){n=rs[l][1];}else{


    if(maxmv<rs[l][2]){maxmv=rs[l][2];maxl=l;}
    n=0;

    }

    }
    if(df){printf(" %d",rs[maxl][0]%outm);}
    htable[hti_dmax][1]=rs[maxl][0]%outm;++hti_dmax;
    ...

    2.1(min)...
    minmv=999;
    for(l=0;l<rsi;++l)
    {

    if(n==0||l==0){n=rs[l][1];continue;} /* no open run (or first row): start a run here */
    if(n==rs[l][1]||n+1==rs[l][1]||n-1==rs[l][1])
    {


    n=rs[l][1];
    if(minmv>rs[l][2]){minmv=rs[l][2];minl=l;}

    }else{n=0;}

    }
    if(df){printf(" %d",rs[minl][0]%outm);}
    htable[hti_rmin][3]=rs[minl][0]%outm;++hti_rmin;
    ...

    2.2(max)...
    maxmv=0;
    for(l=0;l<rsi;++l)
    {

    if(n==0||l==0){n=rs[l][1];continue;} /* no open run (or first row): start a run here */
    if(n==rs[l][1]||n+1==rs[l][1]||n-1==rs[l][1])
    {


    n=rs[l][1];
    if(maxmv<rs[l][2]){maxmv=rs[l][2];maxl=l;}

    }else{n=0;}

    }
    if(df){printf(" %d",rs[maxl][0]%outm);}
    htable[hti_rmax][3]=rs[maxl][0]%outm;++hti_rmax;
    ...

    (1) For each entered key number, the resulting distribution series ((3-133)*(3-7)) selected by these criteria are written to a 4-column table.
    (2) Each table is hashed according to the binary criteria listed below.
    (3) The 4 resulting tables are then re-hashed using the same binary criteria.

    #define op_A(w,x,y,z)(((((w&0x0000ffff)<<16)|x)&0xffff0000)|((((y&0x0000ffff)<<16)|z)&0x0000ffff))
    #define op_B(w,x,y,z)(((((x&0x0000ffff)<<16)|w)&0xffff0000)|((((z&0x0000ffff)<<16)|y)&0x0000ffff))
    #define op_E(w,x,y,z)(op_A(w,x,y,z)>op_B(w,x,y,z)?op_A(w,x,y,z):op_B(w,x,y,z))
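
    To see what the three macros compute, they can be restated as functions on 32-bit unsigned values, as sketched below. The function names op_a, op_b and op_e and the sample values are hypothetical; the expressions follow the macro bodies term by term. Using uint32_t is a choice of this sketch: it avoids the signed overflow that the <<16 shift could cause if the macros were applied to plain int arguments.

```c
#include <assert.h>
#include <stdint.h>

/* High half: low 16 bits of w merged (OR) with the high half of x.
   Low half: low 16 bits of z (the shifted y bits are masked away). */
uint32_t op_a(uint32_t w, uint32_t x, uint32_t y, uint32_t z)
{
    return ((((w & 0x0000ffffu) << 16) | x) & 0xffff0000u)
         | ((((y & 0x0000ffffu) << 16) | z) & 0x0000ffffu);
}

/* Same as op_a with the roles of w/x and of y/z exchanged. */
uint32_t op_b(uint32_t w, uint32_t x, uint32_t y, uint32_t z)
{
    return ((((x & 0x0000ffffu) << 16) | w) & 0xffff0000u)
         | ((((z & 0x0000ffffu) << 16) | y) & 0x0000ffffu);
}

/* The larger of the two results. */
uint32_t op_e(uint32_t w, uint32_t x, uint32_t y, uint32_t z)
{
    uint32_t a = op_a(w, x, y, z), b = op_b(w, x, y, z);
    return a > b ? a : b;
}

/* Worked values, derived by hand from the definitions above. */
void op_example(void)
{
    uint32_t w = 0x11112222u, x = 0x33334444u;
    uint32_t y = 0x55556666u, z = 0x77778888u;

    assert(op_a(w, x, y, z) == 0x33338888u);
    assert(op_b(w, x, y, z) == 0x55556666u);
    assert(op_e(w, x, y, z) == 0x55556666u);
}
```

    Functions also evaluate each argument exactly once, whereas the macro form of op_E expands op_A and op_B twice.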

    One of the 4 functions running inside this encryption was used in the Game of Life, using 100x100 cells, which outputs the generations data in BMP graphics file format.
    Download the Game of Life VRM 1.3.1 of 17.07.2007; it shows the diversity of the random number distributions produced.











    Try looping this encryption in the following way:

    Step 1.C:\r7 -a <key1 number> -b <key2 number> -c <key3 number> -d <key4 number> -e "filename.txt"
    Step 2.C:\r7 -a <key5 number> -b <key6 number> -c <key7 number> -d <key8 number> -e "previous_output.mar"
    ...
    ...
    Step n.

    and repeat it in the same manner n times until the desired security level is gained .

    18.12.2007 VRM 1.3.3 Download File r7.zip

    MARIAHASH THE ENCRYPTION ROUTINE

    This is a fast encryption routine using a proprietary hashing method. Cipher strength depends on a large hashing number and on password length. The password text must be entered in a password.txt file and should be between 50 and 100 characters long. This routine was written out of the author's wish to try to improve message privacy when messages are sent across networks.

    09.06.2007 VRM 1.3.0 Download File 79923.zip

    THE 123SQL DATABASE ENGINE

    This is an ongoing project aimed at constructing a small portable SQL database engine for PDAs. The current release is a functional browsing engine that contains data and sample browsing statements. Data may be imported together with table/column creation; typically the source data is TAB-delimited column data exported from a spreadsheet. Database/table/column creation may be viewed in the included code following the -c switch. Table names, column names and field byte sizes should be specified, but column/field lengths may also vary in size row by row. The engine performs SQL keyword/syntax checking using the included syntax/keyword list files. Object name checks and object attribute reads are performed against the system database files named 123SQL_db_1.mar and 123SQL_db_2.mar. The database structure allows multiple object browsing. The sorting/searching routines require low machine resources, thus meeting most modern PDA specifications; their sources were also published under different names.
    This project is founded on the author's unique relational database engine structure design. The 123SQL engine requires the following command line syntax:

    E.g.: C:\910791 -d "Sample"
    for attaching and browsing the included database, where Sample is the name of the included database. With the command

    E.g.: C:\910791 -c "import_data_file.txt"
    the engine creates a database table and table columns as specified in the included create.txt syntax and imports the data from the file named after the -c switch. The number of column definitions and the number of TAB-delimited fields must match; if the specified column length is greater than the data length, the field is padded with spaces. The supported SQL-like data browsing syntax is:

    {select}

    {*|column_name|column_name_1,...column_name_n}

    {from}

    {table_name|table_name_1,...table_name_n}

    [where

    |[column_name=string_literal|column_name>string_literal|column_name<string_literal]

    |[column_name>string_literal and column_name<string_literal]

    |[column_name[>|<]string_literal and column_name=string_literal]

    |[column_name=string_literal or column_name=string_literal or column_name=string_literal]

    |[column_name>string_literal and column_name<string_literal and column_name=string_literal]

    ]



    123SQL 15.04.2008 VRM 1.5.0 Download file 123SQL.zip




    The MariaBasic Interpreter


    The Maria Basic Interpreter is a command-line programming tool (an interpreter) aimed at helping PDA users code formulas/calculations and string and file procedures that execute on their handhelds.
    The included source code may easily be (re)compiled for various OS/CPU architectures, since it is written in ISO/ANSI C and requires moderate machine resources.
    The interpreter design allows fast execution of BASIC-like procedures with calculations and file/string operations. Its simplified syntax requires only basic programming skills and may be used for learning, but it extends to the execution of rather complex routines.
    The interpreter supports BASIC-like (simplified) syntax such as nesting, statement loops and conditional execution. The ZIP archive available for download includes a few txt files with sample sources that illustrate the supported nesting and file/string functions.
    Source procedures are executed with a command line such as: E.g.: C:\mariaBasic -edayofweek.txt
    The decimal to binary, day of week, day of year and bubble sort example sources show the code structure a program needs in order to execute.
    Supported code syntax :

                               MariaBASIC 3.1.1.3 Coding Structure:


    1. Coding convention (s)

       1.1.Declarations:

       <varname> is a <string literal> + <type declaration> = <initial value>

       where
           <string literal> = {[_]|[a-z]|[A-Z]|[0-9]}
           <type declaration> = {[%]|[&]|[#]|[$]}

               where % stands for integer data type
               where & stands for long integer data type
               where # stands for double data type
               where $ stands for char data type with <=256 bytes
               
           <initial value> =
           {
               [<string constant> is a single quoted literal having [a-z]|[A-Z]|[0-9]]
               |
               [<num constant> is a number literal having [0-9]|[.]]
           }

       logical expression operators are [and]|[or]|[xor]
       conditional expression operators are [>]|[<]|[=]|[>=]|[<=]|[<>]

       1.2.Program body: Declaration(s) | Statement(s) | Logical expression(s) | Simple Block Statement(s) | Nested Statement(s) | End statement

           1.2.1.Statement:

           varname[%|&|#]=[[varname[%|&|#]|[<num constant>]][^,*,/,+,-][[varname[%|&|#]|[<num constant>]]
           varname$=varname$+varname$
           varname[%|&]=len$(varname$)
           varname[%|&|#]=val$(varname$)
           varname$=trim$(varname$)
           varname$=left$(varname$,<num constant>)
           varname$=right$( varname$,<num constant>)
           varname$=mid$( varname$,<num constant>,<num constant>)
           varname$=format$(varname+{[%|&|#]},<string constant>)
           varname[%|&|#]=round$(varname#)
           open varname$ for [input]|[output] as #<num constant>
           input #<num constant>, varname$
           print #<num constant>, [<string constant>| varname[%|&|#|$][,]][;]
           close #<num constant>
           print [<string constant>| varname[%|&|#|$][,]][;]

           1.2.2.Logical expression:

           varname[%|&]=(varname[%|&|#|$][=,<>,>,<,>=,<=]varname[%|&|#|$]
               [and]|[or]|[xor]
               [ varname[%|&|#|$][=,<>,>,<,>=,<=]varname[%|&|#|$])

           1.2.3.Simple Block Statement:

           {if (<conditional expression>) then}
               <statement(s)>
           {end if}
           {while (<conditional expression>)}
               <statement(s)>
           {wend}
           {for varname[%|&]=[[<num constant>]| varname[%|&]] to [<num constant>| varname[%|&]]}
               <statement(s)>
           {next varname[%|&]}

           1.2.4.Nested Statement:

           {if (<conditional expression>) then}
               <statement(s)>
               <simple block statement(s)>
           {end if}
           {while (<conditional expression>)}
               <statement(s)>
               <simple block statement(s)>
           {wend}
           {for varname[%|&]=[[<num constant>]| varname[%|&]] to [<num constant>| varname[%|&]]}
               <statement(s)>
               <simple block statement(s)>
           {next varname[%|&]}

           1.2.5.Comment(s):
           rem <string constant>

       1.3. End Statement:
       {end}

    Maria BASIC source code

    MariaBASIC for Pocket PC 09.03.2014 VRM 3.3.1.3
    MariaBASIC 09.03.2014 VRM 3.3.1.3

    MariaBASIC for Pocket PC 07.05.2014 VRM 3.3.1.5
    MariaBASIC 06.05.2014 VRM 3.3.1.5

    MariaBasic at Brothersoft.com
    Windows 7 Download - Editor's Pick: MariaBasic at Windows7Download.com
    Download Typhoon - Editor's Pick: Maria Basic Interpreter for Pocket PC has been reviewed by Download Typhoon and received the "Editor's Pick" award.
    Soft-Go.com: mariaBASIC received a 5-star award from the Soft-Go.com editors team, based on price, usability, documentation and support.

    MariaBASIC Number Permutation Cycle Function (ASCII Text Rhymes): output with all 1(s), 2(s), ... 9(s) as input, and its C/C++ code (!) in num_c_perm.cpp.

    The Eleven Comedies, an English translation of the comedies of Aristophanes et al. (from the Project Gutenberg eBook), Part 1, chart 1
    and chart 2 from Part 2,
    War and Peace, an English translation of Tolstoy (from the Project Gutenberg eBook), permutation chart,
    The Notebooks of Leonardo Da Vinci, an English translation of Leonardo Da Vinci (from the Project Gutenberg eBook), permutation chart generated by num_c_perm_2.cpp.


    THE FAST (ASCII and Unicode) TEXT FILES SEARCH ROUTINE

    This is a fast text search routine that searches for a single string (or a quoted composite string) throughout ASCII or Unicode text file(s) (or files containing text). Unicode search also allows strings that mix characters from different Unicode table(s).
    E.g.:
    1. (ASCII search) msearch3 <ASCII_input_filename.txt> <search_string>
    2. (Unicode search) msearch3 <Unicode_input_filename.txt>
    (the search string is read from the Unicode file uarg.txt and the search results are written to the Unicode file ures.txt)


    03.07.2008 VRM 1.1.1 Download File msearch3.zip

    THE FAST ASCII TEXT FILES SEARCH ROUTINE

    This is a fast text search routine that allows a multi-string search (up to 10 search strings, each containing one or more words) throughout an ASCII text file. Each (quoted) search string may therefore have one or more words. The -s switch allows any match, while the -e switch allows only an exact match.
    E.g.: C:\msearch -s(-e) "package install"+"media"+"component" -f "FreeBSD Handbook.htm"
    E.g.: C:\msearch -s(-e) "network devices installation" -f "FreeBSD Handbook.htm"
    E.g.: C:\msearch -s(-e) "trodes in his hands" -f "book_sd.txt"
    E.g.: C:\msearch -s(-e) "Bezukhov and Natasha"+"Buonaparte Napoleon"+"Pierre" -f "War_and_Peace_NT.txt"
    The program output displays all results along with their line number file positions, and the unique and composite sentence search phrase matches together with their total occurrence count.

    15.04.2008 VRM 1.3.3 Download File msearch.zip
    09.04.2010 VRM 1.4.1 Download File msearch_141_1.zip

    mSearch4 Single sentence , single file :
    25.04.2012 VRM 1.0.3 Download File mSearch4_103.zip
    26.07.2012 VRM 1.2.1 Download File mSearch4_121.zip

    mSearch4 Single sentence , file folders tree walk :
    18.02.2013 VRM 1.3.1 Download File mGrep4_131.zip
    10.05.2013 VRM 1.3.5 Download File mGrep4_135.zip


    mSearchSen(tence) 4, 1-10 search strings separated by logical conjunction (&) and inclusion (|) symbols
    - Example, single search sentence: mSearchSen4_130 -s"Mucius&Scaevola" -f"War_and_Peace_NT.txt"
    - Example, search two sentence(s) in conjunction: mSearchSen4_130 -s"Mucius&Scaevola&burned" -f"War_and_Peace_NT.txt"
    - Example, search two sentence(s) with inclusion: mSearchSen4_130 -s"Scaevola|burned" -f"War_and_Peace_NT.txt"
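
    One way to read the & and | symbols in the examples above is sketched below: | separates alternatives and & joins terms that must all occur in the same text line. This reading, the precedence (& binds tighter than |), the plain substring matching and the name line_matches are assumptions of the sketch, not a description of the mSearchSen4 source.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if line satisfies expr, 0 otherwise. In expr, '|' separates
   alternatives and '&' separates terms that must all occur in the line. */
int line_matches(const char *line, const char *expr)
{
    char term[128];
    const char *p = expr;
    int all = 1;   /* every '&' term of the current alternative matched so far */

    assert(line != NULL && expr != NULL);
    for (;;) {
        size_t n = strcspn(p, "&|");      /* length of the next term */
        if (n >= sizeof term) return 0;   /* term too long for this sketch */
        memcpy(term, p, n);
        term[n] = '\0';
        if (strstr(line, term) == NULL) all = 0;
        p += n;
        if (*p == '&') { ++p; continue; } /* same alternative, next term */
        if (all) return 1;                /* alternative fully matched */
        if (*p == '\0') return 0;         /* no alternatives left */
        ++p;                              /* skip '|' and start a new one */
        all = 1;
    }
}
```

    For example, line_matches("he burned it", "Scaevola|burned") returns 1, while line_matches("Scaevola spoke", "Mucius&Scaevola") returns 0.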
    02.02.2014 VRM 1.4.0 Download File mSearchSen4_140.zip
    08.02.2014 VRM 1.4.1 Download File mSearchSen4_141.zip



    THE ASCII TEXT FILES SENTENCE CONTEXT SEARCH ROUTINE

    This is a complex text file search routine that builds the search on context: the sentence words concerning a given subject. It allows search criteria to be built automatically from the sentence word contents and the user's choice. Sentence word files and their sentence links are built during the indexing phase for a given text file. After indexing, the routine displays all sentences for a chosen sentence subject (as listed in the words file) and allows a detailed context search and the display of all sentences concerning the chosen context.
    For indexing, type: E.g.: C:\dp_13_201_1 -f "War_and_Peace_NT.txt"
    For the context search, type: E.g.: C:\msearch_141_d_1 -s(e) "Bagration" -f "War_and_Peace_NT.txt"
    The -s switch enables any-match search when d was chosen, and the -e switch enables exact word match only. The included files contain the example book already indexed. Typically the search word is a name or a subject that is often described and attributed in the book text. After viewing/choosing the desired sentence/search combination, all text lines containing the chosen words are displayed. Viewing book contents by the desired subject details thus takes less time.

    15.04.2008 VRM 1.3.0 Download File r113.zip

    05.09.2013 VRM 2.1.0 Download File text_file_context_search_210_1.zip
    05.09.2013 VRM 3.2.0 Download File text_file_context_search_320.zip

    This package contains :
    (1) the dictionary routine and
    (2) the mSearchSen4 routine, which allows multiple text sentence searches with logical conjunctions and inclusions
    02.02.2014 VRM 3.3.0 Download File text_file_context_search_330.zip
    08.02.2014 VRM 3.3.1 Download File text_file_context_search_331.zip

    THE FONT IMAGE RECOGNITION ROUTINE

    This routine creates a vector shape sequence file (using the -i switch) from a 100x100 pixel, 24-bit colour depth black and white image representing a TrueType character image (font) or a freehand character drawing. Then, using the -c switch, the two index files derived from two different images are compared and the graphics match result is displayed.
    For the indexing type:
    E.g.: C:\cr13 -i "Drawing1.bmp" "Drawing1_Index.txt"
    For the comparison of two different index files type:
    E.g.: C:\cr13 -c "Drawing1_Index.txt" "Drawing2_Index.txt"
    At present the routine builds shape vectors on black/white bitmaps, it does not support different resolution nor colors/color depth.
    But how does it work?

    (1) indexing, creates vector txt file (that might be the meta character file) out of the bmp image file in the following manner:
    - inverts the b/w file matrix (the way human eye sees it),
    - searches for quadrants (10x10 pixels sized) with 40/60% b/w ratio, thus finding character image edges (up to 8 pairs in the same row),
    - creates vectors out of each quadrant,
    - shifts quadrants by (only) few pixels UP since bmp edges do not always REALLY represent character ID curves, repeating vector creation ...
    and
    (2) comparison of two vector files:
    - shifts back all X-axis values subtracting them by absolute minX value,
    - computes curve angles out of each quadrant values,
    - computes resultant angles out of quadrant pairs building most real character curves,
    - compares the two vector files angle pairs,
    - computes match statistics .
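
    The quadrant search in indexing step (1) can be illustrated with the sketch below. It assumes that the 100x100 image has already been decoded into one byte per pixel (non-zero meaning black) and reads "40/60% b/w ratio" as a black share between 40% and 60% of the 10x10 quadrant; both points, and the function names, are assumptions rather than the cr13 implementation.

```c
#include <assert.h>
#include <stddef.h>

#define IMG_SIZE 100    /* image is IMG_SIZE x IMG_SIZE pixels */
#define QUAD_SIZE 10    /* quadrants are QUAD_SIZE x QUAD_SIZE pixels */

/* Return 1 if quadrant (qx, qy) has between 40 and 60 black pixels out of
   100, i.e. it probably straddles a character edge. pixels is row-major,
   one byte per pixel, non-zero = black. */
int is_edge_quadrant(const unsigned char *pixels, int qx, int qy)
{
    int x, y, black = 0;

    assert(pixels != NULL);
    assert(qx >= 0 && qx < IMG_SIZE / QUAD_SIZE);
    assert(qy >= 0 && qy < IMG_SIZE / QUAD_SIZE);
    for (y = 0; y < QUAD_SIZE; ++y)
        for (x = 0; x < QUAD_SIZE; ++x)
            if (pixels[(qy * QUAD_SIZE + y) * IMG_SIZE + qx * QUAD_SIZE + x])
                ++black;
    return black >= 40 && black <= 60;
}

/* Count the edge quadrants of the whole image. */
int count_edge_quadrants(const unsigned char *pixels)
{
    int qx, qy, count = 0;

    for (qy = 0; qy < IMG_SIZE / QUAD_SIZE; ++qy)
        for (qx = 0; qx < IMG_SIZE / QUAD_SIZE; ++qx)
            count += is_edge_quadrant(pixels, qx, qy);
    return count;
}
```

    A quadrant that is entirely black or entirely white is skipped by this test, so only quadrants crossed by a stroke boundary are passed on to vector creation.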


    This development is aimed at giving PDA users easier ways to input text.
    To Maria Dzinleska

    27.04.2007 VRM 1.0.1 Download File cr13.zip

    THE ROUTINE THAT GENERATES THE PRIME NUMBERS KEY PAIR OUT OF THEIR PRODUCT

    These routines were written during, and for, the www.rsa.com prime key numbers contest, which requires finding the exact pair of prime key numbers from a very large (256, 512 ... 1024 ... bits long) product number. The routines are written in Java and use the Java BigInteger class to compute the prime key pair. The starting-point routine finds a prime key pair, each key product_number_bit_length/2 bits long, whose product is as near as possible to the product number; the greater the precision, the more computing time is spent. The loop that computes the suggested starting prime pair is therefore limited by the corresponding number of equal product-target significant digits. The remaining procedures then repeatedly subtract a very long number (all 1s followed by trailing zeros, 111...*10^N) from the suggested key pair, measuring the distance (difference) from the target product number by subsequent multiplication checks. When divergence is found at a certain precision (number of equal significant digits), a new key pair may be generated through the first routine. The process is then repeated, gaining more and more equal product-target significant digits.

    23.07.2006 Download File Welcome.zip

    How do these computations compute a very similar or near prime key pair out of a large product key?

    Examining the MariaBasic code listed below and its (partial) output shows a few number products appearing at large division loop distances and having a 0000 period between decimal remainder values. Testing those (listed) numbers might prove that most of them are prime numbers. Testing large product keys (200 decimals or more) in this way would take an indefinite time, so the WelcomeQ routine uses a subtraction operation on a proposed prime key pair. The routine that generates prime key pairs matching a given number of decimals of the target product is based on modifying a binary field seed number, using only target-match numbers as the starting point of the match search loop. The subtractor (with a decimal value of e.g. 1111111111000000000000000) shifts the 1111111111 period to the right once it is confirmed that the product of the prime key pair truncated in this way matches more and more decimals of the target product number. There are in fact sets of prime key pairs that reach a certain decimal match, and it is usually necessary to switch between different pairs in order to increase the decimal match of the product. That is the main iteration of this method; it sometimes requires examining and rejecting a large number of prime key pairs in order to gain one or more additional matching decimals. Reaching a precision of 100 decimals on a common PC would thus not be hard to achieve. These computations generate prime keys with a computable decimal match gain, or a complete product number match, compared with a given huge product number.
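
    The "number of equal product-target significant digits" used as the progress measure above can be sketched as follows. The original routines are in Java and use BigInteger; this illustration is limited to products that fit in 64 bits, and the function names and sample numbers are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count the leading decimal digits that two equally long digit strings
   share. Strings of different length differ in magnitude and score 0. */
size_t leading_digit_match(const char *a, const char *b)
{
    size_t n = 0;

    assert(a != NULL && b != NULL);
    if (strlen(a) != strlen(b)) return 0;
    while (a[n] != '\0' && a[n] == b[n]) ++n;
    return n;
}

/* Score a candidate key pair (p, q) against the target product.
   Assumes p * q does not overflow 64 bits. */
size_t pair_match(unsigned long long p, unsigned long long q,
                  unsigned long long target)
{
    char prod[32], tgt[32];

    snprintf(prod, sizeof prod, "%llu", p * q);
    snprintf(tgt, sizeof tgt, "%llu", target);
    return leading_digit_match(prod, tgt);
}
```

    For the small target 10403 (= 101 * 103), the candidate pair (101, 102) has product 10302 and scores 2 matching digits, while (101, 103) scores all 5; a search loop would keep the pair with the higher score.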

    Brief order and explanation of execution steps:

    rem Short multiply factor pair routine
    rem example in MariaBasic 3.2.1.1
    rem April , 26 , 2012


    Varn$='3539572063110071 '
    Varn1$=''
    Varn2$=''
    Varn3$=''
    Varn4$=''

    Vard1#=0
    Vard2#=0
    Vard3#=0
    Vard4#=0
    Vard5#=0
    Vard6#=0

    Vari1%=0
    Vari2%=0
    Vari3%=0
    Vari4%=0
    Vari5%=1

    while (Vari5%<=12)

    Vard1#=val$(Varn$)
    Varn1$=mid$(Varn$,Vari5%,3)
    Vari1%=len$(Varn$)
    Vari2%=len$(Varn1$)
    Vari3%=Vari1%+1-Vari2%-Vari5%
    Vard2#=val$(Varn1$)
    Vard2#=Vard2#+0
    Vard2#=Vard2#*10^Vari3%
    Vard6#=Vard1#/Vard2#


    print '--------------------'
    Varn2$=format$(Vard1#,'000000000000000')
    print Varn2$
    Varn2$=format$(Vard2#,'000000000000000')
    print Varn2$
    Varn2$=format$(Vard6#,'000000000000000')
    print Varn2$
    print '--------------------'

    Vari5%=Vari5%+3

    wend

    print '--------------------'
    print '--------------------'

    Vard11#=0
    Vard12#=0
    Vard13#=0

    Vard21#=0
    Vard22#=0
    Vard23#=0
    Vard24#=0
    Vard25#=0

    Vars11$='9570000000000'
    Vars12$='369'

    Vard11#=val$(Vars11$)
    Vard12#=val$(Vars12$)
    Vard13#=Vard13#+Vard11#*Vard12#
    Varn2$=format$(Vard13#,'000000000000000')
    print Varn2$
    Vard21#=val$(Varn2$)

    print '--------------------'
    print '--------------------'

    Vars11$='2060000000'
    Vars12$='1718238'

    Vard11#=val$(Vars11$)
    Vard12#=val$(Vars12$)
    Vard13#=Vard13#+Vard11#*Vard12#
    Varn2$=format$(Vard13#,'000000000000000')
    print Varn2$
    Vard22#=val$(Varn2$)

    print '--------------------'
    print '--------------------'

    Vars11$='3110000'
    Vars12$='1138126065'

    Vard11#=val$(Vars11$)
    Vard12#=val$(Vars12$)
    Vard13#=Vard13#+Vard11#*Vard12#
    Varn2$=format$(Vard13#,'000000000000000')
    print Varn2$
    Vard23#=val$(Varn2$)


    Vard24#=Vard24#+Vard21#+Vard22#/1000+Vard23#/1000000

    print '--------------------'
    print '--------------------'

    Varn2$=format$(Vard24#,'000000000000000')
    print Varn2$

    Varn2$=format$(Vard1#,'000000000000000')
    print Varn2$

    Vard25#=Vard1#-Vard24#

    print '------difference---'

    Varn2$=format$(Vard25#,'000000000000000')
    print Varn2$

    print '--------------------'

    Dzinleski Jasenko - jasenko@unet.com.mk

    Mailing Address:
    +38922770296
    Dositej Obradovik 15/8
    1000 Skopje Republic of Macedonia


    (1) All published data, executables and sources from this site described above apply to GNU General Public License as published by the Free Software Foundation and can not be used, copied, sold, redistributed or used in any other way but only by written permission by Jasenko Dzinleski . Copyright (C) from 2001 - 2012 and later by Jasenko Dzinleski
    (2) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version , if not opposite to (1) .
    (3) This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE . See the GNU General Public License for more details .
    You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc ., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA .