Search for Periodicity Regions in the A. thaliana Genome

E. V. Korotkov, F. E. Frenkel, M. A. Korotkova


A mathematical method was developed in this study to detect tandem repeats in a DNA sequence. A multiple alignment of periods was calculated by direct optimization of the position-weight matrix (PWM), without using pairwise alignments or searching for similarity between periods. Random PWMs were used to develop a new mathematical algorithm for periodicity search. The developed algorithm was applied to the DNA sequences of the A. thaliana genome, and 13997 regions with a periodicity of 2 to 50 bases in length were found. The average distance between regions with periodicity is ~9000 nucleotides. A significant portion of the revealed regions have periods of 2 nucleotides, 10-11 nucleotides, or approximately 30 nucleotides. No more than ~30% of the periods found had been reported previously. The sequences found were collected in a data bank available from the website: The origin of periodicity with insertions and deletions is discussed.
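
To make the idea of a period-indexed matrix concrete, the following minimal sketch counts how often each base occurs at each position modulo a candidate period length and scores the deviation from independence with a G statistic. It is an illustration under simplifying assumptions, not the PWM optimization described above: it ignores insertions and deletions entirely, and the function names `period_count_matrix` and `periodicity_score` are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"

def period_count_matrix(seq, period):
    """Count each base at every position modulo `period` (no indels allowed)."""
    counts = [Counter() for _ in range(period)]
    for i, base in enumerate(seq):
        if base in BASES:  # skip ambiguous symbols such as N
            counts[i % period][base] += 1
    return counts

def periodicity_score(seq, period):
    """G statistic for dependence between base identity and phase in the period.

    Larger values mean the base composition differs more strongly between
    phases, which is what a tandem repeat of this length would produce.
    """
    counts = period_count_matrix(seq, period)
    total = sum(sum(col.values()) for col in counts)
    if total == 0:
        return 0.0
    base_totals = Counter()
    for col in counts:
        base_totals.update(col)
    g = 0.0
    for col in counts:
        col_total = sum(col.values())
        for base, observed in col.items():
            expected = col_total * base_totals[base] / total
            g += observed * math.log(observed / expected)
    return 2.0 * g

if __name__ == "__main__":
    repeat = "ACG" * 20
    for n in (2, 3):
        print(n, round(periodicity_score(repeat, n), 2))
```

On a perfect repeat of a three-base unit, the score for period 3 is large while the score for period 2 is zero. Multiples of the true period score equally well in this simple form, so a practical scan would need a correction for the number of matrix cells.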


