Study on the 8-mer Distributions of Intergenic Sequences

Posted on: 2014-03-12  Degree: Master  Type: Thesis
Country: China  Candidate: S H Xu  Full Text: PDF
GTID: 2250330398496477  Subject: Physics
Abstract/Summary:
It is known that the frequency distributions of 8-mers are multi-peaked in the intergenic sequences of some higher eukaryotes but single-peaked in some lower eukaryotes. Using 8 genomes, we analyzed the 8-mer distributions of their intergenic sequences and studied the characteristics of the 8-mers in the different peaks. Human, pig, horse and mouse each show three peaks. The 8-mers in Peak 1 and Peak 2 lie far from the random distribution, whereas the 8-mers in Peak 3 fall within the region of the random distribution. Yeast, C. elegans, A. gambiae and A. mellifera each show only one peak. According to the number of CG dinucleotides each 8-mer contains, the 8-mers were divided into three classes: CG0, CG1 and CG2 (two or more CGs). With this classification, the three peaks are clearly separated in the four higher eukaryotes: all CG2 8-mers are located in the Peak 1 region, all CG1 8-mers in Peak 2, and the CG0 8-mers in Peak 3. When the 8-mers are instead classified by any of the other 15 dinucleotides, the three peaks cannot be distinguished. The 8-mer distributions of the four lower eukaryotes are also clearly distinguished by the CG0, CG1 and CG2 classification. We think that single-peak or multi-peak distributions are caused only by the evolution of intergenic sequences. The number of CG-containing 8-mers is relatively conserved during evolution, whereas the number of CG0 8-mers increases rapidly in the evolution of higher eukaryotes. This is what separates the three 8-mer distributions. Our results indicate that CG-containing 8-mers have biological functions. Our team considers that the CG-containing 8-mers are closely associated with nucleosome binding.
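The counting and classification steps described above can be illustrated with a minimal Python sketch. It is not the thesis's actual pipeline: the function names (`count_kmers`, `cg_class`, `split_by_cg_class`, `frequency_spectrum`) are hypothetical, overlapping windows are an assumption, and the third class is assumed to cover two or more CG dinucleotides.

```python
from collections import Counter

def count_kmers(seq, k=8):
    """Count overlapping k-mers (assumed windowing), skipping any window
    that contains a character other than A, C, G or T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts

def cg_class(kmer):
    """Assign a k-mer to CG0, CG1 or CG2 by its number of CG dinucleotides.
    CG2 is assumed to mean two or more CGs."""
    n = kmer.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if n == 0:
        return "CG0"
    if n == 1:
        return "CG1"
    return "CG2"

def split_by_cg_class(counts):
    """Partition a {k-mer: count} mapping into the three CG classes."""
    classes = {"CG0": {}, "CG1": {}, "CG2": {}}
    for kmer, n in counts.items():
        classes[cg_class(kmer)][kmer] = n
    return classes

def frequency_spectrum(counts):
    """Map each occurrence count to the number of distinct k-mers having it.
    Plotting this per class is one way to look for separate peaks."""
    return Counter(counts.values())
```

For a real genome, `count_kmers` would be applied to each intergenic sequence and the counters summed; comparing `frequency_spectrum` across the three classes would then show whether the classes occupy different peaks.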
We think that the CG-containing 8-mers are also related to the transcription of non-coding genes in intergenic sequences, because CG dinucleotides are closely related to CpG islands. Based on the three kinds of 8-mers, the relative frequency (RF) values of the 16 dinucleotides were calculated. From lower to higher species, the RF values of CG, TT, AA, AT and TA decrease, and those of CC, GG, GC, TC, CT, CA, TG, AG and GA increase. The RF values of the CG and GC dinucleotides change in different ways. Thus, we think that DNA sequences containing the CG dinucleotide are main determinants of the transcription of non-coding genes and of nucleosome binding.
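The abstract does not define RF, so the following Python sketch rests on an assumption: RF is taken to be each dinucleotide's share of all overlapping dinucleotide occurrences within a set of 8-mers, weighted by how often each 8-mer occurs. The function name `dinucleotide_rf` is hypothetical, and the thesis may use a different definition.

```python
from collections import Counter
from itertools import product

# The 16 possible dinucleotides over the alphabet A, C, G, T.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]

def dinucleotide_rf(kmer_counts):
    """Return {dinucleotide: relative frequency} for a {k-mer: count} mapping.

    Assumed definition: occurrences of a dinucleotide divided by the
    occurrences of all dinucleotides, each k-mer weighted by its count.
    Input k-mers are assumed to contain only A, C, G and T.
    """
    totals = Counter({d: 0 for d in DINUCLEOTIDES})
    for kmer, n in kmer_counts.items():
        for i in range(len(kmer) - 1):
            totals[kmer[i:i + 2]] += n
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {d: 0.0 for d in DINUCLEOTIDES}
    return {d: totals[d] / grand_total for d in DINUCLEOTIDES}
```

Computing this once per class (CG0, CG1, CG2) and per species would give values that can be compared across species, which is the kind of comparison the paragraph above summarizes.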
Keywords/Search Tags: Eukaryote intergenic sequences, 8-mers, Frequency distribution, Dinucleotide, Bias