
The Relationship Between The 8-mer Conservative Usage Of Human Genome And The Sequence Structure Of CpG Islands

Posted on: 2017-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: N N Guo
GTID: 2180330485466931
Subject: Physics
Abstract/Summary:
The usage of k-mers in genomic sequences is non-random. Studying the regularities of this non-random usage and the biological functions of the characteristic k-mers is important for understanding genome structure and evolution. Based on the entire human genome, this thesis explores evolutionary conservation by studying the frequency distribution of 8-mers in DNA sequences.

We extract all 8-mers from the DNA sequence of the human genome, sort them by frequency in ascending order, and plot their frequency distribution. The distribution shows three peaks, which we call, from left to right, peak one, peak two, and peak three. For each dinucleotide XY, we divide the set of 8-mers into three subsets according to whether an 8-mer contains no XY, exactly one XY, or two or more XY, denote these subsets XY0, XY1, and XY2, and plot their distributions. Only the CG0, CG1, and CG2 subsets of the CG classification each form an independent single peak, and these correspond to the three peaks of the overall 8-mer distribution. Plotting the 8-mer frequency distributions of a random sequence and of the human genome in the same coordinate system and under the same constraints, we find that peak three coincides with the random sequence, whereas peaks one and two lie far from the center of the random distribution. This indicates that peak three is random in character, while peaks one and two are strongly conserved.

Combined with previous studies, we speculate that the CG2 subset is the core of CpG island sequences. To verify this conjecture, we extract the CpG island sequences of the entire human genome and, for comparison, non-CpG-island sequences of equal length.
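The 8-mer counting and XY0/XY1/XY2 partition described above can be sketched as follows. This is a minimal illustration, not the thesis's actual code: the function names `count_kmers`, `xy_class`, and `split_by_xy` are hypothetical, and it assumes that 8-mers are taken from overlapping windows on a single strand, that windows containing non-ACGT characters are skipped, and that overlapping occurrences of a dinucleotide are counted separately.

```python
from collections import Counter

def count_kmers(seq, k=8):
    """Count overlapping k-mers, skipping windows with non-ACGT characters (assumption)."""
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    return counts

def xy_class(kmer, xy="CG"):
    """Return 0, 1, or 2 for a k-mer with no, one, or two-or-more XY dinucleotides."""
    n = sum(1 for i in range(len(kmer) - 1) if kmer[i:i + 2] == xy)
    return min(n, 2)

def split_by_xy(counts, xy="CG"):
    """Partition a k-mer count table into the XY0, XY1, and XY2 subsets."""
    subsets = {0: {}, 1: {}, 2: {}}
    for kmer, c in counts.items():
        subsets[xy_class(kmer, xy)][kmer] = c
    return subsets

# Usage on a toy sequence; a real run would stream chromosome sequences instead.
counts = count_kmers("ACGTACGTACGTTTTTTTTT")
cg_subsets = split_by_xy(counts, xy="CG")
```

Plotting a histogram of the counts in each of `cg_subsets[0]`, `cg_subsets[1]`, and `cg_subsets[2]` would give the per-subset frequency distributions that the abstract compares with the overall three-peak distribution.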
Following the dinucleotide classification, we calculate a characteristic variable for each subset in the CpG island and non-CpG-island sequences and plot the corresponding distributions, which verify that the CG2 subset is an indicator for classifying CpG islands. We then plot the distribution along CpG island sequences of the three characteristic quantities Ktri defined by the CG classification. The characteristic quantity of the CG2 class shows a clear local structure within CpG island sequences, which again supports the conclusion that the CG2 subset is the core of CpG island sequences. Using a set criterion to extract representative sequences of this local structure, we find that their lengths are concentrated between 15 bp and 23 bp, with a peak at 17 bp.
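The abstract does not define the characteristic quantity Ktri, so the sketch below uses a simple stand-in to show how a CG2-based signal could be profiled along a sequence: the fraction of 8-mers in a sliding window that contain two or more CG dinucleotides. The function name `cg2_profile` and the `window` parameter are assumptions made for this example and are not taken from the thesis.

```python
def cg2_profile(seq, k=8, window=20):
    """Sliding-window fraction of k-mers with two or more CG dinucleotides.

    Stand-in for the thesis's Ktri quantity, which the abstract does not define.
    """
    seq = seq.upper()
    # One flag per k-mer start position: 1 if the k-mer belongs to the CG2 class.
    # str.count is safe here because "CG" cannot overlap with itself.
    flags = [1 if seq[i:i + k].count("CG") >= 2 else 0
             for i in range(len(seq) - k + 1)]
    # Average the flags over each run of `window` consecutive k-mer positions.
    return [sum(flags[s:s + window]) / window
            for s in range(len(flags) - window + 1)]

# Usage: a CG-rich stretch followed by an AT-only stretch.
profile = cg2_profile("CG" * 30 + "AT" * 30, window=10)
```

Runs of high values in such a profile mark stretches dense in CG2-class 8-mers. Any threshold used to extract them would be a further assumption, since the extraction criterion is not given in the abstract.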
Keywords/Search Tags:human genome, 8-mer distribution, XY dinucleotide classification, CpG islands, local structure