Study Of Nucleosome Positioning Based On Similarity Between Part-total Features Of DNA Sequences

Posted on: 2020-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Lu
Full Text: PDF
GTID: 2370330596492652
Subject: Management Science and Engineering
Abstract/Summary:
Nucleosomes are the basic units of eukaryotic chromatin, and their positions are closely related to a variety of biological processes, such as DNA replication, RNA splicing and chromatin remodeling. With the development of high-throughput sequencing technology, many researchers have studied nucleosome positioning and proposed many positioning models. To further explore the nucleosome positioning mechanism, this thesis improves theoretical prediction models of nucleosome positioning. Specifically, based on the self-similarity of DNA sequences, it presents two novel core DNA prediction models that use increment of diversity and relative entropy, and verifies their effectiveness on public datasets.

First, based on self-similarity and increment of diversity in DNA sequences, a generalized increment of diversity model (GID-BP) is proposed to predict core DNA from the k-mer information of a DNA sequence. To verify its effectiveness, the model was applied to nucleosome positioning in human, worm, fly and yeast. The classification accuracy rates on the human, worm, fly and yeast datasets were 87.89%, 89.76%, 85.50% and 99.94%, respectively.

Second, based on self-similarity and relative entropy in DNA sequences, a generalized relative entropy model (GRE-SVM) is proposed to predict core DNA from the k-mer information of a DNA sequence. The model was likewise applied to nucleosome positioning in human, worm, fly and yeast. The classification accuracy rates on the human, worm, fly and yeast datasets were 88.61%, 88.46%, 83.76% and 100%, respectively.

In addition, this thesis analyzes the key factors in the pattern of nucleosome positioning. For the GID-BP model, the analysis uses the contribution rate and the Spearman correlation coefficient. For the GRE-SVM model, it uses random forest, in three steps: 1) the weights of the features associated with nucleosome positioning are calculated with random forest; 2) the feature weights are compared with a pre-set threshold; 3) the features closely related to nucleosome positioning are selected based on the comparison results.

Experimental results showed that the key factors affecting nucleosome positioning differ between organisms: 1) the GID-BP model showed that positive and negative six-nucleotide sequences play important roles in nucleosome positioning; 2) the GRE-SVM model showed that five sub-sequence types, namely positive four-nucleotide, positive five-nucleotide, negative five-nucleotide, positive six-nucleotide and negative six-nucleotide sequences, play important roles in nucleosome positioning for all experimental organisms.
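
The building blocks named in the abstract (k-mer counts, increment of diversity and relative entropy) can be sketched roughly as below. The abstract does not give the formulas of the generalized models, so this sketch assumes the standard textbook definitions: the diversity measure D(X) = N log N - sum(n_i log n_i), the increment of diversity ID(X, Y) = D(X + Y) - D(X) - D(Y), and relative entropy as the Kullback-Leibler divergence between smoothed k-mer frequencies. The function names, the pseudocount, the toy sequence and the choice to compare the first half of a sequence (the "part") against the whole (the "total") are all illustrative assumptions, not the thesis's actual procedure.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers over the alphabet ACGT; absent k-mers get 0."""
    seq = seq.upper()
    observed = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [observed.get("".join(p), 0) for p in product("ACGT", repeat=k)]


def diversity(counts):
    """Standard diversity measure D(X) = N log N - sum_i n_i log n_i."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); smaller values mean more similar sources."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def relative_entropy(x, y, pseudocount=1.0):
    """KL divergence between count vectors after add-pseudocount smoothing
    (the smoothing is an assumption made here to avoid division by zero)."""
    px = [a + pseudocount for a in x]
    py = [b + pseudocount for b in y]
    sx, sy = sum(px), sum(py)
    return sum((a / sx) * math.log((a / sx) / (b / sy)) for a, b in zip(px, py))


# Toy, made-up sequence: compare a part (first half) with the total sequence.
total_seq = "ACGTTGCAAACGTTAGCTAGCTTACGATCG"
part_seq = total_seq[: len(total_seq) // 2]
for k in (1, 2, 3):
    part, total = kmer_counts(part_seq, k), kmer_counts(total_seq, k)
    print(k, increment_of_diversity(part, total), relative_entropy(part, total))
```

In a pipeline like the one the abstract describes, values such as these, computed for several k, would presumably form the feature vector passed to a classifier (a BP neural network or an SVM); that wiring is omitted here because the abstract does not specify it.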
Keywords/Search Tags: Nucleosome Positioning, Generalized Diversity Increment, Generalized Relative Entropy, BP Neural Network, Support Vector Machine