Analysis And Prediction Of Nucleosome Positioning Based On Tsallis Entropy

Posted on: 2015-01-27  Degree: Master  Type: Thesis
Country: China  Candidate: J Wu
GTID: 2250330431454114  Subject: Operational Research and Cybernetics
Abstract/Summary:
As the fundamental unit of eukaryotic chromatin structure, the nucleosome plays a critical role in gene expression and regulation by controlling the physical access of transcription factors to DNA. It is also important in DNA replication and repair, RNA splicing, and gene transcription and regulation. Nucleosome positioning describes where nucleosomes are located with respect to the genomic DNA sequence.

First, building on existing models and taking into account the sequence preference of nucleosome distribution, we combined Tsallis entropy with the double-helix structure of DNA, extracted the frequencies of k-mers (k = 2, 3) in DNA sequences by statistical methods, and proposed a new model for extracting nucleosome positioning information. With this model, each sequence is converted into an 8-dimensional vector, which is convenient for further study by mathematical and physical methods. Next, we present a nucleosome occupancy model that determines the average nucleosome occupancy of each base pair along the genomic sequence. Finally, we propose a peak detection model to locate nucleosomes along the genomic sequence.

When used to train a support vector machine (SVM) to discriminate known nucleosomal DNA sequences from linker DNA sequences, the nucleosome positioning information model achieves high AUCs of 0.9182, 0.8922, 0.9163, 0.8261 and 0.9109 for Human, Medaka, Nematode, Candida and Yeast, respectively. These values significantly outperform previous studies and illustrate the effectiveness of the nucleosome positioning information model.
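The feature-extraction idea described above (k-mer frequencies summarized by a Tsallis entropy) can be illustrated with a minimal sketch. This is not the thesis's model: the abstract does not give the transformed entropy, the two-group nucleotide encoding, or the layout of the 8-dimensional vector, so the sketch uses the standard Tsallis entropy S_q = (1 - Σ p_i^q) / (q - 1) on plain A/C/G/T k-mers. The function names, the example sequence and the choice q = 2 are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k):
    """Relative frequency of each overlapping k-mer over the alphabet ACGT."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Only k-mers made of A/C/G/T are counted; anything else (e.g. N) is ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


def tsallis_entropy(probs, q):
    """Standard Tsallis entropy; reduces to Shannon entropy (natural log) at q = 1."""
    probs = [p for p in probs if p > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


# Hypothetical example: one entropy value per k (k = 2, 3) with an assumed q = 2.
example_seq = "ACGTACGTTTGACA"
features = [
    tsallis_entropy(kmer_frequencies(example_seq, k).values(), q=2.0)
    for k in (2, 3)
]
```

Feature vectors of this kind, computed per sequence, are what a classifier such as an SVM would take as input; the exact features used in the thesis may differ.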
Moreover, comparison with published results (Kaplan et al. [10], Segal et al. [5], Yuan et al. [4]) shows that our nucleosome positioning model locates nucleosomes effectively and with high accuracy.

The main contributions of this work are as follows. First, by considering the Pearson correlation, we divide the four nucleotides (A, C, G, T) into two groups, which simplifies the nucleosome positioning information model. Second, we introduce the concept of transformed Tsallis entropy into nucleosome positioning studies, which broadens the range of research ideas. Third, we analyze sequence similarity using the basic concept of distance, which greatly reduces the computational cost for large nucleosome datasets.

We use the nucleosome positioning information model, the nucleosome occupancy model and the peak detection model to obtain nucleosome positioning patterns along the genomic sequence of Saccharomyces cerevisiae. However, the factors that influence nucleosome positioning are complicated; they include DNA sequence dependence, competition and cooperation among protein molecules, and ATP-dependent remodeling. A more systematic analysis of these factors would yield a more comprehensive nucleosome positioning model and better predictions. In addition, nucleosome positioning mechanisms differ between organisms, so the model needs to be applied to more complex eukaryotes, such as human, to determine the scope of the proposed method and directions for improvement. At present, no completely objective and accurate nucleosome positioning pattern is available, and predictions derived from different platforms and methods differ. Therefore, experimental methods are needed to further verify the accuracy of our model.
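To make the peak-detection step mentioned above more concrete, here is a minimal sketch of one common way to call nucleosome positions from a per-base occupancy profile. The abstract does not describe the thesis's peak detection model, so this greedy scheme, the function name `call_peaks` and its parameters are assumptions; the default spacing of 147 reflects the usual length in base pairs of nucleosomal DNA.

```python
def call_peaks(occupancy, min_spacing=147, min_height=0.0):
    """Greedy peak calling on an occupancy profile.

    Local maxima are visited from highest to lowest; a maximum is kept only
    if it lies at least `min_spacing` positions from every peak already kept.
    Returns the kept positions in increasing order.
    """
    n = len(occupancy)
    candidates = [
        i for i in range(n)
        if occupancy[i] >= min_height
        and (i == 0 or occupancy[i] > occupancy[i - 1])       # rises into i
        and (i == n - 1 or occupancy[i] >= occupancy[i + 1])  # does not rise after i
    ]
    candidates.sort(key=lambda i: occupancy[i], reverse=True)

    accepted = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in accepted):
            accepted.append(i)
    return sorted(accepted)


# Toy profile (made-up values) with a small spacing so the behavior is visible.
toy_profile = [0, 1, 3, 1, 0, 2, 5, 2, 0]
toy_peaks = call_peaks(toy_profile, min_spacing=3)
```

Visiting the highest maxima first means that, when two candidate peaks are closer than the minimum spacing, the stronger one is retained, which mirrors the physical constraint that neighboring nucleosomes cannot overlap.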
Keywords/Search Tags: Nucleosome positioning, Nucleosomal DNA, Linker DNA, Tsallis entropy, Support vector machine (SVM)