| The nucleosome is the basic unit of chromatin structure in eukaryotes and the carrier of epigenetic phenomena such as chromatin remodeling and histone modification. Nucleosome positioning depends on many factors, and DNA sequence is considered one of the most important. Studying methods for nucleosome positioning can improve our understanding of nucleosome mechanisms and has important biological significance for further elucidating the characteristics and structural functions of nucleosome positioning. The main work of this paper is as follows. Firstly, based on the horizontal features of a single DNA sequence, we define the association information and the correlation curve coordinates, and propose a new nucleosome positioning method based on the features of the correlation curve, named the ZCNN method. This method transforms each DNA sequence into correlation curve coordinates, extracting features at every position along the horizontal direction of the sequence to obtain a coordinate matrix of the DNA sequences. A convolutional neural network is used for training and testing. The classification accuracy of the ZCNN method was 92.87%, 80.88%, 87.39% and 80.96% on the S. cerevisiae, H. sapiens, C. elegans and D. melanogaster datasets, respectively. Secondly, based on the vertical features of the DNA sequence dataset, a new comprehensive DNA sequence model is constructed, which not only retains the horizontal features of each DNA sequence but also takes into account the vertical features of the overall dataset. A novel method for predicting nucleosome positioning based on this comprehensive sequence model is proposed, named the CSeq FM method. A support vector machine is used for training and testing. The classification accuracy of CSeq FM was 98.33%, 84.60%, 83.92% and 84.81% on the S. cerevisiae, H. sapiens, C. elegans and D. melanogaster datasets, respectively. Thirdly, based on the position weight matrix and Z-curve
model, a novel DNA sequence integration model is constructed, which represents the features of DNA sequences more comprehensively. A method for predicting nucleosome positioning based on the integration model is proposed, named the ZCMM method. A support vector machine is used for training and testing. The classification accuracy of the ZCMM method was 99.17%, 77.72%, 85.34% and 93.62% on the S. cerevisiae, H. sapiens, C. elegans and D. melanogaster datasets, respectively. ZCMM was also applied to predict nucleosome occupancy in the S. cerevisiae genome. The predictions closely matched the actual nucleosome occupancy map, indicating that ZCMM can effectively predict nucleosome occupancy. Finally, we studied the distribution characteristics of nucleosome positioning in the human genome, proposed four nucleosome positioning patterns, and found that the four patterns are unevenly distributed across the genome. By mining genes related to nucleosome positioning and applying gene function enrichment analysis, we found that the nucleosome positioning patterns are involved in cell differentiation, molecular metabolism, transcription termination and other important biological processes. In particular, the dynamic positioning of nucleosomes in the genome plays an important role in regulating gene expression. In summary, based on features of DNA sequences from different perspectives, three nucleosome positioning prediction methods are proposed in this paper. The results show that the prediction methods achieve high accuracy in multiple species, with good stability, reliability and effectiveness, and that the distribution of nucleosome positioning in the human genome is strongly dynamic. The results of this study help to study nucleosomes and their positioning methods systematically at different levels, to further illustrate the
characteristics and functional mechanisms of nucleosome positioning, and to support a deeper understanding of epigenetics. |
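
The abstract names the Z-curve and the position weight matrix as the inputs to the ZCMM integration model but does not say how they are combined. The following is only a minimal sketch of the two standard ingredients, assuming the usual cumulative Z-curve definition, a uniform background of 0.25 per base, and a pseudocount of 1. The function names and the `features` helper, which simply concatenates the two representations, are hypothetical illustrations and not the thesis's actual model.

```python
import math
from collections import Counter

BASES = "ACGT"


def z_curve(seq):
    """Return the cumulative Z-curve coordinates (x_n, y_n, z_n) for each prefix of seq."""
    counts = dict.fromkeys(BASES, 0)
    coords = []
    for base in seq.upper():
        if base not in counts:
            raise ValueError(f"unexpected base: {base!r}")
        counts[base] += 1
        a, c, g, t = (counts[b] for b in BASES)
        coords.append((
            (a + g) - (c + t),  # x: purine vs. pyrimidine
            (a + c) - (g + t),  # y: amino vs. keto
            (a + t) - (g + c),  # z: weak vs. strong hydrogen bonding
        ))
    return coords


def position_weight_matrix(seqs, pseudocount=1.0):
    """Build a log-odds PWM from equal-length sequences (assumed uniform background)."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    total = len(seqs) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = Counter(s[i].upper() for s in seqs)
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / 0.25)
            for b in BASES
        })
    return pwm


def pwm_score(pwm, seq):
    """Sum the per-position log-odds of seq under the PWM."""
    return sum(col[base] for col, base in zip(pwm, seq.upper()))


def features(seq, pwm):
    """Hypothetical feature vector: flattened Z-curve coordinates plus the PWM score."""
    flat = [value for xyz in z_curve(seq) for value in xyz]
    return flat + [pwm_score(pwm, seq)]
```

A vector like the one returned by `features` could then be passed to any standard classifier, such as a support vector machine; the kernel and hyperparameters used in the thesis are not given in the abstract.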