
Study on m7G-Related Disease Association Prediction Method Based on Convolutional Network

Posted on: 2023-06-16 | Degree: Master | Type: Thesis
Country: China | Candidate: J Chen | Full Text: PDF
GTID: 2530306788974679 | Subject: Information and Communication Engineering
Abstract/Summary:
N7-methylguanosine (m7G) is an important epigenetic modification that plays a key role in gene expression, processing and metabolism, protein synthesis, transcription stability, and other processes. m7G is a significant factor in life processes, including the development of disease. Identifying disease-related m7G loci is therefore of great significance for revealing disease pathogenesis. Computational models can quickly predict associations between m7G loci and diseases from massive data. Data on m7G loci and diseases are abundant and diverse, but problems remain, such as sparse association data and incomplete data mining. This thesis therefore predicts associations between m7G loci and diseases with convolutional networks. The specific research contents are as follows.

To address sparse association data and insufficient mining of feature information, a model combining a heterogeneous network and a convolutional neural network (HN-CNN) is proposed, built on m7G locus similarities, disease similarities, and m7G-disease associations. First, feature pairs containing similarities and associations are selected from the heterogeneous network, which enriches the sparse data. Then, HN-CNN uses a CNN to transform the feature pairs into feature vectors, and a dense layer determines the feature weights. Finally, considering both performance and robustness, HN-CNN uses the ensemble classifier XGBoost as its classifier. Experimental results on the m7G-disease data show that HN-CNN can predict unknown associations between m7G loci and diseases effectively and accurately.

HN-CNN is essentially a feature extraction model in Euclidean space, which extracts multidimensional feature information from insufficient feature information. Considering the restricted information and the interference of redundant data, a model combining Ripple Net and a graph convolutional network (Ripple-GCN) is proposed, which classifies related data in the graph. First, Ripple-GCN optimizes the heterogeneous network as a graph, grading related data to overcome the interference of redundant data. Then, the model computes the feature vectors of m7G loci and diseases from the highly correlated data, which addresses the problem of restricted information in the graph. Finally, Ripple-GCN uses a GCN to extract features and predict unknown associations. Experimental results on the m7G-disease data show that reducing redundant data in the graph helps improve the performance of Ripple-GCN.

This thesis contains 23 figures, 7 tables and 123 references.
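
The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the layout of the feature pair (a locus row stacked on a disease row), and the toy matrices are all assumptions made for illustration, and the graph convolution shown is the standard symmetric-normalized propagation rule, which may differ from what Ripple-GCN actually uses.

```python
from math import sqrt


def feature_pair(site_sim, disease_sim, assoc, i, j):
    """Hypothetical feature pair for (m7G locus i, disease j).

    Row 0: locus i's similarity to every locus, then its known associations.
    Row 1: disease j's known associations with every locus, then its
           similarity to every disease.
    Both rows have length n_loci + n_diseases, so the pair is a 2 x N grid
    that a CNN could take as input.
    """
    site_row = list(site_sim[i]) + list(assoc[i])
    disease_row = [row[j] for row in assoc] + list(disease_sim[j])
    return [site_row, disease_row]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, feats, weight):
    """One standard GCN step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    Assumes a non-negative adjacency matrix, so every degree is positive
    once self-loops are added.
    """
    n = len(adj)
    a_hat = [[adj[r][c] + (1.0 if r == c else 0.0) for c in range(n)] for r in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[r][c] / sqrt(deg[r] * deg[c]) for c in range(n)] for r in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy data (made up): 3 loci, 2 diseases.
site_sim = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
disease_sim = [[1.0, 0.4], [0.4, 1.0]]
assoc = [[1, 0], [0, 1], [0, 0]]

pair = feature_pair(site_sim, disease_sim, assoc, 0, 1)
```

In a training setting, the entry `assoc[i][j]` would need to be masked before building the pair for sample (i, j); otherwise the label leaks into the features.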
Keywords/Search Tags: m7G sites-diseases, heterogeneous network, convolutional neural network, association prediction