Construction And Analysis Of Multi-layer Networks Based On Disease-gene-drug Data

Posted on:2019-05-12

Degree:Master

Type:Thesis

Country:China

Candidate:Y Y Yu

GTID:2370330572959003

Subject:Software engineering

Abstract/Summary:

In recent years, adverse drug reactions (ADRs) have frequently been reported in clinical practice. It is estimated that 44,000 to 98,000 people die from ADRs each year in the United States, which increases both medical risk and economic loss. Predicting drug-drug interactions (DDIs) can provide data support for drug experiments and clinical medication. A drug interaction is a situation in which a substance (usually another drug) affects the activity of a drug when both are administered together. The possible effects are irrelevant, synergistic, additive, and antagonistic. Our aim is to support drug experiments and offer clinical medication guidance by predicting potential DDI relationships and estimating the probability of adverse drug reactions.

We build a Drug-attribute Prediction Model (DAPM) based on data for four drug attributes: phenotype, therapeutics, chemical structure, and genomics. According to the characteristics of the data, the Tanimoto similarity is computed separately for each of the four attributes, and the resulting similarities serve as the model's training features. Six algorithms (logistic regression, decision tree, naive Bayes, support vector machine, k-NN, and random forest) were then compared in experiments to predict potential DDI relationships. A drug pair whose predicted probability is greater than 50% is treated as a potential ADR. For this model, we propose two optimization strategies, one at the data level and one at the network structure level, which greatly improve the prediction accuracy of the model.

1. At the data level, existing databases lack normal samples, so the classification experiments have no "standard negative sample" data. We therefore use the One-class SVM algorithm, with ADR data as input, to construct a hypersphere; the samples that fall outside the sphere are taken as "standard negative samples". To some extent, the algorithm turns fuzzy classification labels into trusted standard classification data.

2. At the network structure level, a multi-layer network was first proposed to describe drug interactions. We constructed a four-dimensional multi-layer network model with four layers: drug phenotype, therapeutics, chemical structure, and genomics. Further, combining the Lévy random walk algorithm with the random walk with restart (RWR) algorithm, we propose RWLR, a random walk algorithm on the multi-layer network, to optimize the description of the network structure. Walkers start from each node of the network, and the resulting steady-state matrix gives a better description of the network structure. At each step of the experiments, grid search was used to obtain the optimal parameters.

Finally, we evaluate the network topology and the model performance. For the topological characteristics, analysis of multi-dimensional indicators such as degree distribution, centrality, and clustering coefficient shows that the single-layer and multi-layer networks are consistent in the chemical structure layer and the phenotype layer. Although the genetic layer data in the multi-layer network are locally compact, the topological characteristics also show consistency, verifying the small-world nature of the network.

For the model performance, we first use the 7319 "standard negative sample" data obtained from the One-class SVM algorithm as the training set in a 10-fold cross-validation experiment. Performance is then evaluated on four indicators (accuracy, precision, recall, and F1 score), together with ROC curves and AUC values. The results show that data optimized by the One-class SVM algorithm give better classification performance than randomly sampled data, raising the AUC from an average of 0.6 to 0.9, with a random forest classification accuracy of 0.98. Finally, the training data and prediction data are input into the DAPM model, which predicts 83543 potential DDI relationships, of which 782 are high-risk DDIs. Three drugs are rated very high risk, each producing ADRs with more than 80 drugs: Clozapine, Fludarabine, and Levonorgestrel.
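
Two building blocks named in the abstract, Tanimoto similarity and random walk with restart (RWR), can be illustrated with a minimal sketch. This is not the thesis's DAPM or RWLR implementation: the drug names, feature sets, single similarity layer, and restart probability of 0.3 are all hypothetical, and the Lévy-flight component of RWLR is omitted.

```python
def tanimoto(a, b):
    """Tanimoto similarity |A & B| / |A | B| between two binary feature sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention assumed here for two empty feature sets
    return len(a & b) / len(union)


def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities for a walker that restarts at `seed`.

    `adj` is a non-negative weighted adjacency matrix (list of lists).
    Iterates p <- (1 - restart) * W p + restart * p0 until convergence,
    where W is the column-normalised adjacency matrix.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    # Each column of W holds the transition probabilities out of node j.
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n))
               + restart * p0[i]
               for i in range(n)]
        if sum(abs(x - y) for x, y in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical binary features for three placeholder drugs (one attribute layer).
features = {
    "drug_a": {"f1", "f2", "f3"},
    "drug_b": {"f2", "f3", "f4"},
    "drug_c": {"f4", "f5"},
}
names = sorted(features)

# Pairwise Tanimoto similarities form a weighted drug-drug network.
sim = [[0.0 if i == j else tanimoto(features[u], features[v])
        for j, v in enumerate(names)]
       for i, u in enumerate(names)]

# Proximity of every drug to drug_a under a restarting random walk.
scores = random_walk_with_restart(sim, seed=0)
for name, score in zip(names, scores):
    print(f"{name}: {score:.4f}")
```

In a multi-layer setting, one such similarity matrix would exist per attribute layer, and running the walk from every node in turn would yield the rows of a steady-state matrix like the one the abstract describes.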

Keywords/Search Tags:

DDI prediction, machine learning, data mining, multi-layer network, random walk

Related items

1	Multi-granularity Complex Network Representation Learning Based On Random Walk
2	Link Prediction In Multi-layer Collaboration Network
3	Prediction Of Reservoir Using Multi-component Seismic Data Under Multilayer Network Structure Of Machine Learning
4	Research On Methods Of Identifying Disease Genes On Biomolecular Networks
5	Machine Learning Methods For Chromatin Accessibility Prediction By Integrating Multi-omics Data
6	Link Prediction Method For Opportunistic Networks Based On Random Walk And Deep Learning
7	Study On Protein Function Prediction Based On Random Walk
8	Research On Network Representation Learning Algorithm Based On Random Walk
9	Research Of Multi-source Meteorological Data Based On Machine Learning
10	Learning Augmented Graph Random Walk Based Bioinformatics Entities Association Prediction