
Artificial Neural Network Technology Applied To The Prediction And Diagnosis Of Multi-omics Data Of Inflammatory Diseases

Posted on: 2022-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: Q R Huang
GTID: 2480306554977299
Subject: Bioinformatics
Abstract/Summary:
Precision medicine has shifted from universal conventional treatments to individualized targeted treatments guided by the molecular background (including biomarkers) of each patient. As equipment and technology in the field of multi-omics have matured, many studies involving integrated analysis of multiple omics have emerged. Such research often involves large amounts of data, and machine learning methods are well suited to processing it and to recognizing complex patterns in order to classify or predict new cases. Machine learning has therefore become a recognized and powerful aid in precision medicine. Within machine learning, deep learning, which uses the artificial neural network model as its main tool, can effectively approximate a target function. In theory, given enough training data and enough neurons, a three-layer neural network can approximate arbitrarily complex continuous functions. Deep learning has been applied in many areas of life. In clinical practice, deep learning based on neural networks is mainly applied to disease diagnosis from image information such as X-ray and CT. Research on deep learning modeling with multi-omics data is still severely lacking, which leaves huge database resources unused. This thesis uses coronavirus disease 2019 (COVID-19) and inflammatory bowel disease (IBD) as modeling objects. In response to the most difficult current clinical problems, we modeled the samples hierarchically and constructed several models that can accurately predict target labels with very few biomarkers.

(1) The first chapter is set against the rapid spread of COVID-19 in China, the United States, India, Japan, and other countries around the world. We apply deep learning to the diagnosis and prediction of COVID-19, mainly for non-invasive diagnosis, severity classification, and prognosis prediction of COVID-19 patients. At present, the clinical diagnosis of COVID-19 relies mainly on nasopharyngeal swab sampling. However, existing medical technology cannot judge the future course of the disease from ordinary blood tests and other clinical indicators. To address these problems that arise after a COVID-19 diagnosis, we aim to obtain blood biomarkers through deep learning modeling, give patients a correct prediction early after diagnosis, allocate medical resources rationally, and improve patient survival. The overall approach is to use blood testing to obtain information about events downstream of DNA, namely the proteome and metabolome. The high-dimensional omics data are screened through feature engineering, and model parameters are optimized to obtain the key features that fit the target classification. This yields 4 deep learning models with high accuracy that can be applied directly in frontline clinical practice. The multi-omics data used in this study include individual clinical data, proteomics, and metabolomics, from a total population of 94 people. In Model 1, we use 79 samples (healthy:non-COVID:COVID = 16:22:41) to screen out 7 biomarkers from 607 features, and the prediction accuracy on the test set reaches 100%. In Model 2, we use 32 COVID-19 samples (non-severe:severe = 24:8) to select 4 biomarkers from 418 features, and the prediction accuracy on the test set is 100%. In Model 3, 12 patients with severe COVID-19 were included, and the 613 total features were reduced to 12, with R² = 0.9981. In Model 4, 23 COVID-19 patients who were successfully discharged and 16 healthy controls were included (39 people in total, 613 features), and 8 biomarkers reached R² = 0.9532.

(2) In the second chapter, we apply deep learning based on artificial neural networks to the non-invasive diagnosis of inflammatory bowel disease (IBD). IBD is mainly divided into two types: Crohn's disease (CD) and ulcerative colitis (UC). At present, the differential diagnosis of IBD relies heavily on invasive methods such as gastrointestinal endoscopy and biopsy histopathology, which cause patients great inconvenience and pain and are not conducive to early diagnosis or to regular testing after treatment. In this project, by collecting and curating existing database information, we obtained individualized multi-omics test data for 299 stool samples from a total of 100 people across healthy, CD, and UC groups, including metagenomics, metatranscriptomics, proteomics, metabolomics, viromics, and faecal calprotectin. Using a variety of feature engineering methods, this project screened and evaluated a total of 155,228 features from the six omics and obtained 111 features that form the optimal feature combination, drawn from the metatranscriptome and metabolome. We then use deep learning based on neural networks to establish a three-class diagnosis model for healthy, UC, and CD, and the area under the curve (AUC) on the test set reaches 0.8280. To build a more accurate individualized diagnosis model for specific populations, this project incorporates stratified modeling: the included population is stratified by self-evaluated health status, and model performance is evaluated separately for each stratum. Among people who self-evaluated as "very well", we used 59 features to obtain an AUC of 0.8503. Among people who self-evaluated as "slightly below par", we used 22 features to obtain an AUC of 0.8355. The final selected features include only metabolome and metatranscriptome features. Using the Wilcoxon rank sum test, we found that, compared with non-IBD, many biomarkers were significantly up-regulated in UC and CD, including C18n?QI6575, HILn?QI3904, HILn?QI3222, and C18n?QI382. At the same time, expression in CD was suppressed to some extent relative to UC, which may suggest a compensatory effect in CD.
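
The Wilcoxon rank sum comparison mentioned in the abstract can be illustrated with a minimal sketch. The intensity values below are hypothetical and are not data from the thesis, and the function names `average_ranks` and `rank_sum_test` are illustrative; the abstract does not state which implementation was used. The sketch uses the normal approximation with no tie or continuity correction, so it is only a rough guide for small groups.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (z, p). A positive z means group x tends to have larger values.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                       # rank sum of group x
    mean_w = n1 * (n1 + n2 + 1) / 2           # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail area
    return z, p


# Hypothetical intensities of one metabolite feature in two groups.
disease_group = [5.1, 6.3, 7.2, 8.0, 9.4]
control_group = [1.2, 2.3, 3.1, 4.0, 4.8]

z, p = rank_sum_test(disease_group, control_group)
print(f"z = {z:.3f}, p = {p:.4f}")
```

In practice a library routine such as `scipy.stats.ranksums` or `scipy.stats.mannwhitneyu` would usually be preferred, and when many features are screened at once the resulting p-values would normally need a multiple-testing correction.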
Keywords/Search Tags: inflammatory diseases, machine learning, multi-omics, noninvasive, precision medicine