
Research And Application Of Implicit Relation Mining In Biomedical Entity Network

Posted on: 2021-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2428330629952719
Subject: Software engineering
Abstract/Summary:
In recent years, with the development of artificial intelligence technology, a steady stream of innovative techniques has played a key role in all walks of life, and the field of bioinformatics is also booming as a result. In the era of big data, data occupies an important position in the biomedical field. With the development of large-scale high-throughput information technology and the continuous improvement in the quality of contemporary scientific research, high-quality research results keep emerging. Moreover, research teams at home and abroad devote a great deal of effort to extracting biomedical entity relations from the literature, and biomedical databases have gradually formed as a result.

However, two notable problems still urgently need to be solved. First, because the amount of research data is huge and each organization follows its own naming conventions, the same content may be given different names, which causes great trouble for the subsequent extraction of key information and for data integration. Second, biomedical research is a comprehensive undertaking in which knowledge from multiple domains is correlated and complementary. It divides into multiple directions, and each direction has performed well in specific practice or in published research. However, these directions are almost entirely separate, which produces the phenomenon of "knowledge fragmentation". Therefore, to solve these problems, this paper aims to establish a complete set of standards for data integration that fuses the scattered datasets together, and to set up a complete and efficient method for mining implicit information from the data. In essence, the aim is to bring multiple research directions together into one system for implicit knowledge discovery, fill a gap in current biomedical research, and provide a new perspective for scientific research.

At present, biomedical researchers at home and abroad have made fruitful achievements in the field. With the help of artificial intelligence technology, the mainstream relation mining methods in the biomedical field fall into four categories: (1) co-occurrence-based methods; (2) pattern-based methods; (3) machine learning-based methods; (4) deep learning-based methods. Owing to the strong learning ability of machine learning and deep learning, the latter two categories are developing rapidly. In addition, with the rapid development of network representation learning, relation mining methods based on network embedding models have become a distinct approach and are gradually gaining a place in the research field. This paper seeks a novel and more accurate method for implicit relation mining under large-scale data fusion.

In this paper, we propose BEIRNE-kNN, a method for implicit relation mining in a biomedical entity network. In the embedding part, we combine two kinds of model (the graph representation learning model GraphGAN and the network embedding model VAE-based SDNE) to construct a hybrid model, BEIRNE; in the prediction part, we use the traditional machine learning classification method kNN. First, we integrate several large open medical datasets, including CTD, Gene Ontology, HPRD, HMDD and MATADOR. This yields a biomedical entity relation network that includes gene-disease, gene-pathway, disease-pathway, disease-chemical, gene-GO, miRNA-disease and chemical-protein associations as well as gene-gene interactions. Then, we use BEIRNE to train on the network nodes and obtain their vector representations. Next, the edges in the network are selected as positive samples, and negative samples are generated by ranking meta-paths. Finally, the positive and negative samples are used to train the kNN classification model to carry out link prediction, so that we can find biomedical entity pairs in the network that have no direct association yet but hold implicit relationships.

The predicted relationships are then validated against PubMed to show the practical application value of our method. Before conducting the application experiment, we compared our method with three state-of-the-art methods (Katz, Catapult and IMC) on a benchmark dataset (OMIM) in a specific domain of biomedicine, namely the prediction of relationships between genes and diseases. The proposed method performs well on this benchmark, which supports the scientific value of BEIRNE-kNN.
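The prediction step described in the abstract (existing edges as positive samples, non-associated pairs as negative samples, kNN as the classifier) can be illustrated with a minimal sketch. This is not the thesis implementation: the 2-D embeddings are made-up values standing in for BEIRNE output, the negative pairs are hand-picked rather than generated by ranking meta-paths, and the Hadamard-product pair feature, the entity names and the helper names `pair_feature` and `knn_predict` are all assumptions made for illustration.

```python
from collections import Counter
from math import dist

# Hypothetical 2-D node embeddings; in the thesis these would come from BEIRNE.
embeddings = {
    "geneA": (0.9, 0.1),
    "geneB": (0.8, 0.2),
    "diseaseX": (0.85, 0.15),
    "diseaseY": (0.1, 0.9),
    "chemC": (0.2, 0.8),
}


def pair_feature(u, v):
    """Represent a node pair by the element-wise product of its embeddings (an assumed choice)."""
    return tuple(a * b for a, b in zip(embeddings[u], embeddings[v]))


# Positive samples: edges already present in the toy network.
positives = [("geneA", "diseaseX"), ("geneB", "diseaseX"), ("chemC", "diseaseY")]
# Negative samples: hand-picked here; the thesis generates them by ranking meta-paths.
negatives = [("geneA", "diseaseY"), ("geneB", "chemC"), ("geneA", "chemC")]

train = [(pair_feature(u, v), 1) for u, v in positives]
train += [(pair_feature(u, v), 0) for u, v in negatives]


def knn_predict(feature, train, k=3):
    """Return the majority label among the k training pairs closest to `feature`."""
    nearest = sorted(train, key=lambda item: dist(feature, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Score candidate pairs that have no direct edge in the toy network.
for u, v in [("geneA", "geneB"), ("diseaseX", "chemC")]:
    label = knn_predict(pair_feature(u, v), train)
    print(u, v, "implicit relation predicted" if label == 1 else "no relation predicted")
```

In a realistic setting the candidate pairs labeled 1 would be the ones passed on for validation against the literature; the value of k and the way a node pair is turned into a feature vector are tuning choices that this sketch does not take from the source.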
Keywords/Search Tags: network embedding, Variational Auto-Encoder, biomedical relation mining, k-Nearest Neighbor