Extraction of Tumor-Drug-Gene Semantic Relationships Based on Multiple Kernel Learning

Posted on: 2016-10-29  Degree: Master  Type: Thesis
Country: China  Candidate: L Y Wang  Full Text: PDF
GTID: 2134330461476846  Subject: Information Science
Abstract/Summary:
Driven by advances in scientific research, cancer treatment has entered the stage of personalized medicine, which is tailored to the characteristics of an individual's gene expression. Biomedical literature, as a carrier of scientific research results, records a large number of studies on cancers, drugs and genes and provides important support for personalized medicine. However, given the massive volume of biomedical literature, text mining approaches are needed to identify valuable knowledge.

In this paper, a multiple kernel machine learning method is used to automatically extract Cancer-Drug-Gene semantic relationships from the exponentially growing, large-scale biomedical literature. To reflect the morphological, syntactic and semantic structure of the literature text, three classes of kernels are included in the multiple kernel learning method: a vector space kernel and a string kernel serve as lexical kernels, a tree kernel serves as the syntactic kernel, and a WordNet-based semantic kernel function serves as the semantic kernel. The experimental corpus was collected from CTD (Comparative Toxicogenomics Database, http://ctdbase.org/), which includes literature-curated chemical-gene interactions, chemical-disease associations and gene-disease associations. CTD is supported by PubMed, an authoritative biomedical literature resource. As the machine learning algorithm, a support vector machine was applied to extract cancer-drug, cancer-gene and drug-gene interactions using the three classes of kernels and an ensemble kernel. The results show that the ensemble kernel outperforms the others in interaction extraction: the F-measure is 88.41% for cancer-drug interactions, 85.68% for cancer-gene interactions and 71.31% for drug-gene interactions. In summary, this paper achieves results comparable to previous work.

On the basis of this study, a systematic method was developed to recognize and extract the relationships among cancers, drugs and genes from scientific literature, and a prototype system named CDG (Cancer-Drug-Gene Relationship Extraction) was built. It supports named entity recognition for cancers, drugs and genes, semantic relationship extraction, intelligent retrieval, and export of retrieval results for further analysis.
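
As a rough illustration of the ensemble kernel idea described in the abstract, the following Python sketch combines two lexical base kernels into a single kernel by a weighted sum. It is not the thesis implementation: the function names, the character trigram form of the string kernel, the equal weights and the toy sentences with placeholder entity tokens are all assumptions made for this example, and the tree kernel and the WordNet-based semantic kernel are omitted for brevity.

```python
import math
from collections import Counter


def _cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def vector_space_kernel(s, t):
    """Lexical kernel (assumed form): cosine of bag-of-words vectors."""
    return _cosine(Counter(s.lower().split()), Counter(t.lower().split()))


def string_kernel(s, t, n=3):
    """Lexical kernel (assumed form): normalized character n-gram kernel."""
    def grams(x):
        return Counter(x[i:i + n] for i in range(len(x) - n + 1))
    return _cosine(grams(s.lower()), grams(t.lower()))


def ensemble_kernel(s, t, weights=(0.5, 0.5)):
    """Weighted sum of the base kernels; the weights are illustrative only."""
    w_vec, w_str = weights
    return w_vec * vector_space_kernel(s, t) + w_str * string_kernel(s, t)


def gram_matrix(sentences, kernel=ensemble_kernel):
    """Pairwise kernel values for a list of sentences."""
    return [[kernel(a, b) for b in sentences] for a in sentences]


# Toy sentences with placeholder entity tokens (hypothetical, not from CTD).
sentences = [
    "DRUG inhibits GENE expression in CANCER cells",
    "DRUG suppresses GENE expression in CANCER patients",
    "the weather was mild this week",
]
K = gram_matrix(sentences)
for row in K:
    print(["%.3f" % v for v in row])
```

Because a non-negative weighted sum of positive semidefinite kernels is itself positive semidefinite, a Gram matrix built this way could in principle be passed to any SVM implementation that accepts precomputed kernels. How the thesis actually weights its kernels is not stated in the abstract.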
Keywords/Search Tags:Literature Mining, Semantic Relationship Extraction, Multiple Kernel Learning, Personalized Medicine, Cancer-Drug-Gene Relationship