
Literature Information Extraction Based On Deep Learning And Application In Neuroanatomy Connectivity Mining

Posted on: 2021-02-08 | Degree: Master | Type: Thesis
Country: China | Candidate: Y C Di | Full Text: PDF
GTID: 2504306104494554 | Subject: Optical Engineering
Abstract/Summary:
In recent years, the field of neuroscience has grown rapidly, and research on neuroanatomical structures and functions using various imaging methods has become a hot topic. Most findings are published in the scientific literature as unstructured electronic text. Using natural language processing, optical character recognition, knowledge graphs, and other information extraction techniques to automatically acquire neuroanatomical knowledge from the neuroscience literature on a large scale can therefore greatly help researchers grasp the results and current state of the field. However, most existing information extraction systems in neuroscience use rule-based and traditional machine learning methods, so their results depend on manually written rules and feature engineering, and their accuracy rarely meets the needs of scientific research. In contrast, information extraction systems in the biomedical field tend to use deep learning, which removes much of the manual effort and usually gives better results and stronger generalization. A natural idea is thus to extend information extraction techniques from the biomedical field to neuroscience.

Based on the characteristics of information extraction datasets in neuroscience, this thesis designs corresponding pre-processing methods that allow advanced deep learning information extraction methods to be applied to neuroscience, so that deep learning models can be trained, tested, and selected on these datasets for two subtasks: named entity recognition and relation extraction. These models are then combined with literature acquisition, pre-processing, data post-processing, and visual analysis to build a handy information extraction tool for efficient, automated knowledge extraction from large-scale neuroscience literature in practice.

Specifically, the reliability of the extraction models was first verified on a gold standard dataset. For the named entity recognition task, the deep learning baseline model achieved results comparable to those of traditional models, and with multi-task learning or transfer learning the model significantly increased recall and improved generalization. For the relation extraction task, combining pre-trained models with various labeling methods improved the F1 value by more than 20% over the best traditional models, which addresses the low accuracy of traditional methods in large-scale literature extraction. The validity of the information extraction tool was then assessed through intrinsic and extrinsic evaluation while using it to obtain neuroanatomical connectivity pairs directly from neuroscience literature abstracts.

In summary, a handy deep-learning-based neuroscience information extraction tool is established and applied to brain connectivity research, addressing the complexity and inefficiency of traditional information extraction systems. The tool can extract neuroanatomical connectivity information directly from the literature on a large scale, and it can be readily generalized to other kinds of neuroscience knowledge when datasets are available. It thus opens up further possibilities for constructing a neuroscience knowledge graph.
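As a point of reference for the F1 value reported above, the following is a minimal sketch of how precision, recall, and F1 could be computed for extracted connectivity pairs against a gold standard. The function name, the pair representation, and the brain-region pairs are illustrative assumptions, not taken from the thesis, and the thesis may score matches differently (for example, at the mention level).

```python
def precision_recall_f1(predicted, gold):
    """Score extracted pairs against a gold standard using exact set matching.

    Both arguments are iterables of hashable items, here (source, target)
    tuples. This exact-match scheme is an assumption for illustration only.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical connectivity pairs, made up for this example.
gold_pairs = {
    ("hippocampus", "entorhinal cortex"),
    ("amygdala", "prefrontal cortex"),
    ("thalamus", "striatum"),
}
predicted_pairs = {
    ("hippocampus", "entorhinal cortex"),
    ("amygdala", "prefrontal cortex"),
    ("thalamus", "cerebellum"),
    ("cerebellum", "pons"),
}

p, r, f1 = precision_recall_f1(predicted_pairs, gold_pairs)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

F1 is the harmonic mean of precision and recall, so it rewards a model only when both are reasonably high.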
Keywords/Search Tags: Neuroanatomical connectivity mining, Named entity recognition, Relation extraction, Deep learning