
Research And Application Of Biomedical Entity Recognition Method Based On Depth Boundary Combination

Posted on: 2022-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y Hu
Full Text: PDF
GTID: 2514306530480744
Subject: Computer technology
Abstract/Summary:
In recent years, the volume of literature on biomedical research has grown dramatically. This literature contains a large amount of biomedical knowledge that is important to the research and application of biomedicine. Biomedical Literature Mining (BLM) technology helps researchers mine and extract knowledge from this literature automatically, and it has attracted increasing attention. Biomedical Named Entity Recognition (Bio-NER) is a basic task of BLM and plays a significant role in its downstream tasks. Compared with entities in the general domain, biomedical entities are more complex, and nested and discontinuous entities are widespread. However, most existing work focuses on extracting flat entities and ignores nested and discontinuous ones. In addition, most biomedical named entities do not follow a unified nomenclature and have many typical domain features, but these features are not used efficiently. To this end, this paper proposes a Bio-NER method based on depth boundary assembly. The research work of this paper can be divided into the following three parts:

(1) This paper proposes a depth boundary assembly method for Bio-NER that recognizes nested and discontinuous entities in biomedical literature. The depth boundary assembly model is a cascading framework based on neural networks and consists of three main steps: boundary detection, boundary assembly, and entity discrimination. First, the beginning and end boundaries of entities are identified. The boundaries are then combined into candidate entities. Finally, the candidate entities are discriminated and classified. Experiments verify that the proposed method can effectively identify flat, nested, and discontinuous entities in biomedical literature, improves Bio-NER performance, and achieves an F1-score of 81.34% on the GENIA dataset.

(2) This paper proposes a CRF-combined boundary assembly method for Bio-NER that makes more effective use of the domain features of biomedical entities. The CRF algorithm is used to exploit the original features, part-of-speech features, prefix/suffix features, and other features of biomedical vocabulary, together with deeper semantic information, and is combined with the flexibility of the depth boundary assembly framework to complete the Bio-NER task. Experiments show that adding domain features to the depth boundary assembly model improves recognition performance.

(3) A Bio-NER tool is designed and implemented. Based on the CRF-combined boundary assembly method, the tool provides a friendly, easy-to-use interface that is convenient for researchers.

In summary, this paper proposes a Bio-NER method based on depth boundary assembly that uses the features of the biomedical domain effectively to identify flat, nested, and discontinuous entities in biomedical literature, and that lays a foundation for the downstream tasks of BLM. Based on this method, a Bio-NER tool that is convenient for researchers to use is constructed.
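The three-step cascade described in part (1) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names (`assemble_candidates`, `recognize`, `toy_classify`), the `max_len` cap, the toy lexicon, and the example sentence are all assumptions made for illustration. The neural boundary detector and entity discriminator are replaced by hand-supplied boundary indices and a dictionary lookup. Only contiguous spans are assembled, since the abstract does not say how discontinuous entities are put together.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)


def assemble_candidates(starts: Sequence[int], ends: Sequence[int],
                        max_len: int = 8) -> List[Span]:
    """Boundary assembly: pair each start boundary with every end boundary
    at or after it, up to max_len tokens (max_len is an assumed cap)."""
    end_list = sorted(set(ends))
    candidates: List[Span] = []
    for s in sorted(set(starts)):
        for e in end_list:
            if s <= e < s + max_len:
                candidates.append((s, e))
    return candidates


def recognize(tokens: Sequence[str], starts: Sequence[int], ends: Sequence[int],
              classify: Callable[[Sequence[str]], Optional[str]],
              max_len: int = 8) -> List[Tuple[int, int, str]]:
    """Entity discrimination: keep candidates the classifier accepts.
    Overlapping spans are all kept, so nested entities survive."""
    results = []
    for s, e in assemble_candidates(starts, ends, max_len):
        label = classify(tokens[s:e + 1])
        if label is not None:
            results.append((s, e, label))
    return results


# Hypothetical stand-in for the neural discriminator: a tiny lexicon lookup.
TOY_LEXICON = {
    ("IL-2",): "protein",
    ("human", "IL-2", "gene"): "DNA",
}


def toy_classify(span_tokens: Sequence[str]) -> Optional[str]:
    return TOY_LEXICON.get(tuple(span_tokens))


# Boundary detection is assumed to have already produced these indices.
tokens = ["human", "IL-2", "gene", "expression"]
detected_starts = [0, 1]
detected_ends = [1, 2]

entities = recognize(tokens, detected_starts, detected_ends, toy_classify)
print(entities)  # [(0, 2, 'DNA'), (1, 1, 'protein')]
```

Because candidates are enumerated from boundary pairs rather than from a single tag sequence, the inner span "IL-2" and the outer span "human IL-2 gene" can both be accepted, which a flat sequence-labelling scheme cannot express.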
Keywords/Search Tags: Biomedical Named Entity Recognition, Deep Learning, CRF, Information Extraction, Depth Boundary Assembly