Research On Biomedical Named Entity Recognition Based On Hybrid Model

Posted on: 2018-09-14  Degree: Master  Type: Thesis
Country: China  Candidate: Z Y Sun  Full Text: PDF
GTID: 2348330542992633  Subject: Computer application technology
Abstract/Summary:
Biomedical named entity recognition is a fundamental and key task in biomedical information extraction. Accurately identifying biomedical named entities is vital for gene relation extraction, biological knowledge discovery, and other complex downstream tasks. The complex and variable form of biomedical named entities makes recognition difficult. Moreover, with the rapid development of biomedicine, the exponential growth in the volume of data poses a further challenge to automatic and accurate identification of these entities. This dissertation studies biomedical named entity recognition based on a hybrid model. The main contributions are as follows:

(1) Rich linguistic features, including word, orthographic, chunk, and part-of-speech features, are introduced and applied to the biomedical named entity recognition task. An incremental feature selection method is proposed and combined with a sequence labeling model, the Conditional Random Field (CRF), and a classification model, the Support Vector Machine (SVM), to study the effectiveness of the hand-crafted features and to select a reliable feature set for the current model.

(2) To study how deep learning can be applied to biomedical named entity recognition, three deep entity recognition models are constructed and their performance is compared. The first is a bi-directional long short-term memory network (BiLSTM). The second is a deep CRF. The third is a hybrid model, the bi-directional long short-term memory network with a conditional random field (BiLSTM-CRF), which combines the sequence feature extraction ability of the BiLSTM with the CRF's ability to exploit sentence-level information. In addition, by combining word vectors with neural networks, a recognition task that originally required complex feature engineering becomes more end-to-end. To investigate the role of word semantic information in biomedical named entity recognition, three kinds of word vectors from different sources are introduced. We also analyze how the dimension, domain, and other parameters affect the quality of the task-related word vectors, judged by their benefit to the actual task.

This dissertation uses the JNLPBA standard dataset as the experimental corpus for studying the features and models of biomedical named entity recognition. Without introducing any rules or dictionaries, the F-value reaches 74.93%, which supports the effectiveness of the proposed method.
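The abstract does not spell out the incremental feature selection procedure, so the following is only a minimal sketch of one common reading: a greedy forward pass that adds one candidate feature group at a time and keeps it only if a validation score improves. The function name `incremental_feature_selection`, the `evaluate` callback, the `min_gain` parameter, and the toy gain values are all assumptions made for illustration and are not taken from the thesis. In practice `evaluate` would train a CRF or SVM on the given feature set and return a development-set F-value.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def incremental_feature_selection(
    candidates: Sequence[str],
    evaluate: Callable[[FrozenSet[str]], float],
    min_gain: float = 0.0,
) -> Tuple[List[str], float]:
    """Greedy forward selection (hypothetical sketch, not the thesis's code).

    Candidates are tried in the given order; a feature group is kept only
    if adding it raises the score by more than `min_gain`.
    """
    selected: List[str] = []
    best = evaluate(frozenset(selected))
    for feature in candidates:
        score = evaluate(frozenset(selected + [feature]))
        if score - best > min_gain:
            selected.append(feature)
            best = score
    return selected, best


# Toy stand-in for "train a model and measure F-value".
# These gains are invented for the example, not experimental results.
TOY_GAINS = {"word": 0.60, "orthographic": 0.05, "pos": 0.02, "chunk": -0.01}


def toy_evaluate(features: FrozenSet[str]) -> float:
    # Pretend each feature group contributes an independent, additive gain.
    return sum(TOY_GAINS[name] for name in features)


selected, best = incremental_feature_selection(
    ["word", "orthographic", "pos", "chunk"], toy_evaluate
)
print(selected, round(best, 2))
```

With the toy gains, the `chunk` group is rejected because it lowers the score. A real run depends entirely on the model and corpus behind `evaluate`, and the order in which candidates are tried can change the outcome of a greedy pass like this one.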
Keywords/Search Tags:Biomedical Named Entity Recognition, Deep Learning, Word Vector, Incremental Feature Selection Method, Hybrid Model