Research On Chinese Named Entity Recognition With External Knowledge And Application In Medical Field

Posted on:2017-11-02

Degree:Master

Type:Thesis

Country:China

Candidate:J F Li

Full Text:PDF

GTID:2348330503987202

Subject:Computer Science and Technology

Abstract/Summary:

Named Entity Recognition (NER) aims to identify person names, place names, organization names and other entities in text. As one of the basic tasks of Natural Language Processing, it has been an active research topic for decades. With the development of statistical machine learning methods, entities that appear in the training corpus are recognized very well, but recognizing out-of-vocabulary words remains one of the main difficulties of NER.

To address this problem, we first study how to merge a lexicon into the traditional CRF model, so that the CRF model can identify the entities contained in the lexicon. Experiments are carried out in the general domain using Wikipedia entries.

We then turn to deep neural networks, which have developed rapidly in recent years. Among them, the RNN and its improved variant, the LSTM, perform very well in Natural Language Processing. In theory, an LSTM can use all of the preceding text information during training, and a bidirectional LSTM can use the information of the whole sequence. We therefore build a bidirectional LSTM NER model and describe the design of the recognizer, which uses techniques such as dropout and transition cost calculation. Based on this model, we implement an NER tool in Python with Theano. Extensive experiments in the general domain show that the bidirectional LSTM model outperforms the CRF model on the NER task, improving the F-score by about 2% in many groups of experiments. In addition, we use deep neural network pretraining techniques to add more external information to the bidirectional LSTM model, and the experiments show that this has a certain effect.

Finally, we test the CRF model and the LSTM model on data from the medical field. Merging a lexicon into the CRF model is effective for identifying the entities in the lexicon, and the bidirectional LSTM model still improves on the CRF model. When the bidirectional LSTM model is given vectors pretrained on an open-domain corpus that is not consistent with the medical data, some performance is lost, but recognition of non-professional medical entities improves.
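
The abstract does not say how the lexicon is merged into the CRF model. One common way to do this is to match the sentence against the lexicon and pass the match positions to the CRF as extra per-character features. The sketch below illustrates that idea only and is not the thesis's method: the function names, the B/I/E/S/O match tags, the forward maximum matching strategy, and the example lexicon are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def lexicon_tags(chars: List[str], lexicon: Set[str], max_len: int = 8) -> List[str]:
    """Tag each character by its position in a lexicon match.

    Uses forward maximum matching (an assumed strategy): at each position,
    take the longest lexicon entry starting there, up to max_len characters.
    Tags: B/I/E for begin/inside/end of a multi-character match,
    S for a single-character match, O for no match.
    """
    tags = ["O"] * len(chars)
    i = 0
    while i < len(chars):
        matched = 0
        # Try the longest candidate first.
        for length in range(min(max_len, len(chars) - i), 0, -1):
            if "".join(chars[i:i + length]) in lexicon:
                matched = length
                break
        if matched == 0:
            i += 1
            continue
        if matched == 1:
            tags[i] = "S"
        else:
            tags[i] = "B"
            for j in range(i + 1, i + matched - 1):
                tags[j] = "I"
            tags[i + matched - 1] = "E"
        i += matched
    return tags


def sentence_features(sentence: str, lexicon: Set[str]) -> List[Dict[str, str]]:
    """Build one feature dict per character: character context plus lexicon tags."""
    chars = list(sentence)
    lex = lexicon_tags(chars, lexicon)
    features = []
    for i, ch in enumerate(chars):
        features.append({
            "char": ch,
            "prev_char": chars[i - 1] if i > 0 else "<BOS>",
            "next_char": chars[i + 1] if i < len(chars) - 1 else "<EOS>",
            "lex": lex[i],  # external knowledge from the lexicon
            "prev_lex": lex[i - 1] if i > 0 else "<BOS>",
            "next_lex": lex[i + 1] if i < len(chars) - 1 else "<EOS>",
        })
    return features


if __name__ == "__main__":
    # Hypothetical lexicon; the thesis builds its lexicon from Wikipedia entries.
    demo_lexicon = {"哈尔滨", "哈尔滨工业大学"}
    for feats in sentence_features("我在哈尔滨工业大学读书", demo_lexicon):
        print(feats)
```

Feature dictionaries of this shape can be passed to any CRF toolkit that accepts per-token feature mappings. Because the `lex` features depend only on the lexicon, an entity that never appears in the training corpus can still receive a strong signal, provided it is listed in the lexicon.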

Keywords/Search Tags:

named entity recognition, external knowledge, conditional random fields, LSTM, medical text processing

Related items

1	Research Of Web Text Named Entity Recognition Based On Conditional Random Fields
2	Recognition Of Named Entity In Electronic Medical Records Based On Cascaded Conditional Random Fields
3	The Research Of Conditional Random Fields Based Chinese Named Entity Recognition
4	Research On Algorithm And System Implementation On Named Entity Recognition For Chinese Electronic Medical Records
5	Chinese Named Entity Recognition Based On Conditional Random Fields
6	A Study On Chinese Location Names Recognition Based On Conditional Random Fields
7	Named Entity Recognition Based On Conditional Random Fields
8	Named Entity Recognition Based On Conditional Random Fields Chinese Research
9	Named Entity Recognition In Medical Field
10	A Study On Chinese Named Entity Recognition