Statistical Model Based Chinese Named Entity Recognition Methods And Its Application To Medical Records

Posted on:2018-02-14

Degree:Master

Type:Thesis

Country:China

Candidate:Z H Xu

GTID:2348330518994899

Subject:Software engineering

Abstract/Summary:

Many natural language processing tasks depend on accurate and effective named entity recognition (NER). Progress in NER research is often constrained by the state of natural language processing technology, and vice versa. Research on Chinese NER began much later than research on English NER, and written Chinese has no explicit word delimiter, so Chinese named entities are harder to identify. In addition, medical text contains a large number of domain-specific lexical and syntactic features, which raises the barrier to research on Chinese NER in this field.

This thesis summarizes existing NER methods and then proposes a more reliable NER method based on statistical models. Because there is no open, unified medical corpus, the methods currently applied in the medical field each rely on their own manually annotated training data. Inspired by the fine-tuning approach used in deep learning model training, we apply fine-tuning to statistical models: starting from a statistical model trained on an annotated news corpus, we fine-tune it in combination with a medical dictionary. This approach performs well on the NER task for Chinese electronic clinical records. It effectively reduces the annotation workload needed to train a model in the early stage of NER and avoids the subjective bias introduced by manually tagging a training corpus. The experimental results show that the proposed optimization algorithm is effective for both the hidden Markov model and the conditional random field model: accuracy improves by 6.8% and 10.5%, respectively, and recall improves by 8.9% and 11.1%, respectively.

Finally, based on the recognition results for 1066 real Chinese clinical records, a method combining rules and dictionaries is applied to extract the key information from the medical records. The latent information in this key information is then analyzed according to medical logic rules, and a feasible set of research methods is summarized and explored from this experimental process.
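
As a rough illustration of how a medical dictionary might be combined with the output of a statistical tagger, the sketch below overlays dictionary matches on a base BIO tag sequence using forward maximum matching over characters. This is not the thesis's actual algorithm, which the abstract does not specify. The function name `apply_dictionary`, the labels `DISEASE` and `DRUG`, the `max_len` parameter, and the sample lexicon are all hypothetical, and the base tags stand in for the output of an HMM or CRF trained on news text.

```python
from typing import Dict, List


def apply_dictionary(chars: List[str], base_tags: List[str],
                     lexicon: Dict[str, str], max_len: int = 6) -> List[str]:
    """Overlay dictionary matches on base BIO tags (hypothetical helper).

    Scans left to right and, at each position, looks for the longest
    lexicon term starting there (forward maximum matching).
    """
    tags = list(base_tags)  # copy so the caller's list is not modified
    i = 0
    while i < len(chars):
        matched = False
        # Try the longest candidate first so longer terms win over prefixes.
        for length in range(min(max_len, len(chars) - i), 0, -1):
            term = "".join(chars[i:i + length])
            if term in lexicon:
                label = lexicon[term]
                tags[i] = "B-" + label
                for j in range(i + 1, i + length):
                    tags[j] = "I-" + label
                i += length
                matched = True
                break
        if not matched:
            i += 1  # no dictionary term here; keep the base tag
    return tags


# Assumed toy lexicon and sentence, for illustration only.
lexicon = {"糖尿病": "DISEASE", "胰岛素": "DRUG"}
chars = list("患者有糖尿病史")
base_tags = ["O"] * len(chars)  # stand-in for a news-trained tagger's output
result = apply_dictionary(chars, base_tags, lexicon)
```

In this toy setup the dictionary always overrides the base tagger. A real system would more likely feed dictionary matches into the model as features or use them when adapting model parameters, so the override rule here is only one simple design choice.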

Keywords/Search Tags:

Chinese clinical medical records, Named Entity Recognition, Hidden Markov Model, Conditional Random Fields, Model optimization, Key information extraction

Related items

1	Development Of Computational Methods For Extracting Information From Chinese Electronic Medical Records
2	Research On Algorithm And System Implementation On Named Entity Recognition For Chinese Electronic Medical Records
3	Research On Chinese Named Entity Recognition
4	Recognition Of Named Entity In Electronic Medical Records Based On Cascaded Conditional Random Fields
5	Chinese Named Entity Recognition Based On Conditional Random Fields
6	A Study On Chinese Location Names Recognition Based On Conditional Random Fields
7	Conditional Random Fields Based English Name Entity Recognition
8	The Research Of Conditional Random Fields Based Chinese Named Entity Recognition
9	Research On Chinese Named Entity Recognition For Information Extraction
10	Research Of Named Entity Recognition Based On Conditional Random Fields