
Research On Chinese Named Entity Recognition And Field Application In Inspection And Quarantine

Posted on: 2020-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Liang
GTID: 2428330575959733
Subject: Computer technology
Abstract/Summary:
With economic globalization and the rapid spread of the Internet, a large amount of information about the flow of goods, disease control, food safety and related topics is published on Internet platforms. The inspection and quarantine department needs to extract the key content from news and act on it. Chinese named entity recognition is an important part of information extraction. In inspection and quarantine information extraction, the product entity is the main object of the information, so product entities are the focus of entity recognition in this thesis.

Chinese named entity recognition is difficult because of the characteristics of the language: there are no explicit word boundaries, entity types are varied, and entity structure is complex. Product entities in text cover a wide range and follow no strict rules. Recognizing product entities in Chinese corpora quickly and accurately therefore has both research significance and practical value for the inspection and quarantine field.

This thesis proposes a framework for recognizing Chinese named entities in specific domains. The framework consists of two modules: semi-automatic corpus construction and a Chinese named entity recognition model. Corpus construction relies on a phrase extraction algorithm based on mutual information and left and right entropy. This is an unsupervised phrase recognition method, and it is used to build a set of candidate entities. The recognition module combines a neural network with a conditional random field (CRF). The thesis pairs different neural networks with a CRF to form composite models, and then proposes a method for Chinese product entity recognition based on the lattice long short-term memory network (Lattice LSTM) and a CRF.

Three entity recognition models, IDCNN+CRF, BiLSTM+CRF and Lattice LSTM+CRF, are compared in experiments on the MSRA, Boson, People's Daily and Chinese resume datasets, to examine how well each model recognizes Chinese named entities in different domains.

The framework has been built and applied in the inspection and quarantine field. The phrase extraction algorithm based on mutual information and left and right entropy enables semi-automatic construction of the labeled corpus, and Chinese product entities are recognized with the Lattice LSTM+CRF model. Compared with a CRF using traditional hand-crafted features and with the BiLSTM+CRF model, the Lattice LSTM+CRF model performs better. The experiments also demonstrate the feasibility and effectiveness of the named entity recognition framework.
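As an illustration of the phrase extraction step described in the abstract, the following is a minimal sketch of scoring candidate phrases by pointwise mutual information and by left and right entropy. It is not the thesis's implementation: the function names `score_bigrams` and `select_candidates`, the restriction to two-character candidates, and the threshold parameters are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def entropy(counter):
    """Shannon entropy (natural log) of the distribution given by raw counts."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log(c / total) for c in counter.values())


def score_bigrams(corpus):
    """Return {bigram: (pmi, left_entropy, right_entropy)} for character bigrams.

    Hypothetical helper: only two-character candidates are considered here.
    """
    char_counts = Counter()
    gram_counts = Counter()
    left_neighbors = defaultdict(Counter)   # characters seen just before a bigram
    right_neighbors = defaultdict(Counter)  # characters seen just after a bigram

    for sentence in corpus:
        char_counts.update(sentence)
        for i in range(len(sentence) - 1):
            gram = sentence[i:i + 2]
            gram_counts[gram] += 1
            if i > 0:
                left_neighbors[gram][sentence[i - 1]] += 1
            if i + 2 < len(sentence):
                right_neighbors[gram][sentence[i + 2]] += 1

    total_chars = sum(char_counts.values())
    total_grams = sum(gram_counts.values())

    scores = {}
    for gram, count in gram_counts.items():
        p_gram = count / total_grams
        p_first = char_counts[gram[0]] / total_chars
        p_second = char_counts[gram[1]] / total_chars
        # High PMI: the two characters co-occur more often than chance predicts.
        pmi = math.log(p_gram / (p_first * p_second))
        # High neighbor entropy: the bigram appears in varied contexts,
        # which suggests it is a free-standing unit.
        scores[gram] = (pmi, entropy(left_neighbors[gram]),
                        entropy(right_neighbors[gram]))
    return scores


def select_candidates(scores, min_pmi, min_entropy):
    """Keep bigrams whose PMI and both neighbor entropies reach the thresholds."""
    return sorted(
        gram for gram, (pmi, left_h, right_h) in scores.items()
        if pmi >= min_pmi and min(left_h, right_h) >= min_entropy
    )
```

Calling `select_candidates(score_bigrams(sentences), min_pmi, min_entropy)` on a list of sentence strings yields the candidate set. The thresholds are illustrative and would need tuning on real data. Taking the minimum of the two entropies is one common design choice, since a true phrase should vary freely on both sides.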
Keywords/Search Tags: Chinese Product Named Entity Recognition, Lattice LSTM model, Conditional Random Field, Phrase Extraction