Entity Recognition And Linking In Chinese Short Text

Posted on: 2017-09-23 | Degree: Master | Type: Thesis
Country: China | Candidate: X Luo | Full Text: PDF
GTID: 2348330488485677 | Subject: Computer application technology
Abstract/Summary:
As society becomes increasingly information-driven, the demand for natural language semantic understanding systems is growing. Research centered on named entities (NE) has become a major focus of semantic research. This thesis studies Named Entity Recognition (NER) and Entity Linking (EL) in Chinese short text and, considering the correlation between the two tasks, proposes a joint processing method.
Entity recognition and linking are basic tasks of text analysis and basic supporting modules for many natural language processing tasks. Most existing methods perform the two tasks in a pipeline: an NER system first finds the boundaries of named entities, and an EL system then links each named entity to a specific knowledge base entry. In this mode, errors made by the NER system frequently propagate to the EL system, which lacks sufficient information to correct them. The pipeline may be suitable for long text, since existing entity recognition systems perform well given enough training data and an ideal context; however, their performance declines considerably on short text.
To address error propagation in information extraction and linking for short text, and building on the coupling between NER and EL and their potential to reinforce each other, we propose two joint models for NER and EL: a linear model and a model based on semi-Markov Conditional Random Fields (Semi CRF). Researchers usually treat NER as a sequence labeling problem and EL as an entity ranking problem. Our linear model instead treats both NER and EL as a ranking problem: it generates as many candidate "mention-entity" pairs as possible, ranks them, and selects the most suitable ones. Our Semi CRF-based joint model treats the task as a sequence labeling problem and, when labeling a mention, uses as many entity-related features as possible.
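The ranking view taken by the linear model can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the toy knowledge base `KB`, the feature set, the hand-set `WEIGHTS`, and the greedy non-overlap selection are all assumptions made for illustration. In a real system the weights would be learned from annotated data.

```python
from dataclasses import dataclass

# Hypothetical toy knowledge base: surface form -> [(entity id, prior probability)].
KB = {
    "苹果": [("Apple_Inc.", 0.6), ("Apple_(fruit)", 0.4)],
    "苹果手机": [("iPhone", 0.9)],
}

# Hand-set weights for illustration only; a trained linear model would learn these.
WEIGHTS = {"prior": 1.0, "length": 0.5}


@dataclass(frozen=True)
class Candidate:
    start: int      # inclusive character offset of the mention
    end: int        # exclusive character offset of the mention
    mention: str
    entity: str
    prior: float


def generate_candidates(text, max_len=4):
    """Enumerate every span of up to max_len characters that matches a KB surface form."""
    candidates = []
    for i in range(len(text)):
        for j in range(i + 1, min(len(text), i + max_len) + 1):
            span = text[i:j]
            for entity, prior in KB.get(span, []):
                candidates.append(Candidate(i, j, span, entity, prior))
    return candidates


def score(c):
    """Linear score: weighted sum of the entity prior and the mention length."""
    feats = {"prior": c.prior, "length": float(len(c.mention))}
    return sum(WEIGHTS[name] * value for name, value in feats.items())


def select(candidates):
    """Rank all mention-entity pairs jointly and greedily keep non-overlapping ones."""
    chosen = []
    for c in sorted(candidates, key=score, reverse=True):
        if all(c.end <= o.start or c.start >= o.end for o in chosen):
            chosen.append(c)
    return sorted(chosen, key=lambda c: c.start)


if __name__ == "__main__":
    for c in select(generate_candidates("我买了苹果手机")):
        print(c.start, c.end, c.mention, c.entity)
```

Because boundary and entity are scored together, the longer mention "苹果手机" linked to `iPhone` outranks the shorter "苹果" candidates. No separate NER step commits to a boundary before linking information is available.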
When entity recognition and linking are processed jointly, richer features are available than when the two tasks are handled separately. Experiments on data provided by NLPCC 2015 indicate that our method is effective for entity recognition and linking in Chinese short text.
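To show how entity-side features can enter a segment-level labeler, here is a minimal sketch of semi-Markov Viterbi decoding in Python. It covers decoding only, with hand-set scores; a Semi CRF would learn feature weights by maximizing conditional likelihood. The toy knowledge base `KB_ENTRIES`, the two labels, and the scoring constants are assumptions made for illustration and are not taken from the thesis.

```python
# Hypothetical toy knowledge base of linkable surface forms (illustration only).
KB_ENTRIES = {"北京大学": "Peking_University", "北京": "Beijing"}

NEG_INF = float("-inf")


def segment_score(text, start, end, label):
    """Score for labeling text[start:end] as one segment with the given label."""
    span = text[start:end]
    if label == "MENTION":
        # Entity-side feature: reward spans that match a KB entry, scaled by length.
        return 1.5 * len(span) if span in KB_ENTRIES else -2.0
    # Non-entity text ("O") is only allowed as single-character segments.
    return 0.0 if end - start == 1 else NEG_INF


def semi_markov_viterbi(text, max_len=4, labels=("O", "MENTION")):
    """Find the highest-scoring segmentation of text into labeled segments."""
    n = len(text)
    best = [NEG_INF] * (n + 1)   # best[i]: best score of any segmentation of text[:i]
    back = [None] * (n + 1)      # back[i]: (start, label) of the last segment ending at i
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            for label in labels:
                s = best[start] + segment_score(text, start, end, label)
                if s > best[end]:
                    best[end] = s
                    back[end] = (start, label)
    # Follow back-pointers from the end of the text to recover the segments.
    segments = []
    end = n
    while end > 0:
        start, label = back[end]
        segments.append((start, end, label))
        end = start
    segments.reverse()
    return segments


if __name__ == "__main__":
    text = "我在北京大学"
    for start, end, label in semi_markov_viterbi(text):
        print(text[start:end], label)
```

Scoring whole segments rather than single characters is what lets a feature such as "this span matches a knowledge base entry" influence boundary decisions. In this toy example it favors "北京大学" as one mention over the shorter "北京".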
Keywords/Search Tags: Entity Recognition, Entity Linking, Semi CRFs, error propagation