Research on Named Entity Recognition Combined with Machine Learning Methods

Posted on: 2007-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y G Shi
Full Text: PDF
GTID: 2208360185455746
Subject: Computer application technology
Abstract/Summary:
Named Entity Recognition (NER) has recently become an active research topic in Natural Language Processing. The Message Understanding Conference (MUC) defines named entities as the proper nouns and quantifiers that people are interested in. Named entities can be classified into person names, locations, organizations, dates, numbers, and so on. As a subtask of Information Extraction, NER has been applied to many computational linguistics tasks, such as machine translation.

Like most Natural Language Processing techniques, NER methods fall into two classes: statistics-based and rule-based. Given the limitations of using either method alone, this thesis combines both to recognize named entities. We also combine machine learning with NER to give the system a self-learning ability. We mainly studied decision tree learning and designed a recognition model on that basis. The model first uses probabilistic and statistical methods to extract potential named entities, and then uses contextual linguistic information to examine those candidates further. Because wrongly extracted entities are rejected at this second step, recognition performance improves.

Using these methods, we focused on Chinese person names and location names. The experimental results show that the strategy combining rules and statistics performs better than either method alone. Under the same experimental conditions, the model combined with machine learning is simpler to construct and has better adaptability and self-learning ability.

This thesis is organized into four parts:
1. Text preprocessing.
2. Chinese person name and location recognition based on statistics and rules.
3. Chinese person name and location recognition combined with machine learning methods.
4. Disambiguation of Chinese person names and locations.
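
The two-stage idea described in the abstract (statistical candidate extraction followed by contextual filtering) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: every probability, cue word, threshold, and function name below is invented for illustration, and the hand-written context check only stands in for the decision tree that the thesis learns from data.

```python
# Toy two-stage sketch: statistical candidate extraction, then context filtering.
# All tables, thresholds, and names here are hypothetical illustration values.

SURNAME_PROB = {"王": 0.9, "李": 0.9, "张": 0.9}          # P(char is used as a surname)
GIVEN_PROB = {"小": 0.6, "明": 0.7, "红": 0.6, "伟": 0.7}  # P(char is used in a given name)
LEFT_CUES = {"教授", "先生", "记者"}   # words that often precede a person name
RIGHT_CUES = {"说", "表示", "认为"}    # words that often follow a person name


def extract_candidates(text, threshold=0.3):
    """Stage 1: score 'surname + 1 or 2 given-name chars' by a product of probabilities."""
    candidates = []
    for i, ch in enumerate(text):
        if ch not in SURNAME_PROB:
            continue
        for length in (1, 2):
            given = text[i + 1:i + 1 + length]
            if len(given) < length:
                continue  # ran past the end of the text
            score = SURNAME_PROB[ch]
            for g in given:
                score *= GIVEN_PROB.get(g, 0.0)
            if score >= threshold:
                candidates.append((i, i + 1 + length, score))
    return candidates


def has_context_support(text, start, end):
    """Stage 2: accept a candidate only if a cue word appears right before or after it."""
    left = text[max(0, start - 2):start]
    right = text[end:end + 2]
    return (any(left.endswith(cue) for cue in LEFT_CUES)
            or any(right.startswith(cue) for cue in RIGHT_CUES))


def recognize_names(text):
    """Run both stages; for candidates sharing a start, keep the longest accepted span."""
    best_end = {}
    for start, end, _score in extract_candidates(text):
        if has_context_support(text, start, end):
            best_end[start] = max(end, best_end.get(start, 0))
    return [text[s:e] for s, e in sorted(best_end.items())]


print(recognize_names("记者王小明说今天天气很好"))  # candidate supported by context
print(recognize_names("张明天去北京"))              # statistical candidate rejected by context
```

In the second call, the statistical stage proposes "张明" because both characters are plausible name characters, but no cue word supports it, so the context stage rejects it. A learned decision tree would replace `has_context_support` with splits chosen from labeled examples rather than fixed word lists.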
Keywords/Search Tags: Named Entity Recognition, machine learning, statistics and rules, decision tree algorithm