A Named Entity Recognition System Based on Text Category

Posted on: 2010-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: X Wan
GTID: 2178360278966396
Subject: Signal and Information Processing
Abstract/Summary:
As one of the research fields of Artificial Intelligence, Natural Language Processing is a technology for acquiring, representing, and applying knowledge by computer, and it provides a more efficient and convenient way for humans and computers to interact. People increasingly use the Internet to obtain information and knowledge, so automatic machine understanding of text is the direction of development.

At present there are two main methods for named entity recognition: expert rules and statistical knowledge. Because hand-written rules cannot cover all the rules of a language, a named entity recognition system built on expert knowledge alone cannot reach a satisfactory result. Statistical methods achieve better results, but their computational cost is high and they do not cover special (low-probability) cases. The mainstream approach has therefore turned to combining statistical and expert knowledge.

This thesis proposes a new method based on the combination of statistical and expert knowledge. Most current named entity recognition systems already combine the two. To guarantee precision and recall, they often use complicated statistical models, which makes the systems run slowly. To solve this problem, this thesis draws on Natural Language Processing and text processing technology to make the system more efficient.

Our method uses a new named entity recognition process, which consists of the following steps:
1. Use a news spider to crawl news from the Internet to serve as test data.
2. Use an automatic web page parsing module to extract the news data from the crawled pages, handle encoding problems, and gather all the parsed news data together.
3. Categorize the segmented text using a text categorization method. The categories are chosen according to research results, and each text category corresponds to a specific entity recognition process logic.
4. Use a different entity recognition process for each category to recognize the categorized news data.

To guarantee the generality and validity of the experiment, all news test resources come from news sites on the Internet.

Finally, we find that the proposed model greatly improves processing efficiency while precision and recall do not drop much, which brings it closer to a practical named entity recognition system. The model is therefore meaningful and valuable.
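
The category-then-recognize idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the category names, keyword lists, regular-expression rules, and function names below are all hypothetical, the examples are in English rather than segmented Chinese news, and a simple keyword vote stands in for whatever trained categorizer the thesis used (its keywords mention Maximum Entropy). The sketch only shows how a cheap categorization step can route each text to a small, category-specific recognizer instead of one heavy general model.

```python
import re
from typing import Callable, Dict, List, Tuple

Entity = Tuple[str, str]  # (surface text, entity type)

# Hypothetical keyword lists standing in for a trained text categorizer.
CATEGORY_KEYWORDS: Dict[str, set] = {
    "sports": {"match", "team", "coach", "league"},
    "finance": {"shares", "bank", "profit", "market"},
}


def categorize(text: str) -> str:
    """Pick the category whose keywords overlap the text most (assumed rule)."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {cat: len(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


def recognize_sports(text: str) -> List[Entity]:
    # Illustrative rule: capitalized words after the trigger "coach" are a person.
    pattern = r"\bcoach ([A-Z][a-z]+(?: [A-Z][a-z]+)*)"
    return [(m.group(1), "PERSON") for m in re.finditer(pattern, text)]


def recognize_finance(text: str) -> List[Entity]:
    # Illustrative rule: capitalized words ending in "Bank" or "Corp" are an organization.
    pattern = r"\b(?:[A-Z][a-z]+ )+(?:Bank|Corp)\b"
    return [(m.group(0), "ORG") for m in re.finditer(pattern, text)]


# Each category maps to its own, cheaper recognition process.
RECOGNIZERS: Dict[str, Callable[[str], List[Entity]]] = {
    "sports": recognize_sports,
    "finance": recognize_finance,
}


def recognize(text: str) -> Tuple[str, List[Entity]]:
    """Categorize first, then run only the recognizer for that category."""
    category = categorize(text)
    if category in RECOGNIZERS:
        return category, RECOGNIZERS[category](text)
    # Uncategorized text falls back to running every rule set (the slow path).
    entities: List[Entity] = []
    for recognizer in RECOGNIZERS.values():
        entities.extend(recognizer(text))
    return category, entities


if __name__ == "__main__":
    for sample in (
        "The coach Li Ming praised the team after the match.",
        "Shares of Huaxia Bank rose as the market rallied.",
    ):
        print(recognize(sample))
```

The speed-up claimed in the abstract would come from the routing step: once a text is categorized, only the rules relevant to that category are evaluated, and the fallback branch is the only place where every recognizer runs.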
Keywords/Search Tags:Name Entity Recognition, Natural Language Processing, Maximum Entropy, Expert Knowledge