
A Study On Chinese Simple Noun Phrase Recognition

Posted on: 2015-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Sun
Full Text: PDF
GTID: 2298330467486701
Subject: Computer application technology
Abstract/Summary:
Both the Basic Noun Phrase (BNP) and the Maximal Noun Phrase (MNP) have limitations in noun-phrase-based machine translation. We therefore propose a new kind of Chinese noun phrase, the Chinese Simple Noun Phrase (SNP), which helps reduce translation errors caused by inconsistent phrase recognition and thereby improves translation accuracy.

For recognition, we propose a hybrid strategy that combines a statistical model with post-processing rules: the rules revise the output of the statistical recognizer. In the statistical learning phase, we use conditional random fields (CRFs) as the sequence labeling model. To obtain a better model, we add trigram templates, which carry more information, to the basic feature template, and combine the two into a full feature template. The model trained with the full feature template achieves a precision, recall, and F-value of 88.45%, 90.33%, and 89.40% in NP recognition, improvements of 0.04, 0.05, and 0.07 percent, respectively, over the model trained with the basic template.

To improve the precision of NP extraction, we build a set of post-processing rules that target the errors in the statistical results, drawing on resources such as a single-verb dictionary, the Synonyms Cilin thesaurus, and a word collocation frequency database. We establish four kinds of rules: parallel rules based on semantic similarity, single-verb rules based on the single-verb dictionary, degree adverb rules, and prop-noun rules. Experimental results show that, with the post-processing rules applied, the precision, recall, and F-value of NP extraction reach 90.07%, 90.62%, and 90.34%, a gain of 1.62 percentage points in precision over the purely statistical method.
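The abstract does not list the actual feature templates, so the following is only a minimal sketch of the general idea: a basic template (words and part-of-speech tags in a small window) extended with trigram templates and merged into a full template for a CRF sequence labeler. The window size, the feature names, the `<PAD>` token, the POS tags, and the helper functions `basic_features`, `trigram_features`, and `full_features` are all illustrative assumptions, not the thesis's own definitions.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]          # (word, part-of-speech tag); tag set is assumed
PAD: Token = ("<PAD>", "<PAD>")  # hypothetical padding token for sentence edges


def _at(sent: List[Token], i: int) -> Token:
    """Return the token at position i, or PAD when i is out of range."""
    return sent[i] if 0 <= i < len(sent) else PAD


def basic_features(sent: List[Token], i: int) -> Dict[str, str]:
    """Assumed basic template: words and tags in a +/-2 window, plus tag bigrams."""
    feats: Dict[str, str] = {}
    for off in (-2, -1, 0, 1, 2):
        word, pos = _at(sent, i + off)
        feats[f"w[{off}]"] = word
        feats[f"p[{off}]"] = pos
    feats["p[-1]|p[0]"] = f"{_at(sent, i - 1)[1]}|{_at(sent, i)[1]}"
    feats["p[0]|p[1]"] = f"{_at(sent, i)[1]}|{_at(sent, i + 1)[1]}"
    return feats


def trigram_features(sent: List[Token], i: int) -> Dict[str, str]:
    """Assumed trigram templates: three consecutive words or tags around position i."""
    w = {off: _at(sent, i + off)[0] for off in (-1, 0, 1)}
    p = {off: _at(sent, i + off)[1] for off in (-2, -1, 0, 1, 2)}
    return {
        "w[-1]|w[0]|w[1]": f"{w[-1]}|{w[0]}|{w[1]}",
        "p[-2]|p[-1]|p[0]": f"{p[-2]}|{p[-1]}|{p[0]}",
        "p[-1]|p[0]|p[1]": f"{p[-1]}|{p[0]}|{p[1]}",
        "p[0]|p[1]|p[2]": f"{p[0]}|{p[1]}|{p[2]}",
    }


def full_features(sent: List[Token], i: int) -> Dict[str, str]:
    """Full template = basic template + trigram templates."""
    feats = basic_features(sent, i)
    feats.update(trigram_features(sent, i))
    return feats


if __name__ == "__main__":
    # Toy sentence; the POS tags here are illustrative only.
    sentence = [("我", "PN"), ("喜欢", "VV"), ("红", "JJ"), ("苹果", "NN")]
    for idx in range(len(sentence)):
        print(sentence[idx][0], full_features(sentence, idx))
```

Each token's feature dictionary would then be paired with a chunk label (for example a B/I/O tag marking SNP boundaries, which is again an assumption) and passed to a CRF trainer.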
Keywords/Search Tags: Information extraction, Simple Noun Phrase, CRFs, Semantic similarity, Post-processing