
Automatic Recognition And Parsing Of Chinese Maximal-Length Noun Phrase

Posted on: 2010-10-12
Degree: Master
Type: Thesis
Country: China
Candidate: C Dai
Full Text: PDF
GTID: 2178360272985239
Subject: Computer application technology
Abstract/Summary:
The maximal-length noun phrase is a particular kind of noun phrase that commonly serves as the subject or object of a sentence. Automatic recognition of maximal-length noun phrases therefore supports shallow parsing and has practical value in many natural language processing applications, such as information retrieval, information extraction, and machine translation.

This thesis first reviews the current state of research in China and abroad and compares the different methods, then proposes a method for automatically recognizing Chinese maximal-length noun phrases that combines statistics and rules. Finally, phrase structure grammar parsing of the maximal-length noun phrase is implemented. The work in this thesis covers two parts.

First, recognition of the maximal-length noun phrase. Recognition is formulated as a sequence labeling task. Two general-purpose statistical models, the maximum entropy model and the conditional random field model, are each evaluated experimentally, and after comparative analysis the conditional random field is chosen as the statistical model of the recognition system. Based on careful error analysis, a post-processing rule base is constructed and applied to the output of the statistical model. The recognition system achieves an F-score of 90.0% in the open test.

Second, parsing of the maximal-length noun phrase. Construction of the phrase structure parse tree is recast as a layer-by-layer labeling task. An automatic parsing method based on cascaded conditional random fields is proposed, which combines phrases and passes them between layers, finally yielding the parse tree of the maximal-length noun phrase. The parsing system achieves an accuracy of 85.1% in the open test.

Through these two parts, the recognition and parsing methods are established, and a complete recognition and parsing system for Chinese maximal-length noun phrases is built, which recognizes and parses the maximal-length noun phrases in an input sentence or text. Further work is needed to improve performance.
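
To make the sequence labeling formulation concrete, the following is a minimal sketch of how noun phrase spans could be encoded as per-token labels and scored. The abstract does not state which tag set or scoring rule the thesis uses. The B/I/O tag scheme, the exact-match span F-score, the function names, and the example sentence are all assumptions made for illustration.

```python
from typing import List, Tuple

# A span is (start, end) over token indices, end exclusive.
Span = Tuple[int, int]


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping phrase spans as B/I/O tags (assumed scheme)."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B"              # first token of a phrase
        for i in range(start + 1, end):
            tags[i] = "I"              # remaining tokens of the phrase
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode B/I/O tags back into spans; a stray "I" opens a new span."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "I" and start is not None:
            continue                   # still inside the current phrase
        if start is not None:
            spans.append((start, i))   # current phrase ends before token i
            start = None
        if tag in ("B", "I"):
            start = i                  # a new phrase begins at token i
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def span_f1(gold: List[Span], pred: List[Span]) -> float:
    """Exact-match span F-score (an assumed metric, not confirmed by the thesis)."""
    gold_set, pred_set = set(gold), set(pred)
    true_pos = len(gold_set & pred_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical segmented sentence with two gold noun phrases.
tokens = ["我们", "的", "老师", "喜欢", "新", "的", "书"]
gold_spans = [(0, 3), (4, 7)]
gold_tags = spans_to_bio(len(tokens), gold_spans)

# Hypothetical model output that misses the start of the second phrase.
pred_tags = ["B", "I", "I", "O", "O", "B", "I"]
pred_spans = bio_to_spans(pred_tags)
score = span_f1(gold_spans, pred_spans)
```

In this framing, a statistical model such as a conditional random field would predict the tag sequence from token features, and a rule-based post-processing step would then adjust the decoded spans before scoring.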
Keywords/Search Tags: Maximal-length noun phrase, Maximum entropy model, Conditional random fields, Rule-based post-processing, Phrase structure grammar