Research On Improvements Of Chinese Part-of-Speech Tagging System Based On Statistical Model

Posted on: 2010-01-30; Degree: Master; Type: Thesis
Country: China; Candidate: C Tang; Full Text: PDF
GTID: 2178360278466156; Subject: Computer Science and Technology
Abstract/Summary:
With the development of computer and information technology, natural language is increasingly used in practical human-computer interaction, and Natural Language Processing (NLP) has consequently been studied more deeply and widely in recent years. Part-of-speech tagging is one of the fundamental tasks in shallow natural language processing. It provides the necessary foundation for higher-level tasks such as grammatical and semantic analysis, plays a vital role in practical NLP applications, and has long been studied extensively and in depth.

As one of the most common theoretical tools in computer science, statistical models are widely used in NLP. In this thesis, we study the performance of existing statistical models in Chinese part-of-speech tagging and analyze their characteristics. We compare the performance of different models experimentally and, to address the original statistical models' inadequate access to context information, we propose the concept of a reverse model. To further improve the tagging accuracy of the statistical model, we study the tagging errors in its results and introduce a rule-based improvement: rules are obtained with association rule mining algorithms from the Data Mining (DM) field and are used to add an error-correction module to the statistical model. In addition, we examine the structure of the statistical models and make improvements that raise the efficiency of the tagging system. The experiments show that these improvements not only raise the operating efficiency and tagging accuracy of the system, but also make the part-of-speech tagging system more scalable and more adaptable to different environments.
Keywords/Search Tags: Chinese part-of-speech tagging, statistical model, association rule mining, reverse model, Brill algorithm
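The abstract does not say which statistical model is used or how the reverse model is defined. As a rough illustration only, the sketch below assumes a bigram hidden Markov model with add-one smoothing and Viterbi decoding, and reads "reverse model" as the same model trained and decoded on reversed sentences, so that each tag is conditioned on the tag to its right instead of the tag to its left. The toy corpus, the tag labels, and every function name are hypothetical and are not taken from the thesis.

```python
import math
from collections import defaultdict

def train(sentences):
    """Count tag transitions and word emissions from lists of (word, tag) pairs."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for sent in sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, vocab

def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log probability of `key` under `counts`."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))

def viterbi(words, model):
    """Return the most probable tag sequence for `words` under a bigram HMM."""
    if not words:
        return []
    trans, emit, vocab = model
    tags = sorted(emit)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra outcome reserved for unknown words
    # Each column maps tag -> (best log score, backpointer to previous tag).
    best = [{t: (log_prob(trans["<s>"], t, n_tags)
                 + log_prob(emit[t], words[0], n_words), None) for t in tags}]
    for w in words[1:]:
        col = {}
        for t in tags:
            score, prev = max(
                (best[-1][p][0] + log_prob(trans[p], t, n_tags), p) for p in tags)
            col[t] = (score + log_prob(emit[t], w, n_words), prev)
        best.append(col)
    # Follow backpointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for col in reversed(best[1:]):
        tag = col[tag][1]
        path.append(tag)
    return path[::-1]

def train_reverse(sentences):
    """Assumed reading of 'reverse model': train on right-to-left sentences."""
    return train([sent[::-1] for sent in sentences])

def tag_reverse(words, model):
    """Decode right-to-left, then restore the original word order."""
    return viterbi(words[::-1], model)[::-1]

# Hypothetical toy corpus of pre-segmented Chinese words with illustrative tags.
corpus = [
    [("我", "r"), ("爱", "v"), ("北京", "ns")],
    [("他", "r"), ("学习", "v"), ("汉语", "n")],
    [("学习", "n"), ("很", "d"), ("重要", "a")],
]
forward = train(corpus)
reverse = train_reverse(corpus)

sentence = ["我", "学习", "汉语"]
print(viterbi(sentence, forward))
print(tag_reverse(sentence, reverse))
```

Positions where the forward and reverse taggers disagree are natural candidates for an error-correction step such as the rule-based module the abstract describes. How the thesis actually combines the two models is not stated here.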