
Research Of Automatic Categorization System For Chinese Text About Complaining Information

Posted on: 2011-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: S Chen
GTID: 2178360305488810
Subject: Computer software and theory
Abstract/Summary:
The network has become the main source from which people obtain information, and the rapid development of the Internet has greatly enriched information resources. As a key technology for organizing and managing information effectively, text categorization has broad application prospects. This thesis is based on a project of the Ideal Institute of Information and Technology at Northeast Normal University: an intelligent, integrated-services network for the Mayor's public telephone line in Changchun. The author studies how to categorize complaint information from citizens effectively. The project's original categorization system was built on a statistical method. Although it achieved some results in practical use, its weakness was poor categorization precision. To improve precision, the author set out to improve the method. The main work is as follows. First, a categorization system library for the complaint domain was constructed on the basis of key word phrases, and the formal description and storage form of key word phrases were studied in depth. Second, a fuzzy thesaurus was built to extend the synonyms in the segmentation dictionary and thereby improve the precision of word segmentation. Finally, the author studied the segmentation and categorization algorithms commonly used in automatic categorization systems, and presents an improved maximum matching method and an improved KNN that uses feature aggregations within key word phrases. Combining this work, the author proposes and implements an architecture for a Chinese text categorization system for complaint information, discusses the key steps of the running system, and tests its categorization performance.
The experimental results show that precision was effectively improved and that recall improved as well. The improvements made to the system are therefore effective and feasible.
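The abstract names its techniques but does not describe the improved variants, so the following is only a minimal Python sketch of the textbook baselines they build on: plain forward maximum matching for segmentation, a synonym lookup standing in for the fuzzy thesaurus, and ordinary cosine-similarity KNN voting. The dictionary entries, the thesaurus mapping, the category labels, and all function names are hypothetical illustrations, not material from the thesis.

```python
import math
from collections import Counter

def forward_max_match(text, dictionary, max_len=4):
    """Greedy forward maximum matching: take the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; a single character is always accepted.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens

def normalize(tokens, thesaurus):
    """Map each token to its canonical synonym, if the thesaurus lists one."""
    return [thesaurus.get(token, token) for token in tokens]

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_classify(query_tokens, labeled_docs, k=3):
    """Majority vote among the k labeled documents most similar to the query."""
    query = Counter(query_tokens)
    scored = sorted(
        ((cosine(query, Counter(tokens)), label) for tokens, label in labeled_docs),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data (assumed for illustration only).
DICTIONARY = {"小区", "停水", "断水", "夜间", "施工", "噪音", "维修"}
THESAURUS = {"断水": "停水"}  # treat the two words for a water outage as synonyms
LABELED = [
    (["小区", "停水"], "water supply"),
    (["停水", "维修"], "water supply"),
    (["夜间", "施工", "噪音"], "noise"),
    (["施工", "噪音"], "noise"),
]

complaint = normalize(forward_max_match("小区断水", DICTIONARY), THESAURUS)
category = knn_classify(complaint, LABELED)
```

Normalizing synonyms before classification lets a complaint worded with 断水 match training examples worded with 停水, which is one plausible reading of why the abstract pairs thesaurus extension with segmentation precision.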
Keywords/Search Tags: Chinese text categorization, word segmentation, key words phrase, fuzzy thesaurus, KNN