The Research Of Hierarchical Automatic Text Classification Based On The Knowledge Database

Posted on: 2015-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Zhang
Full Text: PDF
GTID: 2268330425488284
Subject: Information Science
Abstract/Summary:
With the growing popularity of computer technology and the rapid development of the Internet, the amount of information available to users has grown exponentially, greatly enriching their information environment. At the same time, this growth has caused "information overload" and related problems, making it harder for users to find the information resources they need. Text classification, a simple and effective solution, is regarded as a key technology for organizing and processing large amounts of text data; it has attracted widespread attention and has broad application prospects.

Current research on text classification focuses mainly on improving algorithms. This paper instead seeks to improve performance from the perspective of knowledge organization: it builds a multi-level knowledge database from an empirical database combined with the category hierarchy of the Chinese Library Classification ("CLC"), and performs hierarchical classification based on that knowledge database. The paper is divided into four parts:

1) Introduction: The author introduces the background and significance of the research and outlines the main content and structure of the paper.
2) Theory and overview: The author gives a detailed introduction to knowledge databases and multi-level text classification, covers the theoretical basis of both concepts, and summarizes related research in China and abroad.
3) Experimental design: Building on the theory and literature review, the author proposes a new research method with two main modules. The first builds a multi-level text classification knowledge database based on the "CLC"; the second runs hierarchical classification experiments based on that knowledge database. Training on more than 60,000 empirical data items covering 1,497 classes and testing on 300 test items, the author confirms that adding category hierarchy information can improve classification performance.
4) Summary: The author analyzes the results of the study in depth, discusses its shortcomings, and outlines further work.
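
The abstract does not describe the classifier itself, so the following is only a minimal sketch of the general idea of hierarchical classification over a category-keyed knowledge database, not the author's method. It assumes that each training item carries a path of category codes from the top level down to a leaf, that the knowledge database is simply a term-frequency table per category node, and that classification descends the tree one level at a time. The function names, the scoring rule, and the toy codes and terms are all hypothetical.

```python
from collections import Counter, defaultdict


def build_knowledge_base(samples):
    """Build per-node term counts and a child index from labelled samples.

    samples: iterable of (tokens, path), where path is a tuple of category
    codes from the top level down to the leaf (a hypothetical representation).
    """
    term_counts = defaultdict(Counter)  # node path -> term frequencies
    children = defaultdict(set)         # node path -> set of child node paths
    for tokens, path in samples:
        for depth in range(1, len(path) + 1):
            node = path[:depth]
            # Every ancestor accumulates the terms of its descendants.
            term_counts[node].update(tokens)
            children[path[:depth - 1]].add(node)
    return term_counts, children


def score(tokens, counts):
    """Share of a node's term mass matched by the document (assumed scoring rule)."""
    total = sum(counts.values())
    return sum(counts[t] for t in tokens) / total if total else 0.0


def classify(tokens, term_counts, children):
    """Descend from the root, choosing the best-scoring child at each level."""
    node = ()
    while children.get(node):
        # sorted() makes tie-breaking deterministic.
        node = max(sorted(children[node]),
                   key=lambda child: score(tokens, term_counts[child]))
    return node


# Toy data; the codes and terms are illustrative only.
samples = [
    (["neural", "network", "classification"], ("T", "TP")),
    (["database", "index", "query"], ("T", "TP")),
    (["bridge", "concrete", "load"], ("T", "TU")),
    (["library", "catalog", "classification"], ("G", "G2")),
]
term_counts, children = build_knowledge_base(samples)
predicted = classify(["concrete", "bridge"], term_counts, children)
```

In this sketch the hierarchy narrows the candidate set at each level, so a document is compared only with the sibling categories under the branch already chosen rather than with every leaf class at once.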
Keywords/Search Tags: Hierarchical Classification, Knowledge Database, Text Classification, "CLC"