
Text Categorization And Application In Secret File Management

Posted on: 2007-10-30
Degree: Master
Type: Thesis
Country: China
Candidate: X C Dong
GTID: 2178360212958430
Subject: Computer technology
Abstract/Summary:
The 21st century is an information age, and the global quantity of information grows exponentially. We therefore find ourselves in an awkward position: raw resources are abundant, yet knowledge is scarce. How to obtain the content we need rapidly and accurately has become a central concern, and text categorization based on artificial intelligence has emerged in response. Given a category system, text categorization is the process of automatically assigning a text to a category on the basis of its content. From a mathematical point of view, it is a mapping from unclassified texts to a set of categories.

This dissertation studies the key techniques and typical methods of text categorization and presents a categorization method based on a word vector space model. The main work is as follows.

First, the dissertation gives a brief introduction to text categorization: its background, the current state of research, basic concepts, and workflow.

Second, the key techniques of text categorization are introduced, including word segmentation, text representation, weight calculation, and feature selection. The principles and algorithms of K-nearest neighbor (KNN), Naive Bayes (NB), Support Vector Machine (SVM), neural networks, and other classifiers are described in detail.

Third, a new method is proposed. Among the classifiers examined, NB is the fastest, with KNN second, and SVM is the most precise, again with KNN second. Building on these characteristics, and on the principle that the number of words in common use in Chinese is finite, the dissertation presents a text categorization method based on a word vector space model. Experiments show that this method improves precision and stability and markedly improves time and space efficiency.

Lastly, the categorization method is applied to a secret file management system, which has been designed and implemented.
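As a rough illustration of the generic pipeline the abstract names (segmentation, text representation, weight calculation, KNN classification), the following minimal sketch classifies short texts with TF-IDF weights and cosine similarity. It is not the dissertation's word vector space method, whose details the abstract does not give. The whitespace tokenizer, the smoothed IDF formula, the toy training set, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace splitting stands in for real word segmentation (assumption);
    # Chinese text would need a proper segmenter.
    return text.lower().split()


def tfidf_vectors(docs):
    # docs is a list of token lists. Returns sparse TF-IDF vectors (dicts)
    # and the IDF table, using a smoothed IDF (one common variant).
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(text, train_vectors, train_labels, idf, k=3):
    # Weight the query with the training IDF; unseen terms are ignored.
    tf = Counter(tokenize(text))
    query = {t: c * idf[t] for t, c in tf.items() if t in idf}
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    # Majority vote among the k most similar training documents.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training data, not taken from the dissertation.
TRAIN = [
    ("budget report quarterly finance revenue", "finance"),
    ("revenue forecast finance audit", "finance"),
    ("network firewall intrusion security patch", "security"),
    ("security audit firewall password policy", "security"),
]
LABELS = [label for _, label in TRAIN]
VECTORS, IDF = tfidf_vectors([tokenize(text) for text, _ in TRAIN])

print(knn_classify("firewall password breach", VECTORS, LABELS, IDF))
print(knn_classify("quarterly revenue budget", VECTORS, LABELS, IDF))
```

KNN is used here only because it needs no training step beyond building the vectors; NB or SVM could be substituted at the classification stage without changing the representation.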
Keywords/Search Tags: Text Categorization, Word Vector Space Model, Secret File Management