Research On Classification Algorithms For Professional Theme

Posted on: 2008-10-06  Degree: Master  Type: Thesis
Country: China  Candidate: J Guan  Full Text: PDF
GTID: 2178360215479112  Subject: Computer software and theory
Abstract/Summary:
With the rapid development of information technology, and especially the spread of the Internet, the number of web pages is growing at a massive rate. Because most web page content is text, automatically classifying pages according to their text has become an important research topic. General-purpose search engines usually give priority to resources of general interest, so they have difficulty meeting users' demand for professional information in a specific field. Moreover, the sheer volume and dynamic nature of information on the network make it impossible for any search engine to index all of it. As a result, topic-specific search engines aimed at a particular field have become an important development trend. Automatic text classification is a key part of this trend: given a predefined category system, it determines the category of a text automatically from the text's content, in support of information search. This thesis first reviews the current state of research on automatic text classification in China and abroad. It then studies the general process and key technologies of automatic text classification, including training sample collection, feature selection algorithms, threshold strategies, and the main classification algorithms, and explores them through experimental analysis. Finally, it proposes a design for a Chinese web page classifier and introduces the overall framework, system workflow, and functional modules of a Chinese text classification system based on the vector space model.
Keywords/Search Tags: Search Engine, Data Mining, Automatic Text Categorization, Education Resources, kNN, Vector Space Model
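The abstract names kNN, the vector space model, and threshold strategies without showing how they fit together. The following is a minimal sketch of that combination, not the thesis's actual implementation: documents are assumed to be already segmented into tokens (Chinese text would need a word segmenter first), term weights use a smoothed TF-IDF chosen here for illustration, and the function names, the value of k, and the score threshold are all hypothetical.

```python
import math
from collections import Counter


def weight(tokens, idf):
    """Map a token list to a sparse TF-IDF vector (dict: term -> weight).
    Terms never seen in training are ignored."""
    tf = Counter(tokens)
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def tf_idf_vectors(docs):
    """Build TF-IDF vectors for the training documents.
    The smoothed IDF formula below is an assumption, not taken from the thesis."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [weight(doc, idf) for doc in docs], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(tokens, train_vectors, train_labels, idf, k=3, threshold=0.0):
    """Return the category with the highest summed similarity among the
    k nearest training documents, or None if that score does not exceed
    the threshold (a simple score-cut threshold strategy)."""
    query = weight(tokens, idf)
    neighbours = sorted(
        ((cosine(query, vec), label)
         for vec, label in zip(train_vectors, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    scores = Counter()
    for sim, label in neighbours:
        scores[label] += sim
    if not scores:
        return None
    label, score = max(scores.items(), key=lambda pair: pair[1])
    return label if score > threshold else None


# Toy training set with made-up categories, for illustration only.
train_docs = [
    ["exam", "school", "teacher", "lesson"],
    ["student", "school", "course", "lesson"],
    ["stock", "market", "price", "trade"],
    ["bank", "market", "loan", "price"],
]
train_labels = ["education", "education", "finance", "finance"]
train_vectors, idf = tf_idf_vectors(train_docs)
```

Calling `knn_classify(["school", "lesson", "student"], train_vectors, train_labels, idf)` assigns the query to the category whose nearest neighbours are most similar to it. A query that shares no terms with the training set scores zero and is rejected by the threshold, which is one simple way to keep off-topic pages out of a topic-specific index.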