Research On Classification Module Of Core Competency Assessment System

Posted on: 2007-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: H D Shen
GTID: 2178360182973361
Subject: Computer application technology
Abstract/Summary:
According to recent statistics from Forrester Research, more than 80 percent of the data on the Internet and on intranets exists in unstructured form. Knowledge discovery over such unstructured information is therefore more difficult, but also more valuable. Web Text Mining provides the methods and technology for analyzing and processing this text information, and text categorization is its core technique. This thesis focuses on the classification module of the Core Competency Assessment System, which addresses the problem of classifying a large amount of unorganized information. First, the thesis states the problem it addresses and summarizes the current state of Web Text Mining research at home and abroad, taking text categorization technology as its main research object. Secondly, it reviews and analyzes the basic theory of text categorization in Text Mining, and selects the Vector Space Model as the classification algorithm for the classification module. Thirdly, to address the deficiencies of the traditional term weighting algorithm, it proposes a concept-based term weighting algorithm that takes distribution information into account. Fourthly, it designs and implements the classification module within the Core Competency Assessment System, and verifies that the improved term weighting algorithm outperforms the traditional one in both recall and precision. Finally, it summarizes the work and outlines directions for further study.
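For orientation, a common form of traditional term weighting in the Vector Space Model is TF-IDF, with documents compared by cosine similarity. The following is a minimal sketch under that assumption. The abstract does not give the exact baseline formula or the improved algorithm, so this is not the thesis's method, and the names `tfidf_vectors` and `cosine` are illustrative only.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse {term: weight} vector per document.

    docs is a list of token lists. Weights use plain TF-IDF
    (an assumed baseline, not the thesis's improved weighting).
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["stock", "market", "price"],
    ["football", "match", "score"],
    ["stock", "price", "rise"],
]
vectors = tfidf_vectors(docs)
```

In this sketch a term that occurs in every document receives weight zero, because plain IDF looks only at how many documents contain the term and ignores how the term is spread across categories. That is the kind of distribution information the abstract says the improved weighting takes into account.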
Keywords/Search Tags: vector space model, term weight, unstructured data, feature selection, text categorization