Research on Support Vector Machine Based Text Classification

Posted on: 2009-07-14  Degree: Master  Type: Thesis
Country: China  Candidate: S X Wang  Full Text: PDF
GTID: 2178360272480090  Subject: Computer application technology
Abstract/Summary:
Text classification is a key technology for content-based automatic classification of information: it is the process by which a computer automatically assigns texts to categories. Text classification has several characteristic properties: text vectors are highly sparse, the feature space is high-dimensional, and features are correlated with one another. As the number of categories, samples, and noisy examples grows, the classifier becomes slower and its results degrade. This thesis studies feature extraction and selection, and the key techniques of an improved support vector machine. For feature selection, a new category-based feature selection method is proposed. In classical feature selection methods, the statistics are not computed per category, so words that are significant only globally are selected; the new method addresses this shortcoming. For text representation, the vector space model is used, and the feature selection results are also used to determine the term weights. In the improved support vector machine classification method, the sample points are first processed with a clustering method, which speeds up classification and improves the results.
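
The idea of computing feature-selection statistics per category, rather than once over the whole corpus, can be illustrated with a minimal sketch. The abstract does not say which statistic the thesis uses, so the chi-square score below is an assumption chosen only for illustration; the function names `chi_square` and `select_features_per_category` and the toy documents are likewise hypothetical and are not taken from the thesis.

```python
from collections import Counter

def chi_square(a, b, c, d):
    """Chi-square score of a term for one category (2x2 contingency table).

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def select_features_per_category(docs, labels, top_k):
    """Return {category: top_k terms}, scoring every term separately per category."""
    n_docs = len(docs)
    cat_sizes = Counter(labels)
    df_total = Counter()                          # document frequency over the corpus
    df_cat = {cat: Counter() for cat in cat_sizes}  # document frequency per category
    for terms, label in zip(docs, labels):
        unique_terms = set(terms)
        df_total.update(unique_terms)
        df_cat[label].update(unique_terms)

    selected = {}
    for cat, size in cat_sizes.items():
        scores = {}
        for term, df in df_total.items():
            a = df_cat[cat][term]
            b = df - a
            c = size - a
            d = n_docs - size - b
            # Keep only terms positively associated with this category.
            if a * d > b * c:
                scores[term] = chi_square(a, b, c, d)
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        selected[cat] = ranked[:top_k]
    return selected

# Toy corpus (hypothetical): each document is a list of tokens.
docs = [
    ["goal", "match", "team"],
    ["team", "score", "match"],
    ["stock", "market", "price"],
    ["market", "trade", "price"],
]
labels = ["sports", "sports", "finance", "finance"]
selected = select_features_per_category(docs, labels, top_k=2)
```

Because each category keeps its own ranked term list, a word that is informative for only one category is not crowded out by words that score well globally. The per-category scores could also serve as term weights in a vector space model, in the spirit of the abstract, though the exact weighting scheme of the thesis is not given here.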
Keywords/Search Tags: data mining, text classification, feature selection, vector space model, support vector machine