Research On Text Classification Based On SVM

Posted on: 2017-11-25  Degree: Master  Type: Thesis
Country: China  Candidate: H X Zhang  Full Text: PDF
GTID: 2348330485958372  Subject: Information Science
Abstract/Summary:
Text classification is a key technology for processing and organizing large amounts of text information. It can effectively improve the efficiency of information management and use, and it has become an important research direction in the field of data mining. After analyzing the methods involved in Chinese text classification, namely Chinese word segmentation, text representation, feature reduction, classification algorithms, and result evaluation, this thesis focuses on feature reduction and classification algorithms. The main contents are as follows:
1. Both the KNN algorithm and the SVM algorithm are applied to text classification and their classification performance is compared, with emphasis on the advantages and disadvantages of using SVM for text classification.
2. The relationship between classification performance and the kernel function is analyzed for the polynomial kernel and the radial basis function (RBF) kernel. Classification performance is optimized by adjusting the kernel parameters, and guidance on parameter selection is provided.
3. Latent semantic analysis (LSA) is introduced into text classification, and a new text classification process named LSA_SVM is proposed, which uses LSA to reduce the feature dimension and SVM for classification. The results show that LSA_SVM achieves high and stable accuracy.
4. The influence of imbalanced classification data is studied. The differences between long text and short text in classification are compared, and the reasons for these differences are analyzed.
Keywords/Search Tags: Text classification, Support vector machine, Kernel function, Latent semantic analysis
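
The abstract compares the polynomial kernel and the radial basis function kernel and tunes their parameters, but it does not state the exact formulas or values used. As a minimal sketch, assuming the common parameterization with `gamma`, `coef0`, and `degree` (these names, the default values, and the toy vectors are illustrative assumptions, not taken from the thesis), the two kernels can be written as:

```python
import math


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree.

    Parameter names are an assumed convention, not the thesis's notation.
    """
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical term-weight vectors for two short documents.
doc_a = [1.0, 0.0, 2.0]
doc_b = [0.0, 1.0, 2.0]

# Show how the similarity score moves as the kernel parameter changes.
for gamma in (0.01, 0.1, 1.0):
    poly = polynomial_kernel(doc_a, doc_b, degree=2, gamma=gamma)
    rbf = rbf_kernel(doc_a, doc_b, gamma=gamma)
    print(f"gamma={gamma}: polynomial={poly:.4f}, rbf={rbf:.4f}")
```

With the RBF kernel, a larger `gamma` makes similarity fall off faster with distance, which is one reason classification performance can be sensitive to this parameter.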