The Study Of Chinese Text Classification Based On FOA-SVM

Posted on: 2015-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: B Xue
Full Text: PDF
GTID: 2298330452494461
Subject: Computer application technology
Abstract/Summary:
With the arrival of the big data era, people facing ever-growing volumes of data have come to fully realize its importance, and how to mine and organize these data has become a focus of attention. Text classification is a valuable research subject in this setting: it supports intelligent retrieval, information filtering, web page classification and sentiment analysis. Text categorization uses computer technology to automatically assign a given text to one or more categories from a predetermined set. Its key technologies include the text representation model, text feature extraction and the classification method. The choice of feature selection method and classification method directly affects the quality of text categorization.

The support vector machine (SVM) can handle small-sample, high-dimensional problems and has strong learning and generalization ability, which has made it a popular method in text categorization research. The fitting precision and generalization ability of an SVM depend largely on the kernel function parameters and the penalty parameter. Traditional optimization algorithms used for SVM parameter optimization, such as particle swarm optimization, easily fall into local extrema, so the fruit fly optimization algorithm (FOA) is proposed for obtaining the SVM parameters, with the classification accuracy formula serving as the flavor (smell) concentration decision function. Tests on UCI standard data sets show that, compared with the particle swarm algorithm, the genetic algorithm and the colony algorithm, the proposed optimization algorithm has strong global search ability and good robustness, and achieves higher classification accuracy.

Building on a review of text feature extraction and classification algorithms, 2490 texts in six categories were selected from the Sogou corpus on the Internet. Five feature selection methods are compared at different feature dimensions; the text evidence method at a feature dimension of 900 shows the strongest and most stable performance. The SVM optimized by the fruit fly optimization algorithm (FOA-SVM) is then applied to the text categorization problem. With the text evidence method used for feature selection and a feature dimension of 900, FOA-SVM is compared experimentally with KNN, SVM and PSO-SVM. Among the four classifiers, FOA-SVM scores highest on the evaluation indices: macro-averaged recall, average recall and average precision. FOA-SVM achieves high modeling precision and strong generalization ability.
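The parameter search described in the abstract can be illustrated with a minimal sketch of a basic fruit fly optimization loop. This is not the thesis implementation. It assumes a common FOA formulation in which each tuned parameter has its own (x, y) swarm location and the candidate value is the reciprocal of the distance to the origin. The names `foa_search` and `toy_accuracy` are hypothetical, and `toy_accuracy` is a made-up surrogate that stands in for the classification accuracy of an SVM trained with a given penalty parameter and kernel parameter.

```python
import math
import random


def toy_accuracy(c, gamma):
    """Hypothetical stand-in for SVM classification accuracy at (C, gamma).

    In a real run this would train an SVM and return its validation accuracy.
    The peak at C=2.0, gamma=0.5 is invented purely for illustration.
    """
    return math.exp(-((c - 2.0) ** 2 + (gamma - 0.5) ** 2))


def foa_search(fitness, n_params=2, n_flies=20, n_iters=100, step=1.0, seed=0):
    """Basic fruit fly optimization: maximize fitness over positive parameters."""
    rng = random.Random(seed)
    # Swarm location: one (x, y) point per parameter being tuned.
    swarm = [(rng.uniform(0.1, 1.0), rng.uniform(0.1, 1.0))
             for _ in range(n_params)]
    best_params, best_smell = None, float("-inf")

    for _ in range(n_iters):
        iter_best = None  # (smell, positions, params) of the best fly this round
        for _ in range(n_flies):
            # Smell search: each fly takes a random step away from the swarm.
            pos = [(x + rng.uniform(-step, step), y + rng.uniform(-step, step))
                   for x, y in swarm]
            # Candidate parameter = reciprocal of the distance to the origin.
            params = [1.0 / max(math.hypot(x, y), 1e-12) for x, y in pos]
            # Flavor concentration = fitness (accuracy) at the candidate.
            smell = fitness(*params)
            if iter_best is None or smell > iter_best[0]:
                iter_best = (smell, pos, params)
        # Vision search: the swarm moves only if this round beat the best so far.
        if iter_best[0] > best_smell:
            best_smell, swarm, best_params = iter_best
    return best_params, best_smell
```

Calling `foa_search(toy_accuracy)` returns the best parameter pair found and its score. To tune a real classifier, the surrogate would be replaced by a function that trains an SVM with the candidate parameters and returns its accuracy on held-out data.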
Keywords/Search Tags: text classification, support vector machine, fruit fly optimization algorithm, parameter optimization, classification algorithm