
Research on a Text Classification Algorithm Based on KPCA and SOFM Neural Network

Posted on: 2013-04-15
Degree: Master
Type: Thesis
Country: China
Candidate: X X Wang
Full Text: PDF
GTID: 2248330374467001
Subject: Communication and Information System
Abstract/Summary:
In an era of rapid growth in network information, how to obtain valid information from the network quickly and accurately has become a focus of current research. Text classification is an important means of realizing information retrieval. Intelligent, content-based text classification serves as a basis for information management and is widely used in text filtering, text organization, topic tracking and detection, and similar tasks.

Based on a thorough study of text classification and retrieval techniques, and in view of the nonlinear character of text data and the shortcomings of traditional feature dimension reduction and classification algorithms, this thesis puts forward a feature dimension reduction algorithm based on kernel principal component analysis and a text classification algorithm based on the self-organizing feature map neural network.

Kernel principal component analysis (KPCA) is a multivariate statistical analysis technique. It has a great advantage in handling high-dimensional nonlinear problems and, relative to feature selection, can provide more information. The self-organizing feature mapping (SOFM) neural network can process distributed information in parallel on a large scale; in addition, it offers learning ability, fast convergence, the ability to reach a global optimum, and a self-organizing clustering function.

Combining the advantages of the KPCA feature dimension reduction algorithm with the SOFM neural network algorithm, a classification model is constructed.
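As a rough illustration of the KPCA dimension reduction step named above, the following is a minimal NumPy sketch. It is not the thesis's implementation: the RBF kernel, the `gamma` value, the number of components, and the random stand-in for document feature vectors are all assumptions made for the example, and the helper names `rbf_kernel` and `kpca_fit_transform` are hypothetical.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq_dist = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def kpca_fit_transform(X, n_components, gamma):
    """Project the training samples onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)

    # Center the kernel matrix, which centers the data in feature space.
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Training-sample projections are the eigenvectors scaled by sqrt(eigenvalue).
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

# Assumed toy input: 40 "documents" with 200 term-weight features each.
rng = np.random.default_rng(0)
X = rng.random((40, 200))
Z = kpca_fit_transform(X, n_components=5, gamma=0.01)
```

In this sketch the 200-dimensional vectors are reduced to 5 components that are mutually uncorrelated over the training set, which corresponds to the dimension reduction and correlation removal the abstract attributes to KPCA.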
First, according to the nonlinear character of text data, the kernel principal component analysis (KPCA) algorithm is used for feature extraction and dimension reduction. It uses a kernel function predefined on the input space to compute dot products of vectors in the feature space directly, which achieves noise reduction, dimension reduction, and decorrelation in the feature space and completes the preparation for classification. Then the SOFM neural network is used to classify the text; this algorithm has strong learning, association, fault tolerance, and robustness abilities. Finally, the text classification algorithm is compared with the BP neural network and the RBF neural network. Simulation experiments show that this algorithm achieves higher classification accuracy and faster classification speed than the BP neural network and the RBF neural network.
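The SOFM classification step can be sketched in the same spirit. The abstract does not say how class labels are attached to the map, so this example assumes a common approach: train the map without labels, then give each map unit the majority label of the training samples it wins. The grid size, learning-rate and neighborhood schedules, the toy two-class data, and the helper names `train_sofm`, `best_units`, and `label_units` are all assumptions for illustration.

```python
import numpy as np

def train_sofm(X, grid_shape=(4, 4), n_epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a 2-D self-organizing feature map; returns one weight vector per unit."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    # Grid coordinates of each unit, used to measure neighborhood distance.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialize unit weights from randomly chosen training samples.
    weights = X[rng.integers(0, len(X), size=rows * cols)].astype(float).copy()

    n_steps = n_epochs * len(X)
    step = 0
    for _ in range(n_epochs):
        for i in rng.permutation(len(X)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-2    # shrinking neighborhood radius
            # Best matching unit: the unit whose weights are closest to the sample.
            bmu = np.argmin(np.sum((weights - X[i]) ** 2, axis=1))
            grid_dist2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            # Pull the BMU and its grid neighbors toward the sample.
            weights += lr * h[:, None] * (X[i] - weights)
            step += 1
    return weights

def best_units(X, weights):
    """Index of the best matching unit for every row of X."""
    dist2 = ((X[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return dist2.argmin(axis=1)

def label_units(units, y, n_units):
    """Assumed labelling step: majority class of the samples won by each unit."""
    labels = np.full(n_units, -1)  # -1 marks units that won no training sample
    for u in range(n_units):
        members = y[units == u]
        if members.size:
            labels[u] = np.bincount(members).argmax()
    return labels

# Assumed toy data: two well-separated classes in a 5-D reduced feature space.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (30, 5)), rng.normal(5.0, 0.5, (30, 5))])
y = np.array([0] * 30 + [1] * 30)

weights = train_sofm(X)
units = best_units(X, weights)
unit_labels = label_units(units, y, len(weights))
pred = unit_labels[units]
accuracy = float(np.mean(pred == y))
```

In a pipeline like the one described, `X` would be the KPCA-reduced document vectors rather than synthetic data, and accuracy would be measured on held-out documents rather than on the training set as done here.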
Keywords/Search Tags: Text Classification, Feature Dimension Reduction, Kernel Principal Component Analysis, SOFM Neural Network, RBF Neural Network