Research On Algorithm Of Support Vector Machine Text Classification Based On Improved Density Clustering

Posted on: 2017-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z K Liu
Full Text: PDF
GTID: 2428330566453424
Subject: Control Science and Engineering
Abstract/Summary:
With the rapid development of information technology, the Internet has become part of daily life, and information resources play an increasingly important role among social resources. In China, Chinese text is a very widely used information carrier and plays an important role in information transmission, information processing, and related fields. As the scale of information grows, however, databases of Chinese text documents keep expanding, and it becomes harder for people to find the information they want. Automatic Chinese text classification technology arose in response to this need.

This thesis studies Chinese text classification based on the support vector machine (SVM). The specific work is as follows. First, the research background, significance, and state of research at home and abroad are analyzed and summarized for automatic Chinese text classification and the SVM algorithm. The related theories and methods of Chinese text classification are then introduced, including text preprocessing, feature extraction, text representation, and text classification algorithms. Second, SVM theory and its different algorithms are described in detail, in particular the working principles of the SVM models for linearly separable problems, linear problems, and nonlinear problems. The thesis then adopts an improved SVM algorithm based on density clustering: a density clustering algorithm extracts the edge points of the text vector set, and these edge points form a new training set that is used to train the SVM classifier. Next, a text classification system is designed. The overall design scheme is presented first, followed by the system structure in terms of its three main modules: the preprocessing module, the feature selection module, and the text classification module. Finally, text classification experiments are carried out and their results analyzed. Experiments are run with different feature selection functions and different kernel functions, and classification experiments with the SVM before and after the improvement are conducted and compared.
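The edge-point idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm or its parameters, so everything here is an assumption: the sketch uses a DBSCAN-style definition (a core point has at least `min_pts` neighbors within radius `eps`; an edge point is a non-core point lying within `eps` of a core point), applies it separately to each class, and uses plain Euclidean distance on small tuples in place of real text vectors. The function names `extract_edge_points` and `reduce_training_set` are hypothetical and are not taken from the thesis.

```python
import math


def neighbors(points, i, eps):
    """Indices of points within distance eps of points[i], excluding i itself."""
    return [j for j, q in enumerate(points)
            if j != i and math.dist(points[i], q) <= eps]


def extract_edge_points(points, eps, min_pts):
    """Return indices of DBSCAN-style border (edge) points.

    Assumed definitions (not confirmed by the thesis):
      core point: at least min_pts neighbors within eps
      edge point: not core, but within eps of at least one core point
    """
    nbrs = [neighbors(points, i, eps) for i in range(len(points))]
    core = {i for i, n in enumerate(nbrs) if len(n) >= min_pts}
    return [i for i, n in enumerate(nbrs)
            if i not in core and any(j in core for j in n)]


def reduce_training_set(vectors, labels, eps, min_pts):
    """Keep only the edge points of each class as (vector, label) pairs."""
    reduced = []
    for cls in sorted(set(labels)):
        members = [v for v, y in zip(vectors, labels) if y == cls]
        for i in extract_edge_points(members, eps, min_pts):
            reduced.append((members[i], cls))
    return reduced


# Toy data: two 3x3 grids standing in for two classes of text vectors.
grid_a = [(x, y) for x in range(3) for y in range(3)]
grid_b = [(x + 10, y) for x in range(3) for y in range(3)]
vectors = grid_a + grid_b
labels = [0] * len(grid_a) + [1] * len(grid_b)

reduced = reduce_training_set(vectors, labels, eps=1.0, min_pts=4)
```

The reduced list of (vector, label) pairs would then be passed to whatever SVM trainer is in use. The intuition is that support vectors tend to lie near class boundaries, so training on edge points alone may shrink the training set while retaining most of the samples that shape the decision surface.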
Keywords/Search Tags: Text classification, Support vector machine, Density clustering, Edge point extraction, System design