Research On Text Clustering Algorithm Based On Particle Swarm Optimization Algorithm

Posted on: 2021-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: M F Wang
GTID: 2428330611467600
Subject: Software engineering
Abstract/Summary:
With the development of modern Internet information technology and the growing popularity of smart mobile devices, information resources on the Internet are expanding rapidly. As the main carrier of information dissemination, text contains a great deal of technical information and hidden knowledge. The value of these massive data resources, together with the need of enterprises to improve production efficiency, has drawn many researchers to the field of data mining, and text clustering, an important part of data mining, has received corresponding attention. Traditional clustering algorithms, such as K-means and its derivative K-means++, depend heavily on the initial cluster centers and are limited by their own update rules. On high-dimensional text clustering problems they may cluster poorly and behave unstably, and so fail to deliver the results users expect. Particle swarm optimization (PSO), a heuristic swarm intelligence algorithm, overcomes the shortcomings of K-means to some extent thanks to its decentralized population, strong adaptive ability, and efficient population evolution, but its update efficiency, global optimization ability, and stability still leave room for improvement. To address these problems, this thesis proposes a new hybrid of differential evolution (DE) and particle swarm optimization that improves the efficiency of population updating, and applies it to text clustering.

This thesis studies the processing flow of text clustering, swarm intelligence algorithms, and clustering algorithms. First, it introduces the text preprocessing, text representation models, text similarity measures, and clustering evaluation indices involved in text clustering, and analyzes the requirements and main techniques of each step in detail. It then introduces the concepts and characteristics of swarm intelligence algorithms, focusing on the background, principle, procedure, advantages and disadvantages, and improvement strategies of PSO.

Next, the thesis addresses the problem that swarm intelligence algorithms generally ignore whether cluster centers are arranged consistently across individuals during population updating. It proposes a method that adaptively adjusts the arrangement of cluster centers based on the similarity matrix of cluster centers between individuals. The method standardizes the arrangement of the cluster centers held by any pair of individuals involved in an update so that, as far as possible, the cluster centers in the same position are the most similar ones, which improves the efficiency of individual updating.

Finally, by analyzing the limitations and characteristics of the traditional PSO and DE algorithms, the thesis combines the strengths of both and proposes a new hybrid of differential evolution and particle swarm optimization that shows better and more stable performance. The algorithm is built on PSO. When the PSO population update stagnates and the search space becomes restricted, the crossover and mutation operations of DE are used to perturb the population, increasing its diversity and improving the global optimization ability of the algorithm. The algorithm is tested on standard text mining data sets, and its validity and feasibility are verified.
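
The cluster center arrangement step described in the abstract can be illustrated with a small sketch. The abstract does not give the exact procedure, so everything below is an assumption made for illustration: the function name `align_centers`, the use of cosine similarity, and the greedy matching rule are hypothetical choices, not the thesis's method. Each individual is taken to be a `(k, d)` array of nonzero cluster centers.

```python
import numpy as np

def align_centers(reference, candidate):
    """Reorder candidate's centers so that row i is the best match for reference row i.

    Both arguments are (k, d) arrays with nonzero rows. Matching is greedy on
    cosine similarity: an illustrative choice, not necessarily the thesis's rule.
    """
    ref = reference / np.linalg.norm(reference, axis=1, keepdims=True)
    cand = candidate / np.linalg.norm(candidate, axis=1, keepdims=True)
    # sim[i, j] is the cosine similarity of reference center i and candidate center j
    sim = ref @ cand.T
    k = sim.shape[0]
    order = np.full(k, -1)
    for _ in range(k):
        # take the most similar remaining pair, then retire its row and column
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        order[i] = j
        sim[i, :] = -np.inf
        sim[:, j] = -np.inf
    return candidate[order]

reference = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
candidate = np.array([[0.1, 0.9], [1.0, 1.1], [0.9, 0.0]])  # same clusters, shuffled
aligned = align_centers(reference, candidate)
```

After alignment, position `i` in both individuals refers to the most closely corresponding cluster, so a position-wise update such as a PSO velocity step or a DE difference vector combines centers that describe the same cluster rather than unrelated ones.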
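
The hybrid idea in the abstract, PSO as the base algorithm with DE crossover and mutation applied when the population stagnates, might be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis's algorithm: the function name `pso_de`, the stagnation test (`patience` iterations without improvement of the global best), the DE/rand/1/bin variant applied to personal bests, and all parameter values are hypothetical. The objective is generic; for text clustering it would score a flattened set of cluster centers.

```python
import numpy as np

def pso_de(objective, dim, n_particles=20, iters=200, patience=10,
           w=0.7, c1=1.5, c2=1.5, F=0.5, CR=0.9, bounds=(-5.0, 5.0), seed=0):
    """Minimize objective with PSO, perturbing with DE when progress stalls."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_f = np.array([objective(p) for p in x])
    g = np.argmin(pbest_f)
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    stall = 0
    for _ in range(iters):
        # standard PSO velocity and position update
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([objective(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        g = np.argmin(pbest_f)
        if pbest_f[g] < gbest_f:
            gbest, gbest_f, stall = pbest[g].copy(), pbest_f[g], 0
        else:
            stall += 1
        if stall >= patience:
            # assumed stagnation response: one DE/rand/1/bin pass over personal bests
            for i in range(n_particles):
                others = [j for j in range(n_particles) if j != i]
                a, b, c = pbest[rng.choice(others, 3, replace=False)]
                mutant = np.clip(a + F * (b - c), lo, hi)
                mask = rng.random(dim) < CR
                mask[rng.integers(dim)] = True  # keep at least one mutant component
                trial = np.where(mask, mutant, pbest[i])
                ft = objective(trial)
                if ft < pbest_f[i]:  # greedy selection, as in DE
                    pbest[i], pbest_f[i] = trial, ft
                    x[i] = trial
            g = np.argmin(pbest_f)
            if pbest_f[g] < gbest_f:
                gbest, gbest_f = pbest[g].copy(), pbest_f[g]
            stall = 0
    return gbest, gbest_f

# usage on a toy objective (sum of squares), standing in for a clustering cost
best, best_f = pso_de(lambda p: float(np.sum(p ** 2)), dim=3)
```

Because the DE step keeps a trial vector only when it improves on the personal best, the perturbation adds diversity without discarding solutions the swarm has already found.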
Keywords/Search Tags: Text clustering, K-means, particle swarm optimization, differential evolution, cluster center index adjustment