Research And Application Of The Clustering Analysis Based On Improved RNA Genetic Algorithm

Posted on: 2019-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Ren
GTID: 2428330548954706
Subject: Management Science and Engineering
Abstract/Summary:
With the development of artificial intelligence and the advent of the big data era, data resources have been growing explosively. Data mining is an important data analysis technology, and clustering analysis, as one of its main tools, has been widely used in fields such as image processing, pattern recognition, and information retrieval. Many clustering methods exist, but none is perfect. A clustering algorithm can be improved either by modifying the algorithm itself or by using a dedicated optimization algorithm to compensate for its defects. This thesis therefore takes two widely used clustering methods, fuzzy c-means (FCM) clustering and density peaks clustering (DPC), as its main research subject and uses an RNA genetic algorithm (RNA-GA) to optimize both. The main research work is as follows.

An adaptive RNA genetic algorithm (ARNA-GA) is proposed. In the new selection operation, a penalty coefficient d is introduced to ensure that the bottom d percent of the population is not retained in the next generation, so that the search stays close to the region containing the optimal solution. New crossover and mutation operations are designed to increase population diversity and prevent premature convergence. An adaptive strategy governs these two operations: a dissimilarity coefficient is introduced to measure the dissimilarity of individuals, and crossover is performed when the coefficient is larger than a threshold, while mutation is performed otherwise. This reduces the amount of computation, avoids redundant operations, and improves convergence speed. Finally, the performance of the algorithm is evaluated on eight standard test functions, which confirms its validity.

A kernel fuzzy c-means clustering algorithm based on the adaptive RNA genetic algorithm (ARNAGA-KFCM) is proposed. First, the original objective-function-based FCM algorithm is improved: a Gaussian kernel function replaces the original distance calculation of FCM, and the resulting kernel FCM (KFCM) is robust to isolated points and noise points. The proposed adaptive RNA genetic algorithm is then combined with the improved algorithm: RNA-GA optimizes the initial cluster centers, and KFCM then guides the clustering. This overcomes the sensitivity of the original FCM algorithm to the initial cluster centers. Because RNA-GA is a genetic algorithm with an optimum-seeking search strategy, it helps FCM escape local optima, achieve global convergence, and strengthen its global search ability. Finally, the validity of ARNAGA-KFCM is verified on four UCI datasets, and comparison with FCM and KFCM shows the superiority of the proposed algorithm.

A new density peaks clustering algorithm based on the adaptive RNA genetic algorithm and KNN (ARNAGA-KNN-DPC) is proposed. First, the original DPC algorithm is improved: the idea of KNN is introduced into the calculation of local density, so that the local density is no longer affected by the cutoff distance. The proposed adaptive RNA genetic algorithm is then combined with the improved KNN-based DPC algorithm. RNA-GA is used to find the threshold values of local density and relative distance. Following the idea of density peaks clustering, points with both high local density and high relative distance are defined as cluster centers, so the centers can be determined easily from the thresholds found by RNA-GA. This addresses the difficulty the original DPC algorithm has in determining cluster centers. Finally, UCI datasets and synthetic datasets are used to test the proposed algorithm. Compared with the popular Min_Max_SD algorithm and the classic K-means clustering algorithm, ARNAGA-KNN-DPC proves more effective.

From the perspective of theoretical feasibility, the improvements proposed in this thesis were evaluated through MATLAB simulation experiments, in which the proposed algorithms proved more effective than the original algorithms. From the perspective of practice, the proposed ARNAGA-KFCM algorithm was applied to a text classification experiment using text data provided by the Sogou Labs database. The results show that the proposed algorithms are not only feasible in theory but also perform well in practice, so the research has practical value.
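
The replacement of FCM's distance calculation by a Gaussian kernel, mentioned above for KFCM, can be illustrated with a minimal Python sketch. The abstract does not give the thesis's exact formulas, so everything here is an assumption: the kernel form K(x, v) = exp(-||x - v||^2 / sigma^2), the kernel-induced distance 1 - K(x, v), the membership rule taken from the commonly used KFCM formulation, and the names `gaussian_kernel`, `kfcm_memberships`, `sigma`, and `m`.

```python
import math

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / sigma ** 2)

def kfcm_memberships(point, centers, m=2.0, sigma=1.0, eps=1e-12):
    """Fuzzy memberships of one point to each center.

    Uses the kernel-induced distance 1 - K(x, v) in place of the
    squared Euclidean distance of plain FCM (hypothetical sketch,
    not the thesis's implementation). m is the fuzzifier (m > 1).
    """
    # 1 - K lies in [0, 1]; eps avoids division by zero when a point
    # coincides with a center.
    dists = [max(1.0 - gaussian_kernel(point, c, sigma), eps) for c in centers]
    # Standard fuzzy membership rule: weight proportional to d^(-1/(m-1)).
    weights = [d ** (-1.0 / (m - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    centers = [(0.0, 0.0), (3.0, 3.0)]
    print(kfcm_memberships((0.1, 0.0), centers))      # close to the first center
    print(kfcm_memberships((100.0, 100.0), centers))  # distant outlier
```

Because 1 - K(x, v) is bounded, a distant outlier contributes a bounded distance to every center, unlike the unbounded squared Euclidean distance. This is one common explanation of the robustness to isolated points and noise, though the thesis may argue it differently.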
Keywords/Search Tags: RNA Genetic Algorithm, Fuzzy C-means Clustering, Density Peaks Clustering, Text Categorization