
Improved Affinity Propagation Clustering Algorithms And Their Applications

Posted on: 2016-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
Full Text: PDF
GTID: 2348330479980062
Subject: Management Science and Engineering
Abstract/Summary:
Machine learning is the core of artificial intelligence. It studies how to improve the performance of computer algorithms on the basis of previous experience, and it has been applied successfully in many fields, including natural language processing, bioinformatics, search engines, medical diagnosis, information security, securities market analysis, and voice and handwriting recognition. According to the form of learning, machine learning can be divided into supervised and unsupervised learning. Cluster analysis is a typical method and an important branch of unsupervised learning. Affinity propagation (AP) is a relatively new clustering algorithm with several advantages over other clustering algorithms: its results are not affected by the choice of initial class representative points, it is stable, and it iterates quickly. Although the algorithm performs well in practical applications, it still has some shortcomings. It is therefore of great significance to improve the theory of the affinity propagation algorithm and to study its applications. The major contents are summarized as follows:

(1) The traditional affinity propagation algorithm performs poorly when clustering high-dimensional data with many attributes and overlapping information. It is difficult to find the proper class structure, so the clustering results fail to reflect the real characteristics of the data. The author proposes an improved algorithm based on the Entropy Weight Method and Principal Component Analysis (EWPCA-AP). The EWPCA-AP algorithm weights the sample data by the Entropy Weight Method, then reduces the dimensionality of the weighted data with Principal Component Analysis and clusters the result. Numerical results of simulation experiments show that the EWPCA-AP algorithm can effectively eliminate redundancy in the data and improve clustering performance.
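The entropy-weighting step of EWPCA-AP can be illustrated with a small sketch. The abstract does not give the exact formulas used in the thesis, so the version below assumes the commonly used form of the Entropy Weight Method (min-max scaling, column-wise proportions, normalized Shannon entropy). The function names `entropy_weights` and `apply_weights` are hypothetical, and the subsequent PCA and affinity propagation steps are omitted.

```python
import math


def entropy_weights(rows):
    """Return one entropy weight per column.

    `rows` is a list of equal-length numeric rows (at least two rows).
    Assumed formulation, not taken from the thesis: columns whose values
    are spread unevenly across samples receive larger weights.
    """
    n = len(rows)
    m = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(m)]

    diversities = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            # A constant column carries no discriminating information.
            diversities.append(0.0)
            continue
        # Min-max scale to [0, 1], then convert to proportions per sample.
        scaled = [(x - lo) / span for x in col]
        total = sum(scaled)
        probs = [x / total for x in scaled]
        # Shannon entropy normalized by ln(n); 0 * ln(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        diversities.append(max(0.0, 1.0 - entropy))

    total_div = sum(diversities)
    if total_div == 0:
        # All columns are constant: fall back to equal weights.
        return [1.0 / m] * m
    return [d / total_div for d in diversities]


def apply_weights(rows, weights):
    """Scale every column by its weight before dimensionality reduction."""
    return [[x * w for x, w in zip(row, weights)] for row in rows]
```

In a pipeline of this kind, the weighted rows would then be passed to a PCA routine and the reduced data to an affinity propagation implementation.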
In addition, the EWPCA-AP algorithm is applied to data on China's economy, and the clustering result is consistent with the actual situation. The algorithm provides a new intelligent evaluation method for the Chinese economy.

(2) The distance closeness degree is one of the important functions in fuzzy mathematics. Compared with Euclidean distance, it has the advantages of eliminating the influence of attributes with large values and of better reflecting the spatial features of singular sample data. This thesis borrows the idea of the distance closeness degree as the similarity measure and combines it with the affinity propagation algorithm, forming an improved AP algorithm, Close Measures Affinity Propagation (CM-AP). Results on UCI data sets indicate that the CM-AP clustering algorithm is robust, clearly improves the clustering effect, and extends the ability of the algorithm to deal with various kinds of data. The CM-AP algorithm is also applied to cluster analysis for the evaluation of listed companies. Experiments on financial indicator data of comprehensive profitability show that the clustering result is consistent with the actual situation. The method can reduce investment risk and provide a reliable basis for investors to choose stocks and invest rationally.

(3) The cuckoo optimization algorithm is a new swarm intelligence search algorithm that has few parameters and strong optimization ability. The clustering performance of the traditional affinity propagation algorithm can be affected by the initial preference (bias) parameter. Combining the cuckoo optimization algorithm with the affinity propagation clustering algorithm, the author proposes an improved AP algorithm, Semi-supervised Affinity Propagation based on Cuckoo Search (CS-SAP).
To improve clustering performance, a semi-supervised method is introduced into the CS-SAP algorithm: data with class labels guide the clustering process, and the optimal preference parameter value is acquired automatically through the cuckoo search algorithm. Clustering experiments were carried out on data sets from the UCI database. The simulation results show that the CS-SAP algorithm achieves the expected results, speeding up clustering and improving the clustering effect.

In recent decades, many new optimization algorithms have appeared in the field of global optimization. They have attracted wide attention and have been applied in various fields because they can simulate human intelligence to process data intelligently. Efficient intelligent data mining will therefore become more important and valuable.
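The preference search described in item (3) can be pictured with a minimal one-dimensional cuckoo search. This is only a sketch under stated assumptions, not the CS-SAP procedure from the thesis: it follows the generic cuckoo search scheme (Lévy flights drawn with Mantegna's algorithm, plus abandonment of a fraction of the worst nests), and the `objective` argument is a hypothetical placeholder for a score that would, in a semi-supervised setting, measure how badly an AP clustering run at a given preference disagrees with the labeled samples. All function names and default parameter values are illustrative.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the (unlikely) exact-zero case
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, lo, hi, n_nests=15, n_iter=100,
                  pa=0.25, step_scale=0.01, seed=0):
    """Minimize `objective` over the interval [lo, hi].

    Returns (best_value_found, objective_at_best). Parameter defaults are
    illustrative assumptions, not values reported in the thesis.
    """
    rng = random.Random(seed)
    nests = [rng.uniform(lo, hi) for _ in range(n_nests)]
    scores = [objective(x) for x in nests]

    for _ in range(n_iter):
        # Each cuckoo proposes a new solution by a Levy flight from its nest.
        for i in range(n_nests):
            candidate = nests[i] + step_scale * (hi - lo) * levy_step(rng)
            candidate = min(hi, max(lo, candidate))  # keep inside the bounds
            cand_score = objective(candidate)
            # The proposal replaces a randomly chosen nest only if it is better.
            j = rng.randrange(n_nests)
            if cand_score < scores[j]:
                nests[j], scores[j] = candidate, cand_score

        # Abandon a fraction pa of the worst nests and rebuild them at random.
        n_abandon = int(pa * n_nests)
        worst = sorted(range(n_nests), key=lambda k: scores[k],
                       reverse=True)[:n_abandon]
        for k in worst:
            nests[k] = rng.uniform(lo, hi)
            scores[k] = objective(nests[k])

    best_index = scores.index(min(scores))
    return nests[best_index], scores[best_index]


def toy_clustering_error(preference):
    """Toy stand-in objective with its minimum at preference = -5."""
    return (preference + 5.0) ** 2


best_preference, best_error = cuckoo_search(toy_clustering_error, -20.0, 0.0)
```

With a real objective, each evaluation would run affinity propagation at the candidate preference and score the result against the labeled data, so the number of nests and iterations directly controls the number of clustering runs.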
Keywords/Search Tags:Affinity propagation clustering algorithm, Entropy weight method, Principal component analysis, Similarity measure closeness, Cuckoo search algorithm, Semi-supervised