Improvement And Application Of Affinity Propagation Clustering Algorithm Based On Semi-Supervised Learning

Posted on: 2019-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: N B Wang
GTID: 2428330566958722
Subject: Management Science and Engineering
Abstract/Summary:
With the rapid development of Internet information technology, the arrival of the big data era has caused explosive growth in the size and complexity of data in all sectors of society. This mass of data has driven the use of data mining technology to analyze and process it and to extract the valuable information it contains. Clustering, an important research direction in data mining, can identify the underlying distribution of data in large-scale data sets. Affinity Propagation (AP) is an unsupervised clustering algorithm proposed by Frey et al. in Science in 2007. Compared with some traditional clustering algorithms, it converges quickly and clusters accurately on large-scale data sets, and it has achieved good results in practical applications.

However, the AP algorithm still has some problems: (1) when clustering high-dimensional data, the "dimensional effect" makes it difficult for AP to find an appropriate cluster structure, resulting in poor clustering; (2) AP is suited to compact, hyperspherical clusters, and when facing data sets with complex structure it tends to produce too many clusters, again leading to poor results. Improving the affinity propagation clustering algorithm and applying it is therefore of considerable significance.

The main contents of this thesis are as follows:

(1) To address the difficulty AP has with high-dimensional data, a semi-supervised affinity propagation clustering algorithm based on locality preserving projections is proposed. The algorithm uses locality preserving projections to reduce the dimensionality of the data and eliminate redundant dimensions, uses the known pairwise constraint information to adjust the similarity matrix, and finally clusters with the affinity propagation algorithm.

(2) When dealing with data sets of complex structure, AP tends to produce too many clusters, and the original algorithm is also sensitive to the setting of the preference parameter P. To address this, an affinity propagation clustering algorithm based on semi-supervised hierarchical optimization is proposed. The algorithm introduces the semi-supervised idea by labeling a certain percentage of the data, and clusters the data set with the affinity propagation algorithm. The resulting clusters are guided jointly by supervised and unsupervised information matrices, and the final clustering is obtained in combination with hierarchical optimization.

(3) The affinity propagation clustering algorithm based on semi-supervised hierarchical optimization is combined with stock value investment theory. Exploiting the strengths of clustering algorithms on massive data, it mines valuable and critical information about listed companies from their basic financial information, providing a more effective method for applying value investment in the stock market.

(4) The same algorithm is applied to bank customer division. It is used to analyze and process data related to bank customer service, so that bank managers can accurately divide existing customers into groups and differentiate among them.
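The constraint-adjustment step in item (1) can be illustrated with a small sketch. It is not the thesis's implementation: the abstract does not give the adjustment rule, so the code assumes a common convention in which must-link pairs receive the maximum similarity (0) and cannot-link pairs receive a large negative similarity. The function names, the `far` value, and the damping and iteration settings are all hypothetical, and the locality preserving projections step is omitted. The message-passing loop follows the standard responsibility and availability updates of affinity propagation.

```python
import statistics


def similarity_matrix(points):
    """Negative squared Euclidean distance, the usual AP similarity."""
    n = len(points)
    return [[-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]


def apply_constraints(S, must_link, cannot_link, far=-1e6):
    """Return a copy of S adjusted by pairwise constraints.

    Assumed rule (not taken from the thesis): must-link pairs get
    similarity 0, cannot-link pairs get the large negative value `far`.
    """
    S = [row[:] for row in S]
    for i, j in must_link:
        S[i][j] = S[j][i] = 0.0
    for i, j in cannot_link:
        S[i][j] = S[j][i] = far
    return S


def affinity_propagation(S, preference=None, damping=0.7, iters=200):
    """Plain affinity propagation on a precomputed similarity matrix (n >= 2)."""
    n = len(S)
    S = [row[:] for row in S]
    if preference is None:
        # Common default: median of the off-diagonal similarities.
        preference = statistics.median(
            S[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        S[k][k] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        # r(i,k) <- s(i,k) - max_{k' != k} [a(i,k') + s(i,k')], damped
        for i in range(n):
            for k in range(n):
                best = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best)
        # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) <- sum_{i' != k} max(0, r(i',k)), both damped
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        # Fallback so that every point can still be assigned.
        exemplars = [max(range(n), key=lambda k: R[k][k] + A[k][k])]
    labels = []
    for i in range(n):
        if i in exemplars:
            labels.append(i)
        else:
            labels.append(max(exemplars, key=lambda e: S[i][e]))
    return exemplars, labels


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (10.0, 9.0)]
    S_adj = apply_constraints(similarity_matrix(pts),
                              must_link=[(0, 1)], cannot_link=[(2, 3)])
    print(affinity_propagation(S_adj))
```

Each label is the index of the exemplar the point is assigned to. Note that a cannot-link penalty of this kind only discourages one point from choosing the other as its exemplar; it does not by itself stop both from joining a third exemplar, which is one reason a further refinement stage such as the hierarchical optimization in item (2) may be needed.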
Keywords/Search Tags: Affinity propagation clustering algorithm, Locality preserving projections, Semi-supervised learning, Stock analysis, Bank customer division