| Clustering algorithms have a wide range of applications in data mining, pattern recognition, and machine learning, and cluster analysis is an important part of data mining technology. The emergence of massive data has driven a steady stream of new data mining applications, and cluster analysis is a basic operation in big data processing. A clustering algorithm groups similar elements into one class and separates elements with large differences into different classes. This paper studies various classical clustering algorithms, with a focus on the density clustering and density peak algorithms. On this basis, corresponding improved algorithms are proposed and applied to a book recommendation system. The specific research content is as follows: (1) To address the complexity of the density clustering algorithm, an improved algorithm, W-DBSCAN, is proposed that uses the Warshall algorithm to reduce it. In density clustering, highly similar data are density-connected. This paper constructs an n × n matrix in which element (x, y) is set to 1 if data x and y are directly density-reachable, and then uses the Warshall algorithm to compute the reachability matrix of this matrix; the reachability matrix identifies the data that are density-connected. In this way, the problem of finding density connections is transformed into the problem of computing a reachability matrix, thus reducing the complexity of the algorithm. (2) A new integrated clustering algorithm is proposed to address two problems of the density peak algorithm: it requires manual selection of center points on the decision graph, and it is not suitable for all data. First, the data object with the highest local density is taken as the first center; second, the W-DBSCAN algorithm is used to form the first cluster; then the object with the highest local density among the remaining, not yet classified data is taken as another center, and clustering continues with the W-DBSCAN algorithm. These steps are repeated until all the data have been processed, at which point the algorithm ends. (3) To address the problem that college students choose books blindly in the school library or do not know which books suit them, a college book recommendation system is proposed that combines cluster-based recommendation with a collaborative filtering algorithm. The first cluster obtained by the IDF clustering algorithm contains the content that readers overall are most interested in, and this content is recommended to readers who have newly joined the library, which addresses the "cold start" problem. The system first collects and organizes the data; it then clusters readers' historical browsing records, that is, partitions the readers into groups; finally, it uses the collaborative filtering algorithm to compute the Top-n neighbor set of the target reader and generate recommendations. |
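
The reformulation in item (1) can be illustrated with a minimal sketch. It is not the paper's W-DBSCAN implementation: the function names, the `eps` and `min_pts` parameters (borrowed from standard DBSCAN), the Euclidean distance, and the way labels are read off the reachability matrix are all assumptions made for illustration. The sketch shows only the idea of building a 0/1 direct-reachability matrix and closing it with the Warshall algorithm; it makes no claim about the complexity result reported in the paper.

```python
from math import dist


def direct_reachability(points, eps, min_pts):
    """Build the n x n 0/1 matrix of direct density-reachability.

    Entry (x, y) is 1 when x is a core point and y lies within eps of x.
    eps and min_pts are assumed parameters, as in standard DBSCAN.
    """
    n = len(points)
    near = [[dist(points[i], points[j]) <= eps for j in range(n)]
            for i in range(n)]
    # A point's neighbourhood includes the point itself.
    core = [sum(row) >= min_pts for row in near]
    matrix = [[1 if core[i] and near[i][j] else 0 for j in range(n)]
              for i in range(n)]
    return matrix, core


def warshall(matrix):
    """Return the transitive closure (reachability matrix) of a 0/1 matrix."""
    n = len(matrix)
    reach = [row[:] for row in matrix]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = 1
    return reach


def w_dbscan_sketch(points, eps, min_pts):
    """Label points by reading clusters off the reachability matrix.

    Hypothetical labelling rule: each unlabelled core point starts a new
    cluster containing every unlabelled point reachable from it.
    Points reachable from no core point keep the label -1 (noise).
    """
    matrix, core = direct_reachability(points, eps, min_pts)
    reach = warshall(matrix)
    n = len(points)
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if core[i] and labels[i] == -1:
            for j in range(n):
                if reach[i][j] and labels[j] == -1:
                    labels[j] = cluster
            cluster += 1
    return labels


# Two tight groups of three points and one isolated point (made-up data).
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(w_dbscan_sketch(sample, eps=1.5, min_pts=3))
# [0, 0, 0, 1, 1, 1, -1]
```

Under these assumptions, a border point that is reachable from two clusters is assigned to whichever cluster is formed first; the paper may resolve such ties differently.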