| With the development of information technology, the traditional business model no longer suits the current development of the banking industry. Banks still pay attention to their own assets, but more importantly they recognize the importance of customer management: good customer relations safeguard a bank's wealth and income. Research on bank customer behavior has therefore become a hot topic. However, the large volume of bank customer information is difficult to manage with manual processing, and data mining is an effective technology for solving this problem. By mining the behavior of bank customers in depth, we can provide reasonable and effective suggestions for formulating bank marketing strategy. The k-means clustering algorithm is one of the more effective algorithms in data mining and has become a common method in the big data era because it is fast, conceptually simple, and easy to implement. However, the algorithm is sensitive to the choice of initial cluster centers and prone to converging to local optima. To improve the efficiency and stability of bank customer classification, this thesis proposes an improved k-means clustering algorithm. The algorithm uses an autoencoder to perform feature learning on a given bank customer data set, reducing its dimensionality, and then applies an optimized k-means clustering procedure to segment the data set. The clustering process is optimized in three steps. First, the number of clusters in the customer data set is determined automatically by an improved elbow rule. Second, the local outlier factor (LOF) detection method filters out a portion of the customer data with lower outlier factors to serve as the candidate set for the initial cluster centers, so that the initial centers are found more accurately. Finally, the cluster centers are optimized with an outlier-weighted distance method, thereby achieving unsupervised classification of bank customer data. The algorithm was tested on data selected from the UCI public data sets. Its stability is on average 19% higher than that of k-means, 11% higher than that of OFMMK-means, and 9% higher than that of FCM, and its running time is 87.5% shorter than that of the FCM algorithm. The proposed algorithm can therefore mine bank customer data more effectively. |
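
The idea of drawing initial cluster centers only from points with low outlier scores can be illustrated with a minimal sketch. It is not the thesis's implementation: the local outlier factor is replaced here by a simpler stand-in (mean distance to the nearest neighbors), the candidates are spread out with a farthest-first rule that the abstract does not mention, and the autoencoder, the improved elbow rule, and the outlier-weighted distance step are omitted. The names `knn_outlier_score`, `pick_initial_centers`, `kmeans`, and the `keep_ratio` parameter are hypothetical.

```python
import math


def knn_outlier_score(points, n_neighbors=3):
    """Mean distance to the n_neighbors nearest points.

    Assumption: a simplified stand-in for the local outlier factor,
    not LOF itself. Larger values mean more isolated points.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:n_neighbors]) / n_neighbors)
    return scores


def pick_initial_centers(points, k, keep_ratio=0.8, n_neighbors=3):
    """Choose k initial centers from the least outlying points only."""
    scores = knn_outlier_score(points, n_neighbors)
    order = sorted(range(len(points)), key=lambda i: scores[i])
    n_keep = max(k, int(len(points) * keep_ratio))
    candidates = [points[i] for i in order[:n_keep]]

    centers = [candidates[0]]  # start from the least outlying point
    while len(centers) < k:
        # Farthest-first (an assumption of this sketch): add the candidate
        # whose distance to its nearest chosen center is largest.
        nxt = max(candidates, key=lambda c: min(math.dist(c, s) for s in centers))
        centers.append(nxt)
    return centers


def kmeans(points, k, n_iter=50, keep_ratio=0.8):
    """Plain Lloyd iterations started from the filtered candidate set."""
    centers = pick_initial_centers(points, k, keep_ratio)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: mean of each cluster (keep old center if empty).
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Toy data: two compact groups and one isolated point.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (50, 50)]
centers, labels = kmeans(data, k=2)
```

In this toy data set the isolated point `(50, 50)` receives the highest score and is excluded from the candidate set, so it cannot be chosen as an initial center. It is still assigned to a cluster afterwards and pulls that cluster's mean toward it, which is the kind of effect a weighted distance step would be expected to dampen.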