
A Large Data Clustering Method Based on K-Medoids Improved BIRCH

Posted on: 2016-05-14; Degree: Master; Type: Thesis
Country: China; Candidate: X D Zeng
GTID: 2279330464965411; Subject: Applied statistics
Abstract/Summary:
With the rapid development of information technology, including the Internet, cloud computing, the Internet of Things, social networks, and intelligent terminals, people are now surrounded by massive amounts of data. This is the "Big Data era". People are not only producers of data but also its beneficiaries. They no longer just passively accept data and technology; they can change the role technology plays in their lives and benefit from the data. The financial industry has generated large amounts of data since its inception, and "Financial Big Data" originated in financial information. Traditionally, most financial data existed to support business operations and was not treated as property. In the "Big Data era" the nature of data has changed fundamentally: data is no longer only a by-product of business information but an asset. Through massive data, a financial enterprise can establish a comprehensive risk management system, achieve refined management, improve the quality of customer service, and enhance its competitiveness.

With the development of China's securities market, securities enterprises continue to emerge, which has led to competition among them for customer resources. China's stock market is gradually shifting from a "buyer" market to a "seller" market. Competition among securities enterprises has gradually shifted from acquiring new customers to strategic thinking about how to retain existing customers. Customer service is a major challenge. Securities enterprises should not "use" customers to earn more money but should help customers improve their rate of return.
To return to the essence of customer relationship management, securities enterprises should focus on how to help customers improve profitability.

This paper takes diverse, multi-type, multi-level customer clustering as its entry point and provides securities enterprises with technical support for customer service and for appropriateness-based classification management.

First, this paper proposes a combined clustering method for large data sets, named BIRCH.K-medoids. BIRCH.K-medoids is a K-medoids clustering method improved on the basis of BIRCH. The BIRCH method is resistant to outliers, scalable, incremental, and efficient. Building the BIRCH CF (cluster feature) tree compresses the data with only a small loss of information. Applying K-medoids clustering to the CF tree structure then further improves the method's resistance to outliers, its accuracy, and its stability.

Second, from recorded securities trading data, the paper builds feature extraction algorithms for turnover rate, position rate, stop-profit point, stop-loss point, holding time, asset liquidity, and other features, covering risk preference, risk tolerance, trading habits, and liquidity. The algorithms use a robust statistic, the median, in place of the mean, which makes the feature extraction process more resistant to outliers and improves the accuracy and reliability of the results.

Finally, this paper applies the robust feature extraction algorithms and the BIRCH.K-medoids method to cluster 662 customers of a securities enterprise with more than 200 transactions over nearly 10 years.
The clustering identifies six kinds of customers and achieves a diverse, multi-class customer classification.

The proposed BIRCH.K-medoids method satisfies the requirements placed on clustering methods in the "Big Data era", including resistance to outliers, scalability, incremental processing, efficiency, and accuracy. BIRCH.K-medoids is therefore an efficient clustering method for the "Big Data era". Dynamic features extracted from securities trading records describe many aspects of a client's state and strengthen a securities enterprise's ability to identify and characterize customers. Clustering results based on these dynamic features can help securities enterprises identify customers' individual needs. This paper not only mines detailed customer features but also builds a full picture of customers' risk tolerance. This satisfies the requirements of regulatory authorities and is a prerequisite for appropriate services; it is especially important for matching customers with investment products of different risk levels.
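
The two-stage idea described in the abstract (compress the data into BIRCH-style cluster features, then run K-medoids on the compressed summaries) can be illustrated with a minimal sketch. This is not the thesis's implementation, whose details are not given here. It rests on several assumptions: the CF entries are kept in a flat list instead of a CF tree, the K-medoids step uses a simple alternating assign/update loop instead of the full PAM swap search, entry counts are not used as weights, and the names `CF`, `compress`, and `k_medoids` are hypothetical.

```python
import math
import random


class CF:
    """Clustering feature (N, LS, SS) summarising a group of points."""

    def __init__(self, point):
        self.n = 1                               # number of points absorbed
        self.ls = list(point)                    # linear sum per dimension
        self.ss = sum(x * x for x in point)      # sum of squared norms

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius_if_added(self, point):
        # Radius (RMS distance to centroid) the entry would have after
        # absorbing `point`: r^2 = SS/N - ||LS/N||^2.
        n = self.n + 1
        ls = [s + x for s, x in zip(self.ls, point)]
        ss = self.ss + sum(x * x for x in point)
        r2 = ss / n - sum((s / n) ** 2 for s in ls)
        return math.sqrt(max(r2, 0.0))

    def add(self, point):
        self.n += 1
        self.ls = [s + x for s, x in zip(self.ls, point)]
        self.ss += sum(x * x for x in point)


def compress(points, threshold):
    """One incremental pass: absorb each point into the nearest CF entry
    if the resulting radius stays within `threshold`, else open a new entry."""
    entries = []
    for p in points:
        if entries:
            nearest = min(entries, key=lambda e: math.dist(e.centroid(), p))
            if nearest.radius_if_added(p) <= threshold:
                nearest.add(p)
                continue
        entries.append(CF(p))
    return entries


def k_medoids(points, k, max_iter=100, seed=0):
    """Alternating K-medoids; returns (medoid indices, label per point)."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)

    def assign(meds):
        return [min(range(k), key=lambda j: math.dist(p, points[meds[j]]))
                for p in points]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:                      # keep old medoid if cluster is empty
                new_medoids.append(medoids[j])
                continue
            # New medoid: the member with the smallest total distance to the others.
            new_medoids.append(min(
                members,
                key=lambda c: sum(math.dist(points[c], points[i]) for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


# Toy data: two small grids of 25 points each, far apart.
data = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)]
data += [(10 + 0.1 * i, 10 + 0.1 * j) for i in range(5) for j in range(5)]

entries = compress(data, threshold=0.5)          # stage 1: compression
centroids = [e.centroid() for e in entries]
medoids, labels = k_medoids(centroids, k=2)      # stage 2: K-medoids on summaries
```

Because each medoid is an actual summary point and not an average, a single extreme entry cannot drag a cluster centre away from the bulk of the data, which is the kind of outlier resistance the abstract attributes to the K-medoids stage. The `threshold` value controls how aggressively stage 1 compresses and would need tuning on real data.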
Keywords/Search Tags: Big Data, BIRCH.K-medoids, Securities Trading Data, Feature Extraction, Robust