Data Cleaning in the Bank Data Warehouse Build Process and the Mining of VIP Customers

Posted on: 2014-02-09; Degree: Master; Type: Thesis
Country: China; Candidate: H B Liang; Full Text: PDF
GTID: 2268330401453171; Subject: Computer software and theory
Abstract/Summary:
Competition in the banking sector is currently intense. Banks have accumulated large volumes of customer information, and for such enterprises accurate and reliable customer information is very important. A bank's data sources reside in different databases and files, which may in turn sit on different operating systems and hardware platforms, so data imported from these heterogeneous sources into the data warehouse carries many data quality problems. A data warehouse is a subject-oriented, integrated, non-volatile, time-variant collection of data that forms the basis for decision support, so the correctness of its data is critical to avoiding wrong decisions. Data quality is the foundation of business intelligence and directly affects its success or failure, which makes data cleaning crucial. The data must therefore be cleaned to obtain customers' true information. With accurate customer information, the efficiency of customer resource management improves greatly, and the same accurate information is the basis for mining VIP customers.

This thesis describes the concepts, methods, and research status of data cleaning and data mining, and briefly describes the architecture of a bank data warehouse. It analyzes the principles, methods, and basic processes of data cleaning and data mining technology.

On data cleaning, the thesis first introduces the background and principles of data cleaning, then studies in depth the cleaning performed while building a bank data warehouse. It analyzes and compares three algorithms for cleaning duplicate records: the sorted neighborhood algorithm, the multi-pass sorted neighborhood algorithm, and the priority queue algorithm, and presents a cleaning method suited to approximately duplicate records in bank data.

On data warehousing and data mining, the thesis first introduces the data warehouse and the bank data warehouse architecture. It then describes in detail the definition of data mining and its algorithms, focusing on the application of the C4.5 decision tree classification algorithm to bank data mining. Finally, based on the bank's customer value indicators and its customer screening and evaluation rules, the C4.5 classification algorithm is used to build a predictive customer classification model. Experiments show that the customer classification model built with C4.5 achieves very good prediction results.

The thesis closes by summarizing the research work and outlining future research.
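
As an illustration of the duplicate-record cleaning named in the abstract, the following is a minimal sketch of the basic sorted neighborhood idea: sort the records by a key, then compare each record only with its neighbors inside a sliding window. It is not the thesis's implementation. The record fields (`name`, `id_no`), the key construction, the window size, the 0.85 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def record_text(record):
    """Flatten a record into one normalized string (fields in sorted order)."""
    return " ".join(str(record[f]).strip().lower() for f in sorted(record))


def sort_key(record):
    """Hypothetical sorting key: ID number followed by the lowercased name."""
    return record["id_no"] + record["name"].lower()


def sorted_neighborhood(records, key=sort_key, window=3, threshold=0.85):
    """Return sorted index pairs (i, j), i < j, judged approximately duplicate.

    window    : how many consecutive sorted records share one window
    threshold : minimum similarity ratio (0..1) to flag a pair (assumed value)
    """
    # Step 1: sort record indices by key so similar records end up adjacent.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = set()
    # Step 2: compare each record with the previous (window - 1) records only.
    for pos, i in enumerate(order):
        for j in order[max(0, pos - window + 1):pos]:
            ratio = SequenceMatcher(
                None, record_text(records[i]), record_text(records[j])
            ).ratio()
            if ratio >= threshold:
                pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


# Made-up customer rows; rows 0 and 2 differ only by an extra space in the name.
records = [
    {"name": "Zhang Wei", "id_no": "110101198001011234"},
    {"name": "Li Na", "id_no": "310101199002022345"},
    {"name": "Zhang  Wei", "id_no": "110101198001011234"},
    {"name": "Wang Fang", "id_no": "440101197503033456"},
]
duplicates = sorted_neighborhood(records)
print(duplicates)
```

A single pass depends heavily on the chosen key: two duplicates whose keys sort far apart never share a window. Running several passes with different keys and merging the flagged pairs is the usual motivation for the multi-pass variant.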
Keywords/Search Tags: Bank, Data Quality, Approximately Duplicate Records, Data Cleaning, Data Mining