The Increment Clustering Algorithm In Financial Data Mining And Its Application And Research

Posted on:2005-09-22

Degree:Master

Type:Thesis

Country:China

Candidate:X L Sun

Full Text:PDF

GTID:2168360152469262

Subject:Computer application technology

Abstract/Summary:

Cluster analysis in data mining relies on many traditional methods, but these methods were not designed with large data sets in mind. Yet efficiently extracting knowledge from large volumes of data is the foremost problem in financial data mining. In addition, traditional cluster analysis has focused mainly on numeric data rather than the other types of data found in the financial field, and its output is often difficult to interpret. Cluster analysis for money laundering detection therefore aims to improve algorithmic efficiency, to handle varied data types such as documents and categorical data, and to give a conceptual explanation of the clustering result. The SAFE-MIDSS (State Administration of Foreign Exchange-Management Information & Decision Support System) project has set out to study the application of data mining technology to money laundering detection. The fundamental issue in financial data mining is to partition large data sets into meaningful subsets efficiently. Because data collection is irregular, the market develops gradually, and the management system lags behind, financial data mining must cope with incomplete data that has complex attributes, noise, and outliers, and it must do so with little background knowledge. The traditional BIRCH algorithm suits large data sets because it is incremental. However, its summary-based clustering method cannot handle categorical data. Although the K-means algorithm can deal with categorical data, its high computational cost makes it difficult to apply to large data sets.
Building on the K-means center-point algorithm and the incremental BIRCH algorithm, the author proposes the concept of the Core-Tree, which compensates for the weaknesses of both algorithms: a center point represents the summary information in BIRCH, and a class core improves the efficiency of locating the center point. In addition, applying a method based on a conceptual model to the clustering output makes the result easier to understand, which improves the quality of the output. Finally, the author proposes a scheme for applying the Core-Tree algorithm to SAFE-MIDSS and shows that the algorithm achieves its intended goals.
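
The abstract does not give the details of the Core-Tree, so the following is only a minimal sketch of the general idea it describes, not the thesis's algorithm. It assumes a flat (non-tree) structure in which each cluster is summarized by an actual record (a center point) chosen from a small bounded sample of its members (standing in for the "class core"), so that records with categorical attributes can be clustered in one incremental pass. The names `mixed_distance`, `Cluster`, `incremental_cluster`, `threshold`, and `max_core`, the distance measure, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple, Union

Value = Union[float, str]
Record = Tuple[Value, ...]


def mixed_distance(a: Record, b: Record) -> float:
    """Mean per-attribute distance for mixed records.

    Assumption: numeric attributes are already scaled to [0, 1];
    categorical attributes (strings) contribute 0 on a match, 1 otherwise.
    """
    total = 0.0
    for x, y in zip(a, b):
        if isinstance(x, str) or isinstance(y, str):
            total += 0.0 if x == y else 1.0
        else:
            total += abs(x - y)
    return total / len(a)


@dataclass
class Cluster:
    center: Record                                    # an actual record, not a mean
    core: List[Record] = field(default_factory=list)  # bounded sample of members
    size: int = 0

    def add(self, rec: Record, max_core: int) -> None:
        self.size += 1
        if len(self.core) < max_core:
            self.core.append(rec)
            # Re-pick the center as the core member with the smallest
            # total distance to the other core members (a medoid).
            self.center = min(
                self.core,
                key=lambda c: sum(mixed_distance(c, o) for o in self.core),
            )


def incremental_cluster(
    records: Sequence[Record], threshold: float = 0.3, max_core: int = 8
) -> List[Cluster]:
    """Single pass: each record joins the nearest cluster or starts a new one."""
    clusters: List[Cluster] = []
    for rec in records:
        nearest: Optional[Cluster] = min(
            clusters, key=lambda c: mixed_distance(rec, c.center), default=None
        )
        if nearest is not None and mixed_distance(rec, nearest.center) <= threshold:
            nearest.add(rec, max_core)
        else:
            new_cluster = Cluster(center=rec)
            new_cluster.add(rec, max_core)
            clusters.append(new_cluster)
    return clusters


# Made-up records: (scaled amount, channel, currency)
transactions = [
    (0.10, "wire", "USD"),
    (0.12, "wire", "USD"),
    (0.90, "cash", "EUR"),
    (0.11, "wire", "USD"),
    (0.88, "cash", "EUR"),
]
clusters = incremental_cluster(transactions)
for c in clusters:
    print(c.size, c.center)
```

Because the center is re-picked only from the bounded core rather than from all members, the cost of each insertion stays independent of cluster size, which is the kind of saving the abstract attributes to the class core. A cluster that ends up with a single member is a natural candidate outlier under this sketch.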

Keywords/Search Tags:

incremental clustering, conceptual clustering, outlier, financial data mining, money laundering

Related items

1	Study Of Data Mining Algorithms Based On Rough Set And Clustering And Application In Anti-Money Laundering
2	Research On The Construction Of Bank Anti Money Laundering System Based On Clustering Mining Algorithm And Outlier Mining Algorithm
3	The Financial Industry Anti-money Laundering Monitoring Report System Research And Design
4	Research On Data Mining-Based Suspicious Money Laundering Transactional Behavioral Patterns Recognition
5	Research On The Application Of Outlier Detection Model In Anti Money Laundering
6	The Research On The Application Of Knowledge Discovery In Databases In Anti-Money Laundering Field
7	Study On Outlier Detection For Suspicious Financial Behavior Recognition
8	Design And Implementation Of Financial Transaction Money Laundering Detection System
9	Design And Implementation Of Bank Money Laundering Behavior Analytical System Based On Data Mining
10	Design And Implementation Of Bank Anti-money Laundering Monitoring System