
Big Data Applications In Financial Field – Algorithms Research About Detecting The Illegal Using Of Merchant Category Code

Posted on: 2016-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: W B Xie
GTID: 2308330473455175
Subject: Computer technology
Abstract/Summary:
In the big data era, traditional industries face great challenges. Pulled along by the big data wave, these industries have begun to change their business patterns. Meanwhile, large financial institutions, with their valuable data, occupy the best position in the information value chain. By renting services to small financial institutions and business customers, they can acquire large amounts of transaction and customer information. Inspired by big data, they are now more interested in collecting data and mining its potential value than in merely processing payments. China UnionPay (CUP), the bank card association of China, is the pivot of China's bank card industry. Because CUP has a great advantage in collecting data and mining its potential value, the “Big data applications in financial field” projects, which are based on its main data, have been carried out. In this thesis, we carry out the work on “Big data applications in financial field – Algorithms research about detecting the illegal using of merchant category code”. The main research content is divided into two parts, basic algorithm research and applied research:

(1) Based on data mining technology, we studied clustering algorithms extensively and proposed a new hierarchical clustering algorithm, the Root Searching (RS) algorithm. RS uses a greedy strategy to find the representative node (root) in a denser data area by iteratively searching the nearest neighbors of given nodes, creating a sub-tree at the same time. The experimental results show that RS achieves the best accuracy on most datasets. Moreover, RS has linear time complexity, which is significantly better than that of other traditional hierarchical clustering algorithms.

(2) We proposed a model for detecting the illegal use of merchant category codes. After extensive analysis of the transaction dataset, we found regularities in the transaction information and proposed two concepts: “industry pattern” and merchants' “behavior pattern”. Using the differences between the industry pattern and a merchant's behavior pattern, we calculate multiple parameters based on the differences between industries or within an industry. These parameters serve as the features of the feature vector used to train the classification model for detecting the illegal use of merchant category codes. Five classification models were used in the experiments. The results show that our model is workable, with accuracy and recall both above 80% on three of the four datasets.

This thesis not only carried out theoretical innovation in data mining technology, but also proposed a new model for detecting the illegal use of merchant category codes, combined with the application scenario of big data applications in the financial field. Extensive experiments show that our algorithm and model are usable and reliable. The research in this thesis will enrich data mining technology and provide a meaningful reference for detecting fraud in the financial field.
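
The abstract gives only a one-sentence description of Root Searching, so the following Python sketch is a loose illustration of the general idea of greedily following nearest-neighbor links until a representative node is reached. It is not the thesis's RS algorithm. The function names `nearest_neighbor` and `root_search`, the Euclidean distance, and the rule of picking the lowest index in a closed loop as the root are all assumptions made for illustration. The brute-force neighbor search here is quadratic and does not reproduce the linear time complexity reported for RS.

```python
import math


def nearest_neighbor(points, i):
    """Index of the point closest to points[i]; ties go to the lowest index."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: (math.dist(points[i], points[j]), j),
    )


def root_search(points):
    """Label every point with the root reached by following nearest-neighbor links.

    Hypothetical illustration only: the real RS algorithm may choose roots
    and build sub-trees differently.
    """
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    root = [None] * len(points)
    for start in range(len(points)):
        path, seen = [], set()
        node = start
        # Greedy walk: hop to the nearest neighbor until we meet a node that
        # is already labeled, or revisit a node from this walk (a closed loop).
        while root[node] is None and node not in seen:
            seen.add(node)
            path.append(node)
            node = nn[node]
        if root[node] is not None:
            r = root[node]
        else:
            # Assumption: the lowest index inside the loop acts as the root.
            r = min(path[path.index(node):])
        # Every node on the walk joins the sub-tree of that root.
        for p in path:
            root[p] = r
    return nn, root


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (0, 2.5), (10, 10), (10, 11), (11.5, 10)]
    links, roots = root_search(pts)
    print(links, roots)
```

In this sketch the `nn` list holds the parent link of each node, so the links plus the root labels describe one sub-tree per cluster. On the six sample points, which form two well-separated groups of three, each group ends up under a single root.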
Keywords/Search Tags: big data, data mining, hierarchical clustering algorithm, nearest neighbor searching, financial fraud, illegal using of merchant category code, industry pattern