
Design And Implementation Of Online Fraud Detection Algorithm

Posted on: 2018-12-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Gao
GTID: 2348330512483421
Subject: Computer Science and Technology
Abstract/Summary:
In the era of big data, e-commerce, third-party payment, and other online businesses are growing explosively, and severe online fraud has grown with them. As the cornerstone of an enterprise's risk control capability, online fraud detection technology plays an important role in identifying fraud cases, recovering losses, and avoiding risks for customers and online platforms. Because fraud cases are extremely rare compared with normal transactions, online fraud detection must address the imbalanced learning problem. In addition, because it has to screen vast numbers of online transactions, an online fraud detection system always faces large-scale data, and to detect fraud quickly it must be able to process big data. Many distributed computing platforms exist today, of which Spark and Spark Streaming are the most popular in this area. Combining these technologies can significantly improve the efficiency of an enterprise risk control system.

This paper first describes the relevant background, then explores big data processing technology, including Spark and Spark Streaming, and reviews the progress of online fraud detection. Against this big data background, it proposes an incremental clustering-based dataset self-balancing construction algorithm and a distributed loss-sensitive Lasso algorithm. Finally, both algorithms are evaluated with relevant metrics on a real online fraud detection dataset.

The main contributions of this paper are:
1) An incremental clustering-based dataset self-balancing construction algorithm is proposed. It measures the similarity of intra-class samples and selects representative samples, so that the imbalanced learning problem in online fraud detection is addressed while timing information is retained.
2) For the online payment fraud detection scenario, a distributed loss-sensitive Lasso algorithm is proposed, which learns the model efficiently in a big data context and effectively reduces the asset loss rate.
3) Based on Spark and Spark Streaming, the incremental clustering-based dataset self-balancing construction algorithm and the distributed loss-sensitive Lasso algorithm are seamlessly integrated to achieve better performance in the online fraud detection scenario.
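The abstract does not spell out how the self-balancing construction works, so the following is only a minimal single-machine sketch of the general idea, not the thesis's algorithm. It assumes a leader-style incremental clustering of the majority (normal) class with a Euclidean distance threshold, and it keeps the most recent member of each cluster as its representative so that some timing information survives. The function name `incremental_balance`, the `threshold` parameter, and the toy data are all hypothetical, and the Spark Streaming integration is not shown.

```python
import math


def incremental_balance(stream, threshold):
    """Undersample the normal class with one-pass (leader-style) clustering.

    stream: iterable of (timestamp, features, label) in time order,
            where label 1 = fraud and label 0 = normal.
    threshold: assumed Euclidean distance below which a normal sample
               joins an existing cluster instead of starting a new one.
    Returns all fraud samples plus one representative per normal cluster,
    sorted by timestamp.
    """
    centroids = []  # running mean of each normal-class cluster
    counts = []     # number of members seen per cluster
    latest = []     # most recent member per cluster (the representative)
    fraud = []      # minority class is kept in full

    for ts, x, label in stream:
        if label == 1:
            fraud.append((ts, x, label))
            continue

        # Find the nearest existing centroid, if any.
        best, best_d = None, float("inf")
        for i, c in enumerate(centroids):
            d = math.dist(x, c)
            if d < best_d:
                best, best_d = i, d

        if best is not None and best_d <= threshold:
            # Similar to an existing cluster: update its mean incrementally
            # and replace its representative with the newer sample.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [ci + (xi - ci) / n
                               for ci, xi in zip(centroids[best], x)]
            latest[best] = (ts, x, label)
        else:
            # Dissimilar to every cluster: start a new one.
            centroids.append(list(x))
            counts.append(1)
            latest.append((ts, x, label))

    return sorted(fraud + latest, key=lambda s: s[0])


# Toy stream: five normal transactions in two tight groups, one fraud case.
stream = [
    (1, (0.0, 0.0), 0),
    (2, (0.1, 0.0), 0),
    (3, (5.0, 5.0), 1),
    (4, (0.0, 0.1), 0),
    (5, (9.0, 9.0), 0),
    (6, (9.1, 9.0), 0),
]
balanced = incremental_balance(stream, threshold=1.0)
kept_timestamps = [s[0] for s in balanced]  # [3, 4, 6]
```

In this toy run the five normal samples collapse to two representatives while the single fraud sample is kept, so the class ratio moves from 5:1 to 2:1. Choosing the threshold, and whether the latest member or the member nearest the centroid is the better representative, are design decisions the abstract leaves open.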
Keywords/Search Tags: fraud detection, imbalanced learning, loss-sensitive, incremental clustering