Research On Telecom Fraud Number Recognition Based On Decision Tree

Posted on: 2020-11-17
Degree: Master
Type: Thesis
Country: China
Candidate: M L Li
Full Text: PDF
GTID: 2439330602963581
Subject: Applied statistics
Abstract/Summary:
As the economy, society, and the telecommunications industry have developed, users enjoy increasingly convenient services, but fraud in the industry has grown as well, and the losses it causes keep rising. Telecommunications fraud not only inflicts heavy economic losses on residents but also tarnishes the corporate image of telecom operators, triggering a series of social problems that seriously affect residents' lives and disturb social harmony. Faced with rising fraud and the losses it brings, telecom operators must invest manpower and material resources to curb it. This paper therefore studies how to accurately predict and classify suspected fraudulent users at the technical level. It focuses on the prevention and control of telephone fraud within telecommunications fraud, that is, how to classify users accurately and how to identify the telephone numbers of fraudulent users.

Based on a thorough investigation and the experience of industry professionals, the paper first carries out preliminary data preparation, including oversampling of positive samples, indicator selection, and segmentation (binning) of indicators. Second, it conducts an exploratory analysis of the attribute variables: chi-square tests are run between the fraud label and the grouped values of multiple indicators such as age, gender, call duration, number of calls, and network access duration, yielding a significance result for each indicator's association with fraud. The attribute variables are then further screened according to these results. Finally, a decision tree classification model of telecom fraud is built with the CHAID algorithm. The validity, prediction accuracy, and recall of decision tree models with different depths are compared, and a relatively better model is obtained.

The classification results show that call frequency has the most significant influence on whether a number is fraudulent, and that attributes such as network access duration and package block also have a significant influence. From the results of the decision tree model, the classification rules for fraudulent users are summarized and, combined with the actual situation, suggestions are given to operators for preventing and controlling telephone fraud.
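
The chi-square screening step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's code: the function names `chi_square_test` and `chi2_sf`, the indicator names, and all counts below are hypothetical and chosen only for illustration. Each table cross-tabulates one binned indicator (rows) against the fraud label (columns: non-fraud, fraud), and the indicator with the smallest p-value is treated as the most strongly associated with fraud.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the closed-form recurrence of the regularized upper incomplete
    gamma function Q(a, y) for half-integer and integer a.
    """
    y = x / 2.0
    if df % 2:                      # odd df: start from Q(1/2, y)
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:                           # even df: start from Q(1, y)
        a, q = 1.0, math.exp(-y)
    while a < df / 2.0:             # Q(a + 1, y) = Q(a, y) + y^a e^-y / Gamma(a + 1)
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_test(table):
    """Pearson chi-square test of independence for an r x c table of counts.

    Returns (statistic, degrees_of_freedom, p_value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, chi2_sf(stat, df)


# Hypothetical counts (NOT from the thesis): rows are indicator bins,
# columns are [non-fraud, fraud].
tables = {
    "call_frequency": [[480, 20], [450, 50], [300, 200]],  # low / medium / high
    "gender": [[610, 140], [620, 130]],
}

results = {name: chi_square_test(t) for name, t in tables.items()}
best = min(results, key=lambda name: results[name][2])

for name, (stat, df, p) in results.items():
    print(f"{name}: chi2={stat:.2f}, df={df}, p={p:.3g}")
print("most significant indicator:", best)
```

This sketch covers only the significance test. The CHAID algorithm additionally merges indicator categories that do not differ significantly and applies Bonferroni-adjusted p-values before choosing a split, which is omitted here.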
Keywords/Search Tags:Decision tree, CHAID algorithm, Data mining, Identification of telecommunication fraud numbers