Telecom fraud is a major hazard in today’s information society. As science and technology continue to advance, telecom fraud methods are constantly changing. At present, most telecom fraud worldwide is committed by phone contact. Moreover, it increasingly exhibits new characteristics such as precision, intelligence, and event chaining. As a result, fraud has shifted from indiscriminate targeting to intelligent, precise targeting, and identifying telecom fraud has become a major problem for the companies concerned. To address these problems, this thesis studies intelligent detection technology for fraudulent calls. Considering that telecom fraud often involves the use of telecommunication services, this thesis proposes a fraudulent call detection method based on graph embedding and vertical federated learning, which includes data feature extraction and multi-party joint feature training. The model is verified on a self-built dataset of fraudulent calls. The details are as follows.

1. To address the difficulty of data feature extraction in fraudulent call detection, a communication graph embedding model based on telecommunication data is proposed. In this model, a homogeneous graph embedding model and a bipartite graph embedding model are introduced, according to the characteristics of telecom service data, to extract the embedded features of telecom users in different business scenarios. To capture the interactive behavior of telecom users, graph embedding features are extracted from both the unweighted and the weighted communication graphs of users. On this basis, fraudulent calls are detected by combining these features with the users’ communication statistical features. For two classification algorithms commonly used in industry, the detection accuracy reached 69.67% and 83.33% respectively on the unweighted graph, and 72.67% and 87.67% respectively on the weighted graph. Compared with the traditional method using statistical features, the performance of the two algorithms improved by 10.11% and 10.51% respectively after incorporating the weighted graph embedding features.

2. To address the problems of data isolation and privacy protection in the telecommunications industry, this thesis proposes a fraudulent call detection model based on vertical federated learning, which enables joint modeling of telecom data held by two parties. First, a method based on the RSA encryption algorithm and a hash function is used to align the samples of the two parties; the aligned samples are then jointly trained using the Secure LR and Secure Boost algorithms respectively. The experimental results show that the prediction accuracy of the vertical federated model reached 72.33% and 87.08% for the two algorithms respectively. Although these rates are slightly lower than the 72.67% and 87.67% obtained on the weighted graph above, the model reduces the risk of data leakage and can therefore be regarded as approximately lossless.
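
The sample alignment step based on RSA and a hash function can be illustrated with a small sketch. The abstract does not specify the exact protocol, so the code below assumes the RSA blind-signature intersection scheme commonly used for vertical federated learning. The function names (`hash_to_int`, `fingerprint`, `random_blind`, `align`), the example IDs, and the toy key built from two small Mersenne primes are all hypothetical choices made for illustration and are not taken from the thesis. Both parties are simulated in one process here; in practice only the blinded values and the fingerprints would cross the network.

```python
import hashlib
import math
import secrets

# Toy RSA key built from two Mersenne primes (assumption for illustration;
# far too small for real use). Party B alone would hold D.
P = 2**89 - 1
Q = 2**107 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse, requires Python 3.8+


def hash_to_int(user_id):
    """First hash: map a user ID to an integer modulo N."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N


def fingerprint(value):
    """Second hash: applied to an RSA-signed value before comparison."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


def random_blind():
    """Pick a random blinding factor that is invertible modulo N."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return r


def align(ids_a, ids_b):
    """Return the IDs of party A that party B also holds."""
    # Party A blinds each hashed ID with r^E so that B cannot read it.
    blinds = [random_blind() for _ in ids_a]
    blinded = [(pow(r, E, N) * hash_to_int(u)) % N
               for r, u in zip(blinds, ids_a)]

    # Party B signs the blinded values and fingerprints its own signed IDs.
    signed_blinded = [pow(y, D, N) for y in blinded]
    fingerprints_b = {fingerprint(pow(hash_to_int(v), D, N)) for v in ids_b}

    # Party A removes the blinding factor, leaving H(u)^D mod N, and
    # keeps the IDs whose fingerprint also appears in B's set.
    shared = []
    for u, r, s in zip(ids_a, blinds, signed_blinded):
        unblinded = (s * pow(r, -1, N)) % N
        if fingerprint(unblinded) in fingerprints_b:
            shared.append(u)
    return shared


if __name__ == "__main__":
    print(align(["user_1", "user_2", "user_3"], ["user_2", "user_3", "user_4"]))
```

In this sketch, party B never sees the unblinded hashes of party A's IDs, and party A learns nothing about IDs outside the intersection beyond their fingerprints, which is what lets the two sides align samples before joint training.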