Research On Privacy Preserving Data Mining Of Mobile Internet User Behavior

Posted on: 2022-09-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Yu
Full Text: PDF
GTID: 1488306326479714
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the rapid development of 5G mobile communication, smartphones and Internet technology, the mobile Internet has made it possible for users to obtain information resources anytime and anywhere. Faced with growing demand for information services and massive volumes of user behavior data, data mining is commonly used to discover the latent value and behavior patterns of users, promoting the development of health care, intelligent transportation, big-data credit and other fields. Behavior databases of mobile Internet users contain rich personal privacy information, such as location trajectories, consumer credit and points of interest. When this information is over-collected and over-accessed, the risk of personal privacy disclosure increases. Privacy preserving data mining (PPDM), an optimization of data mining algorithms, can meet the needs of both data mining and privacy protection. PPDM hides the information that users do not want disclosed, so that the original data cannot be snooped on or attacked and the privacy of user data is ensured. After privacy protection is applied, the statistical characteristics of the data are preserved, which satisfies the data availability requirements of mining algorithms. This thesis focuses on privacy preserving data mining of user behavior transaction items, sequences, locations and contexts. The main contributions are summarized as follows:

(1) For privacy preserving mining of frequent patterns in user behavior transaction itemsets, a privacy preserving frequent itemset mining algorithm, DPFIS, based on the FP-Tree is proposed. The algorithm has two stages: data preprocessing and data mining. In the preprocessing stage, to improve the efficiency of data reading and protect the privacy of user transaction itemsets, a frequent pattern tree access structure that satisfies differential privacy, DPFP-Tree, is built on the FP-Tree structure. In the mining stage, to reduce the interference of noise with the data, a noise allocation mechanism based on transaction item support and itemset length is designed, and the addition of exponential noise is controlled by the value of the scoring function. In addition, a relative threshold length splitting method is used to reduce the truncation error of transaction itemset length and improve the accuracy of the mining results. The proposed privacy preserving perturbation strategy outperforms the comparison algorithm in frequent itemset mining. Comprehensive experimental results show that F-score and RE improve by 14% and 17% respectively.

(2) For privacy preserving mining of frequent patterns in user behavior transaction sequences, a privacy preserving frequent sequence mining algorithm, DPFSC, based on FS-Trie is proposed. To solve the redundancy problem of candidate frequent sequence sets, a frequent sequence prefix tree access structure that satisfies differential privacy, FS-Trie, is built on the prefix tree structure. To reduce the branch height of subtrees, a scoring function mechanism for sequence length is designed, the privacy budget allocation strategy is optimized, and subtree branches beyond the optimal height are cut. In addition, to maintain the coherence of sequence items in DPFS-Trie, a pruning and splicing strategy for prefix tree branches is adopted to compensate for the information loss caused by sequence truncation, improving the accuracy of frequent sequence pattern mining. The proposed sequential prefix tree structure adjustment strategy outperforms the comparison algorithm for frequent itemsets mining. Comprehensive experimental results show that F-score and RE improve by 11% and 15% respectively.

(3) For location privacy data mining of user mobile behavior, a new privacy preserving people-flow prediction method based on queuing theory, EM-PMM, is proposed. Because improving location service quality is difficult to reconcile with protecting personal location privacy, a privacy protection scheme based on grid fuzzification and precise location is designed using Geohash. Because the flow caused by mobile users is difficult to predict, a prediction model of the number of people flowing is proposed for service resource allocation in the target area. In addition, to train the model parameters, which obey a Poisson distribution, and to estimate the number of people flowing in the target area, the EM-PMM method for people-flow prediction is proposed. The prediction performance of the proposed mobile state model is better than that of the comparison algorithm. Comprehensive experimental results show that the RMSE and RE of new flows improve by 4.5% and 2.3% respectively, and the RMSE and RE of end flows improve by 3.1% and 2.7% respectively.

(4) For context privacy preserving group activity recommendation when users participate in social activities, a privacy preserving group activity recommendation algorithm, DP-SCTM, based on a context topic model is proposed. The algorithm combines time, space, content and social relationship factors to meet users' needs to participate in group activities in real time, and to alleviate the data sparsity and cold start problems. To protect user context privacy, a context topic model that satisfies differential privacy is designed, in which noise is added to the model training results during the Gibbs sampling process. In addition, to address context privacy protection in group activities, the privacy budget allocation scheme is optimized, and the noise added to the model training parameters, the user interest preference profile and the activity candidate list is controlled. The proposed context topic model algorithm outperforms the comparison algorithm in group activity recommendation. Experimental results show that Precision and Recall increase by 16% and 18% respectively.
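
As a rough illustration of the differential privacy idea behind contribution (1), the sketch below adds Laplace noise to item support counts after truncating each transaction to a maximum length. It is the generic textbook Laplace mechanism, not the DPFIS algorithm of the thesis. The function names, the `universe` parameter (a publicly known item list) and the naive "keep the first `max_len` sorted items" truncation are all assumptions made for the example.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two
    independent exponential samples with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_item_supports(transactions, universe, epsilon, max_len, rng=None):
    """Return a noisy support count for every item in `universe`.

    Each transaction is truncated to at most `max_len` distinct items
    (hypothetical, naive truncation), so adding or removing one
    transaction changes the count vector by at most `max_len` in L1
    norm. Laplace noise with scale max_len / epsilon then gives
    epsilon-differential privacy for the released counts.
    """
    rng = rng or random.Random()
    counts = Counter()
    for transaction in transactions:
        for item in sorted(set(transaction))[:max_len]:
            counts[item] += 1
    scale = max_len / epsilon
    # Noise is added for every item in the public universe, including
    # items whose true count is zero.
    return {item: counts[item] + laplace_noise(scale, rng) for item in universe}


def frequent_items(noisy_supports, min_support):
    """Items whose noisy support reaches the threshold (post-processing)."""
    return sorted(i for i, s in noisy_supports.items() if s >= min_support)


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "d"]]
    supports = noisy_item_supports(data, ["a", "b", "c", "d"],
                                   epsilon=1.0, max_len=2,
                                   rng=random.Random(0))
    print(frequent_items(supports, min_support=2))
```

A smaller `max_len` lowers the noise scale but discards more items per transaction, which is the truncation error trade-off the abstract refers to.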
Keywords/Search Tags: User behavior mining, Differential privacy, Frequent pattern mining, Group event recommendation