Research On The Key Technologies Of Privacy Protection For Real-time Data

Posted on: 2022-01-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Q Le
Full Text: PDF
GTID: 1488306734950989
Subject: Computational intelligence and information processing
Abstract/Summary:
With the rise of the Internet of Things, smart devices and systems (such as smartphones, smart bracelets, and GPS) have emerged throughout daily life and have produced explosive growth in data. Big data, made up of many kinds of multi-source data, has become a strategic asset with immeasurable potential value. As big data technology matures, data has become particularly important for promoting the development of smart healthcare, intelligent transportation, and personalized services. To break down data silos and make full use of the value of data, collected data can be publicly released to enable data sharing, and can also be used for model learning to optimize algorithm performance. However, the data often contains a large amount of sensitive personal information, such as salary, disease records, and location information. If the data is released or used without secure processing, it may cause serious privacy leakage and, in turn, threaten users' property and personal safety. Research on privacy protection for user data is therefore crucial: it meets the data security needs of both individuals and the country, and it is important for promoting the development and application of the digital economy.

Real-time data records users' time-series information, describes user behavior in greater detail, and has been widely used in artificial intelligence, the digital economy, livelihood services, and other fields. Analysis and learning based on real-time data can provide more personalized services and respond to environmental changes in a timely manner. However, real-time data release involves repeated data release and dynamic data updates, which increase both the risk of privacy leakage and the difficulty of privacy protection. This thesis therefore studies key privacy protection technologies for real-time data. The main research covers privacy protection of relational real-time data, privacy protection of spatio-temporal real-time data, and privacy protection of real-time data in a distributed learning environment. The main results are summarized as follows.

(1) Research on privacy-preserving algorithms for relational real-time data. For the privacy problems in relational real-time data release, an anonymous privacy protection algorithm (PMF) based on m-signature and fuzzy processing is proposed. PMF introduces the concept of m-signature so that each bucket has at least m different sensitive values and no counterfeit tuples are produced, which resists difference attacks and improves the utility of the data. In addition, the buckets satisfying m-signature are time-varying. This flexibility improves the efficiency with which PMF completes update operations such as insertion, deletion, and modification in real-time data. Meanwhile, PMF uses fuzzy processing techniques to handle the tuples in the candidate list and uses greedy heuristics for real-time data update operations, in order to balance the utility of the published data against privacy security. Finally, a comparative security analysis of PMF is given, and experimental simulations are conducted on the adult census dataset and the common disease dataset. The experimental and analytical results demonstrate that PMF can guarantee high utility and privacy protection in real-time data publishing.

(2) Research on a privacy-preserving scheme for spatio-temporal real-time data. The key issue of how to ensure users' privacy and data utility in real-time traffic flow publishing with spatial characteristics is studied, and a differential privacy protection scheme (DP-SCR) based on spatial correlation is proposed. The scheme provides w-event ε-differential privacy and achieves accurate prediction of traffic flow based on spatial correlation. In DP-SCR, traffic flow prediction based on spatial correlation is used to complete the sampling process, which enables adaptive allocation of the privacy budget. This effectively reduces the noise introduced by differential privacy and improves data utility while users' privacy needs are met. Meanwhile, a dynamic clustering method based on bisecting k-means is used to reduce the perturbation error caused by traffic flows with small values. In addition, the correlation analysis shows that the correlation of traffic flow across spatial features is more significant, which benefits prediction accuracy, and the security analysis shows that DP-SCR provides strong privacy protection. Finally, experimental simulations are conducted on the vehicular mobility dataset, and the results further demonstrate that the proposed DP-SCR scheme outperforms related existing schemes in terms of data utility.

(3) Research on privacy-preserving algorithms for real-time data in distributed learning environments. A federated continuous learning algorithm (FCL-BL) based on broad learning is proposed to address two key issues: protecting real-time data privacy in federated learning and improving model performance. In FCL-BL, a local-independent training scheme and a batch-asynchronous approach are designed to achieve efficient and accurate model training. In addition, a weighted processing strategy is proposed to address the problem of catastrophic forgetting in federated continuous learning. Finally, theoretical analyses and experiments on four datasets (MNIST, NORB, FASHION, and SIGNAL) further demonstrate that FCL-BL can fully guarantee users' privacy security and significantly improve the efficiency and prediction accuracy of model training when handling distributed learning based on real-time data.
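The bucket condition stated in (1), that each bucket has at least m different sensitive values, can be illustrated with a small check. This is a minimal sketch built only from that one-sentence description. It is not the PMF algorithm itself, and the function name `buckets_below_m`, the field names `bucket` and `disease`, and the sample records are all hypothetical.

```python
from collections import defaultdict


def buckets_below_m(records, m, bucket_field="bucket", sensitive_field="disease"):
    """Return the ids of buckets that hold fewer than m distinct sensitive values.

    Field names are illustrative assumptions, not taken from the thesis.
    """
    seen = defaultdict(set)
    for rec in records:
        # Collect the set of sensitive values observed in each bucket.
        seen[rec[bucket_field]].add(rec[sensitive_field])
    return sorted(b for b, values in seen.items() if len(values) < m)


# Hypothetical published table: bucket 1 has two distinct diseases,
# bucket 2 repeats the same disease and so fails the condition for m = 2.
records = [
    {"bucket": 1, "disease": "flu"},
    {"bucket": 1, "disease": "asthma"},
    {"bucket": 2, "disease": "flu"},
    {"bucket": 2, "disease": "flu"},
]

violations = buckets_below_m(records, m=2)
```

An empty result means every bucket meets the condition for the chosen m. In a real-time setting the check would presumably be rerun after each insertion, deletion, or modification, since the abstract describes the buckets as time-varying.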
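The w-event guarantee mentioned in (2) bounds the total privacy budget spent in any window of w consecutive timestamps by ε. The sketch below shows the simplest way to satisfy that bound: a uniform baseline that spends ε/w at every timestamp and adds Laplace noise to each count. It is deliberately not the adaptive, prediction-driven allocation of DP-SCR, whose details the abstract does not give. All function names are hypothetical, and the sensitivity of 1 assumes each user changes any single count by at most one.

```python
import random


def uniform_budgets(epsilon, w, n_steps):
    """Uniform baseline: spend epsilon / w at every timestamp."""
    return [epsilon / w] * n_steps


def max_window_budget(budgets, w):
    """Largest total budget spent in any window of w consecutive timestamps."""
    n_windows = max(1, len(budgets) - w + 1)
    return max(sum(budgets[i:i + w]) for i in range(n_windows))


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponential samples."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def release(counts, epsilon, w, sensitivity=1.0, seed=0):
    """Perturb each count with Laplace noise of scale sensitivity / budget.

    sensitivity=1.0 is an assumption: one user affects a count by at most 1.
    """
    rng = random.Random(seed)
    budgets = uniform_budgets(epsilon, w, len(counts))
    return [c + laplace_noise(sensitivity / b, rng) for c, b in zip(counts, budgets)]


# Hypothetical traffic counts for one road segment over six timestamps.
counts = [120, 118, 131, 7, 9, 125]
noisy = release(counts, epsilon=1.0, w=3)
```

With a uniform split the noise scale is w/ε at every timestamp, so small counts such as 7 and 9 are perturbed heavily in relative terms. That is the kind of error the abstract says the sampling and clustering steps are meant to reduce.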
Keywords/Search Tags: Real-time data, Privacy protection, m-signature, Spatial correlation, Federated continuous learning