
Research On Key Technologies Of Security And Privacy Protection For Big Data

Posted on: 2019-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: F Yan
Full Text: PDF
GTID: 2428330545458785
Subject: Computer technology
Abstract/Summary:
In recent years, with the rapid development of information technologies, especially in the fields of big data and artificial intelligence, it has become easier to collect, store, publish and analyze large volumes of data. From the perspective of data security and personal privacy protection, however, big data applications also bring considerable risk. One of the most important issues is how to extract more value from big data while protecting the privacy of the data. This dissertation therefore studies the key technologies of security and privacy protection for big data.

First, the thesis analyzes in depth the security issues of the big data era, together with the big data ecosystem model. Big data security is divided into four aspects: data storage security, data access control, protection against data mining, and privacy protection in data publishing. The security technologies applied at each stage are then analyzed and studied. Finally, to meet the experimental requirements of the subsequent chapters, a data processing platform for big data security and privacy protection is set up and deployed.

An analysis of the big data publishing process shows that outliers can arise during group partitioning. These outliers may increase the error of the published results, and identifying them is inefficient. We therefore present a histogram publication method with differential privacy based on the Spark framework. In this method, the histogram is partitioned and optimized using an improved k-means algorithm, which achieves rapid convergence of similar groups and an optimal merging of groups. Laplace noise is then added to the results so that the data are released in the form of a histogram with differential privacy. The experimental results show that the proposed method, referred to as SPDP-GS, improves the privacy and efficiency of publishing large volumes of data while preserving good availability of the published data.

To address the need to release dynamic data periodically, the thesis proposes a data stream histogram publication method based on differential privacy. The work aims to hide sensitive information during publishing while improving data utility and processing efficiency for dynamic data. We use the fractal dimension to cluster the datasets and count the values in each group. First, the dynamic data are divided into small pieces using a sliding window. The windowed data are then dispatched to the module that performs the fractal dimension analysis, and attribute counting is carried out at this stage. Finally, Laplace noise is added to each group, and the groups are merged and optimized to achieve differentially private release of multidimensional dynamic datasets. The experimental results show that the proposed method effectively meets the privacy requirements of dynamic data and preserves the availability of the published data.

In summary, the dissertation studies in depth the key technologies of big data security and privacy protection and puts forward two data publishing methods based on differential privacy, one for static datasets and one for dynamic datasets. Experiments on the deployed platform evaluate the effectiveness and usability of the proposed methods. Finally, the shortcomings of the research are explained and directions for future research are discussed. The results can provide ideas for further research on big data security and privacy protection.
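
The abstract does not give the details of either publishing method, so the following Python sketch is only an illustration of the general idea that both share: bins are grouped, Laplace noise is added to each group, and each bin is released as its group's noisy average. It is not the SPDP-GS algorithm or the data stream method. The function names, the fixed `groups` partition and the sample counts are all hypothetical. The sketch assumes the grouping is given in advance and does not depend on the data; a grouping learned from the data (for example by k-means or fractal clustering) would need its own share of the privacy budget. It also assumes that neighboring datasets differ by one record, so each group sum has sensitivity 1.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws follows Laplace(0, 1);
    # multiplying by `scale` gives Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def publish_grouped_histogram(counts, groups, epsilon, rng):
    """Release a histogram by group-and-average with Laplace noise.

    counts  : true bin counts
    groups  : partition of bin indices, e.g. [[0, 1, 2], [3, 4], [5]]
              (assumed fixed in advance, not derived from the data)
    epsilon : privacy budget spent on the noisy group sums
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Adding or removing one record changes exactly one group sum by 1.
    scale = 1.0 / epsilon
    released = [0.0] * len(counts)
    for group in groups:
        noisy_sum = sum(counts[i] for i in group) + laplace_noise(scale, rng)
        group_mean = noisy_sum / len(group)  # post-processing, no extra budget
        for i in group:
            released[i] = group_mean
    return released


# Hypothetical example data: similar bins are placed in the same group.
rng = random.Random(0)
counts = [12, 14, 13, 90, 88, 3]
groups = [[0, 1, 2], [3, 4], [5]]
print(publish_grouped_histogram(counts, groups, epsilon=1.0, rng=rng))
```

Averaging within a group spreads a single noise draw over several bins, which lowers the noise per bin but adds approximation error when the bins in a group differ. This is why grouping similar bins matters for the utility of the published histogram.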
Keywords/Search Tags: big data security, differential privacy protection, Spark framework, histogram publication, fractal dimension