
Research On Outlier Detection Algorithm Based On Differential Privacy Protection Model

Posted on: 2021-01-14; Degree: Master; Type: Thesis
Country: China; Candidate: Y Yan; Full Text: PDF
GTID: 2518306554465504; Subject: Information and Communication Engineering
Abstract/Summary:
With the widespread use of data sharing and data mining technologies, data leakage incidents keep emerging, and the protection of private information has attracted much attention. Privacy-preserving outlier detection must detect anomalies accurately while also protecting the privacy of the data being examined. To meet this requirement, this paper proposes two outlier detection algorithms based on the differential privacy protection model. The main research contents are as follows:

1. For data sets with complex manifold structure, similarity measured by Euclidean distance cannot accurately describe the relationships between data points, and outlier detection is vulnerable to background-knowledge attacks and leakage of private information. This paper proposes a connectivity-based outlier detection algorithm built on the differential privacy protection model. The algorithm first describes the similarity between data samples through a connectivity-based k-similarity path and hides the relationships between data points by adding noise to the similarity matrix, which reduces the sensitivity of the data to differential privacy noise while balancing data security and availability. It then calculates the degree of abnormality of each data point from the dissimilarity of its connectivity and its number of k-reverse similar neighbors. Finally, it clusters the data according to their connectivity along k-similar paths. For the final outlier decision on the clustering result, an outlier decision criterion that adaptively identifies the amount of anomalous data is proposed, which eliminates the effect of manually set parameters.

2. To address the problems that the sparseness of data samples in high-dimensional space is difficult to characterize and that private data is easily leaked during detection, a spectral mapping outlier detection algorithm based on the differential privacy protection model is proposed. The algorithm first constructs a sparse adjacency matrix from the local relationships of the data and, through the differential privacy model, obtains a perturbed (deliberately inexact) data relationship that retains the statistical characteristics of the data, which alleviates the problem of excessive noise caused by adding differential privacy protection. It then computes the Laplacian matrix and its eigenvalues, adaptively determines the number of categories from the maximum difference between adjacent eigenvalues, and selects the subspace data samples after dimension reduction. This reduces the influence of a preset number of categories on clustering, enhances the controllability of the clustering algorithm, and ensures the optimal division. Finally, using the outlier factor proposed in this paper, the data in each cluster are evaluated to determine the true outliers, which improves the detection performance of the algorithm in high-dimensional space.

Experiments on simulated data sets and UCI data sets, with comparison against classic outlier detection algorithms, show that the proposed algorithms have clear advantages in evaluation indicators such as detection rate and misjudgment rate. The privacy protection index shows that they can effectively protect data privacy.
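Two steps named in the abstract can be sketched briefly: adding noise to a similarity matrix, and choosing the number of categories from the largest difference between adjacent Laplacian eigenvalues. The sketch below is an illustration under stated assumptions, not the thesis's implementation. The function names, the use of the Laplace mechanism, and the `sensitivity` and `epsilon` parameters are all hypothetical choices; the abstract does not say which noise distribution or sensitivity bound the author uses, and the sensitivity value needed for a formal privacy guarantee depends on how one record can change the matrix.

```python
import random


def laplace_sample(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def perturb_similarity(sim, epsilon, sensitivity=1.0, seed=None):
    """Return a symmetric noisy copy of a square similarity matrix.

    Assumption: Laplace noise with scale sensitivity / epsilon is added to
    each off-diagonal pair. The diagonal and the input are left untouched.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    n = len(sim)
    noisy = [row[:] for row in sim]
    for i in range(n):
        for j in range(i + 1, n):
            value = sim[i][j] + laplace_sample(scale, rng)
            noisy[i][j] = value
            noisy[j][i] = value  # keep the matrix symmetric
    return noisy


def choose_cluster_count(eigenvalues):
    """Pick the count k at the largest gap between adjacent sorted eigenvalues."""
    vals = sorted(eigenvalues)
    if len(vals) < 2:
        raise ValueError("need at least two eigenvalues")
    gaps = [vals[i + 1] - vals[i] for i in range(len(vals) - 1)]
    return gaps.index(max(gaps)) + 1


# Three near-zero eigenvalues followed by a jump suggest three clusters.
print(choose_cluster_count([0.0, 0.01, 0.02, 0.9, 1.1, 1.2]))  # 3
```

A smaller `epsilon` gives a larger noise scale, so the perturbed matrix hides individual relationships more strongly but distorts the eigenvalue gaps more. This is the security versus availability trade-off the abstract refers to.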
Keywords/Search Tags: Privacy protection, Outlier detection, Differential privacy, Spectral mapping, k-similar path