
Research On Privacy-preserving Anonymization Techniques For Social Network Data

Posted on: 2018-07-16; Degree: Doctor; Type: Dissertation
Country: China; Candidate: X H Lin; Full Text: PDF
GTID: 1368330515453664; Subject: Artificial Intelligence
Abstract/Summary:
With the rapid development of electronic information technology and the Internet, it has become very convenient to obtain and share information through the web, while Internet companies that provide social network services are eager to understand what their online customers need. At present, many organizations, including social networking sites, are collecting social network data, most of which contain users' sensitive information. Because of insufficient analysis capabilities or other special requirements, some of this information must be published or delivered to other organizations or individuals. If the original datasets containing users' sensitive information are released, individual privacy will certainly be violated. Anonymization, which protects published data against possible leakage of sensitive information, is a powerful solution to these privacy issues. The dissertation begins by surveying domestic and international research on the protection of sensitive information, and briefly reviews several anonymity models including k-anonymity, a number of k-anonymity methods, and some evaluation measures. The author then investigates several privacy issues in social network data and presents the following problems and solutions:

a) The dissertation studies anonymity methods for collaborative social network data. Anonymity methods based on semantic taxonomy trees share a common problem: the trees must be defined and adjusted manually. Moreover, because semantic classification admits multiple choices and abstract expression has limited capability, data anonymized with semantic taxonomy trees suffer large information loss and a considerable reduction in data utility. To overcome this shortcoming, the dissertation proposes a novel anonymity method based on a utility-based taxonomy tree. In multiple comparison tests between the utility-based taxonomy tree and the semantic taxonomy trees derived from several classical anonymity methods, the anonymity results generated by the utility-based taxonomy tree are better than the corresponding results of the semantic taxonomy trees, which indicates that the utility-based taxonomy tree improves the effectiveness of anonymity methods.

b) The dissertation studies anonymity methods for large-scale collaborative social network data. Existing anonymity methods based on centralized, non-parallel tree generalization (Tr_Ge) share a bottleneck in computing and storage performance. To address this efficiency problem, the dissertation proposes a parallel anonymity method (MR_Tr_Ge) for cloud computing platforms and provides two improved algorithms: a pre-processing partition algorithm and a parallel tree generalization algorithm. In multiple comparison tests between the parallel tree generalization algorithm and several existing centralized tree generalization algorithms, the improved method proposed in the dissertation has a significant advantage in running time.

c) The dissertation studies anonymity methods for correlational social network data. Most existing k-anonymity methods that employ clustering have a significant computing bottleneck during their iterative optimization procedures and cannot meet the need to anonymize large-scale social network data. The dissertation investigates the "locality" characteristic of individual relations and proposes a generalized anonymity method based on an elimination tree derived from "locality" feature extraction and association clustering. To divide a complete user community into several anonymous equivalence classes that satisfy the requirements of the k-anonymity model, existing methods often rely on clustering approaches. The dissertation instead uses the elimination tree to extract features and thereby reduces the graph generalization problem to a tree generalization problem. The intermediate results of tree generalization are then substituted into the graph generalization algorithm to obtain the final results. Experimental results show that the proposed anonymity method has a significant efficiency advantage over the existing clustering-based anonymity method and is suitable for anonymizing large-scale social network data.

d) The dissertation studies anonymity methods for complex correlational social network data. Existing anonymity models and methods mostly treat social network data as unweighted undirected graphs or as undirected graphs with limited weights. These models and methods therefore have limitations and cannot satisfy the privacy-preserving requirements of some complex relations, such as directed graphs, weighted graphs, and graphs with negative weights. To solve this problem, the dissertation proposes an adaptive k-anonymity model and three different graph generalization anonymity methods to reduce the risk of privacy breaches when such complex data structures are published. The model and methods can also be regarded as an extension of the k-anonymity model in the field of privacy preservation. Experimental results on multiple independent datasets show that the proposed method is effective. Based on the method, a reliable anonymization service can be provided for publishing social network data.

In conclusion, the dissertation studies the publication and sharing of social network data, proposes several improved anonymity methods for different privacy preservation requirements and application environments, and protects individual information against disclosure when privacy-sensitive data are published.
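For readers unfamiliar with the terms used in the abstract, the following is a minimal sketch of taxonomy-tree generalization toward k-anonymity. It is an illustration only and is not the dissertation's utility-based taxonomy tree, Tr_Ge, or MR_Tr_Ge algorithm. The taxonomy, the city values, and the function names `is_k_anonymous` and `generalize_until_k` are all hypothetical, and the sketch assumes a single quasi-identifier attribute that is generalized uniformly, one tree level at a time.

```python
from collections import Counter

# Hypothetical taxonomy tree stored as child -> parent (more general value).
# "*" is the root, meaning the value is fully suppressed.
TAXONOMY = {
    "Beijing": "China", "Shanghai": "China",
    "Paris": "France", "Lyon": "France",
    "China": "*", "France": "*",
}


def is_k_anonymous(values, k):
    """Return True if every equivalence class (distinct value) has >= k members."""
    return all(count >= k for count in Counter(values).values())


def generalize_until_k(values, k, taxonomy=TAXONOMY):
    """Lift every value one taxonomy level at a time until k-anonymity holds."""
    current = list(values)
    while not is_k_anonymous(current, k):
        # Values without a parent (such as the root "*") stay unchanged.
        lifted = [taxonomy.get(v, v) for v in current]
        if lifted == current:
            # Nothing can be generalized further, so k is unreachable.
            raise ValueError("cannot reach k-anonymity with this taxonomy")
        current = lifted
    return current


cities = ["Beijing", "Shanghai", "Paris", "Lyon", "Beijing"]
anonymized = generalize_until_k(cities, k=2)
```

With `k=2` the five cities are lifted one level, giving three records labelled "China" and two labelled "France"; with `k=3` a second lift is needed and every value becomes "*". The difference between those two outcomes is the kind of information loss that the choice of taxonomy tree affects.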
Keywords/Search Tags: Social Network Data, Privacy Protection, Anonymization, Clustering