Queries Of The K-anonymous Data

Posted on: 2013-01-15  Degree: Master  Type: Thesis
Country: China  Candidate: T T Xin  Full Text: PDF
GTID: 2218330371956050  Subject: Computer application technology
Abstract/Summary:
With the rapid development of network information technology, information plays an increasingly important role in people's lives. Industry needs to extract useful information from massive amounts of data to meet its own needs. However, this practice may lead to the disclosure of individual privacy. To ensure that information can serve industry's needs without compromising personal privacy, much research has been undertaken on protecting private information (such as daily habits, bad records, credit rating, and past history). Among the resulting approaches, the most representative privacy protection model is K-anonymity. After years of research, the K-anonymity model has formed a complete theoretical system and will be increasingly applied in various fields. To protect private information, K-anonymization introduces uncertainty into the data, whereas the data in traditional database applications are certain and accurate. Because K-anonymous data are uncertain, their storage, querying, mining, and management all run into problems, and the anonymized data cannot be used effectively in applications. Therefore, applications that involve K-anonymous data urgently need solutions that enhance the usability of such data. Querying is a major operation in data applications and likewise needs a solution.
Because of the characteristics of uncertain data, and because today's popular database management systems are built for deterministic data, traditional query processing methods are not suitable for queries on uncertain data.
Therefore, query processing on uncertain data has become a research hotspot in recent years. Through the efforts of many scholars, many good query processing methods for uncertain data have appeared; each suits a specific application, but none is recognized as a universal query method.
Therefore, given the special nature of the uncertainty in K-anonymous data and the way its representation differs from that of other classes of data, we design an efficient data storage model (a multidimensional model) so that K-anonymous data can be stored in existing databases.
Then, we build an index on the K-anonymous data using a suitable index structure (the R-tree) to improve query performance.
Moreover, to improve the usability of K-anonymous data and meet a wider range of application requirements, a class of query methods for K-anonymous data is explored. Two new queries, UK-Rank and NT-Rank, are defined for K-anonymous data. UK-Rank is mainly used for queries that require sorting, while NT-Rank applies to point queries and range queries. Monte Carlo integration is used to estimate the probabilities accurately and to improve query efficiency.
Finally, related experiments are conducted. They show that translating K-anonymous data into spatial data is feasible, and that query efficiency improves greatly once sampling methods are applied. Comparing query efficiency across different data volumes shows that query time grows linearly with the amount of data.
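The abstract does not give the definitions of UK-Rank or NT-Rank, so the following is only a minimal sketch of the general idea behind Monte Carlo estimation for a range query over K-anonymous data. It assumes that each anonymized record is a box of generalized intervals (one per quasi-identifier attribute) and that the true value is uniformly distributed inside that box. The function name `range_probability`, the attributes, and the sample record are hypothetical and do not come from the thesis.

```python
import random


def range_probability(box, query, samples=10000, seed=0):
    """Estimate the probability that a record generalized to `box`
    lies inside the range `query`.

    Both arguments are lists of (low, high) intervals, one per attribute.
    Assumption (not from the thesis): the true value is uniform in `box`.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Draw one possible true value of the record from its box.
        point = [rng.uniform(lo, hi) for lo, hi in box]
        # Count the sample if it satisfies the query on every attribute.
        if all(qlo <= x <= qhi for x, (qlo, qhi) in zip(point, query)):
            hits += 1
    return hits / samples


# Hypothetical record: age generalized to [20, 40], income to [100, 200].
record_box = [(20.0, 40.0), (100.0, 200.0)]
# Hypothetical range query: age in [30, 50] and income in [100, 150].
query_box = [(30.0, 50.0), (100.0, 150.0)]
estimate = range_probability(record_box, query_box)
```

Under the uniform assumption this particular probability can also be computed exactly as the overlapping fraction of the box (0.5 × 0.5 = 0.25 here), which gives a check on the estimate. Sampling becomes more useful when the distribution inside the box is not uniform or when the query, such as a ranking, compares several uncertain records at once.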
Keywords/Search Tags:top-k query, k-anonymous data, uncertain database, partial orders, r-tree