
Research On Privacy Preservation In Personalized Search

Posted on: 2013-05-17 | Degree: Master | Type: Thesis
Country: China | Candidate: M Wang
GTID: 2248330395463143 | Subject: Computer application technology
Abstract/Summary:
Traditional search technology provides users with a wealth of information, but it also returns a great deal of noise and redundant data, making it difficult for users to pick out the information that interests them. Personalized search emerged to meet users' individual needs and to give them a more tailored and intelligent search experience by returning personalized results. However, because personalized search must collect and use users' personal information, users' privacy may be leaked. How to protect users' privacy with more effective techniques is currently an active research topic. This thesis studies privacy preservation in personalized search; its main contents are as follows.

First, the thesis reviews the state of research on personalized search, on privacy preservation, and on privacy preservation in personalized search, and identifies an issue that requires further research: the threat of privacy leakage in personalized search. It also studies the basic concepts and theoretical basis of the traditional Vector Space Model (VSM) and analyzes the TF-IDF (Term Frequency-Inverse Document Frequency) weighting algorithm.

Second, the thesis introduces Secure Multi-party Computation and several of its commonly used basic protocols, describes the secure dot-product protocol in detail, and analyzes the security definition and security requirements of Secure Multi-party Computation. The definitions of secure two-party computation and of Secure Multi-party Computation in the semi-honest model are each stated.

Third, because the traditional Vector Space Model with TF-IDF weighting is inadequate for quantifying web documents, an improved Vector Space Model is proposed. The model analyzes the semantics of the web text content and extracts the UCL semantic grid. After the semantic hierarchy is analyzed, each term is assigned a semantic-grid weight, and the document is quantified by combining this weight with TF-IDF weighting. Because the improved model takes into account both the semantic-grid weight and each term's own weight, it represents the text content more faithfully.

Finally, an algorithm for computing similarity with a secure dot product on the improved Vector Space Model is presented. It aims to prevent leakage of users' privacy when the server uses users' profiles to sort personalized search results. The algorithm introduces secure dot-product computation into the improved Vector Space Model described above. The original search results are transformed into vectors in the improved model, which express their contents more faithfully. Owing to the privacy-preserving properties of secure dot-product computation, users' personal information is not exposed to the server while the similarity between the original search results and the user's profile is computed. The algorithm thus provides high-quality sorting of search results while protecting users' personal privacy. Comparison with traditional methods for sorting personalized search results shows that the algorithm effectively protects users' privacy during sorting.
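The abstract does not say which secure dot-product protocol the thesis uses. As a rough illustration only, the sketch below simulates one well-known variant, a commodity-server style protocol in which a helper hands out correlated random masks, and uses the resulting shares to rank search results against a user profile. Everything here is an assumption made for illustration: the function names (`commodity_randomness`, `secure_dot_product`, `rank_results`), the integer term weights standing in for the improved VSM weights, the `BOUND` constant, and the choice that the user reconstructs the scores. Both parties run in one process, and the masks are added over plain integers rather than in a finite ring, so this shows the arithmetic of the idea and is not a secure implementation.

```python
import secrets

BOUND = 10**6  # range of the random masks (illustrative choice, not from the thesis)


def dot(u, v):
    """Plain dot product of two equal-length integer vectors."""
    return sum(a * b for a, b in zip(u, v))


def commodity_randomness(n):
    """Hypothetical helper party: hands out masks with ra + rb == Ra . Rb."""
    ra_vec = [secrets.randbelow(BOUND) for _ in range(n)]
    rb_vec = [secrets.randbelow(BOUND) for _ in range(n)]
    ra = secrets.randbelow(BOUND)
    rb = dot(ra_vec, rb_vec) - ra
    return (ra_vec, ra), (rb_vec, rb)


def secure_dot_product(profile, doc):
    """Return additive shares (user_share, server_share) of profile . doc.

    The user holds `profile`, the server holds `doc`; each side only ever
    sees the other's vector after a random mask has been added to it.
    """
    if len(profile) != len(doc):
        raise ValueError("vectors must have the same length")
    (ra_vec, ra), (rb_vec, rb) = commodity_randomness(len(profile))

    # User -> server: profile hidden behind the user's mask.
    masked_profile = [x + r for x, r in zip(profile, ra_vec)]
    # Server -> user: document vector hidden behind the server's mask.
    masked_doc = [y + r for y, r in zip(doc, rb_vec)]

    # Server picks its own share and sends a blinded value to the user.
    server_share = secrets.randbelow(BOUND)
    blinded = dot(masked_profile, doc) + rb - server_share

    # User removes the mask terms; what remains is profile . doc - server_share.
    user_share = blinded - dot(ra_vec, masked_doc) + ra
    return user_share, server_share


def rank_results(profile, docs):
    """Rank document indices by reconstructed similarity, highest first."""
    scores = []
    for doc in docs:
        user_share, server_share = secure_dot_product(profile, doc)
        scores.append(user_share + server_share)  # shares sum to the dot product
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this sketch the two shares always sum to the exact dot product, so the ranking matches what a direct similarity computation would give. Real-valued weights would first need to be scaled to integers (fixed-point), and a deployment would need the finite-ring arithmetic and network separation that this single-process simulation omits.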
Keywords/Search Tags: privacy preservation, personalized search, Vector Space Model, secure dot-product computation, similarity computation