
Research And Implementation Of Spatio Temporal Data Sharing And Query Privacy Protection Based On HDFS

Posted on: 2015-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: F S Meng
Full Text: PDF
GTID: 2308330482957259
Subject: Computer software and theory
Abstract/Summary:
With the development of science and technology, the informatization of social life has extended to more and more fields. A large amount of data is produced every day, and storing and querying this data effectively is increasingly important. In recent years, the development of cloud computing and cloud storage has provided effective ways to store and query massive data, so more and more applications are moving to the "cloud". This work is based on the National Ocean project "the computation and service framework towards ocean environment information". The massive spatio-temporal data held by the State Oceanic Administration is stored separately and is complex to manage. After analyzing this situation, the thesis implements an HDFS-based data sharing system for spatial data. To ensure that users' query information is not leaked during querying, the thesis also proposes privacy-preserving query algorithms for spatio-temporal data.

The framework of the data sharing system is divided into three layers: the base layer, the interface layer and the service layer. The base layer is built on Apache MINA, Java, MySQL and HDFS, which are responsible for communication, data transmission, user information storage and data storage respectively. The interface layer manages the base-layer resources and provides service interfaces to the service layer. It is divided into a data transmission module, a user management module and a file management module. The service layer is responsible for the business logic of the system: it parses users' messages and calls the service interfaces provided by the interface layer to execute the requests.
The system allocates storage space to each user, and allows multiple users to use the same storage space so that they can share the files in it at the same time.

Existing techniques leak users' query privacy when oceanic applications query ocean spatial information, so this thesis proposes the SCPIR-V algorithm, which is based on the CPIR-V algorithm. SCPIR-V reduces computational and communication cost by finding the relationship between different nearest-neighbor point sets and compressing the matrix size. With this algorithm, the server can return the query results without learning any query information. The experimental results show that SCPIR-V improves query performance markedly on the real and Gaussian data sets, but only slightly on the uniform data set. The CPIR algorithm is also applied to spatial range queries, and a grouping algorithm is proposed on the basis of the naive spatial range query algorithm. The experimental results show that the grouping-based query algorithm outperforms the naive query algorithm in both server and client computation time, without noticeably increasing the communication cost.

The volume of data grows as time goes on, and the existing privacy-preserving query algorithms cannot meet users' demands. After analyzing this shortcoming, this thesis proposes CPIR-based privacy protection algorithms for temporal information. It first analyzes the basic characteristics of temporal information, then optimizes the naive time query algorithm and proposes a dynamic adjustment algorithm and a dynamic hash algorithm. By adjusting the number and arrangement of the data sets, these algorithms prevent the server from learning anything about the query. Finally, the three algorithms are applied to the spatial range query. When querying for a single moment, the dynamic hash algorithm is more effective.
In range queries, the dynamic hash algorithm needs less server time, at the cost of a smaller increase in communication cost and client time; as communication bandwidth and client computing power improve, the dynamic hash algorithm should perform even better.
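
The abstract does not spell out how a CPIR query hides the requested item from the server. As a rough illustration only, the sketch below implements the textbook Kushilevitz-Ostrovsky scheme based on quadratic residuosity, with a bit matrix as the database. It is not the thesis's CPIR-V or SCPIR-V code; the function names, the toy moduli and the matrix layout are all assumptions made for this example. It does show why the matrix size matters: the client sends one number per column and the server returns one number per row.

```python
import math
import secrets

# Toy parameters (assumption): two known Mersenne primes.
# They are far too small to give any real security.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q


def is_qr(value, prime):
    """Euler's criterion: True if value is a quadratic residue mod prime."""
    return pow(value, (prime - 1) // 2, prime) == 1


def random_unit():
    """Return a random element of Z_N^* (an integer coprime to N)."""
    while True:
        x = secrets.randbelow(N - 2) + 2
        if math.gcd(x, N) == 1:
            return x


def make_query(num_cols, wanted_col):
    """Client: one number per column; only the wanted column is a non-residue."""
    query = []
    for col in range(num_cols):
        if col == wanted_col:
            while True:
                y = random_unit()
                # Non-residue modulo both primes, so its Jacobi symbol is +1
                # and it looks like a residue to anyone who cannot factor N.
                if not is_qr(y, P) and not is_qr(y, Q):
                    break
        else:
            y = pow(random_unit(), 2, N)  # a genuine quadratic residue
        query.append(y)
    return query


def answer(matrix, query):
    """Server: one number per row, computed without knowing P or Q."""
    reply = []
    for row in matrix:
        z = 1
        for bit, y in zip(row, query):
            # A 0 bit contributes y^2 (always a residue), a 1 bit contributes y.
            z = z * (y if bit else y * y) % N
        reply.append(z)
    return reply


def decode(reply, wanted_row):
    """Client: the wanted bit is 1 exactly when that row's answer is a non-residue."""
    return 0 if is_qr(reply[wanted_row], P) else 1


def retrieve(matrix, row, col):
    """Run the whole exchange for the bit at (row, col)."""
    query = make_query(len(matrix[0]), col)
    return decode(answer(matrix, query), row)
```

Under these assumptions, calling `retrieve(db, r, c)` on a list-of-lists bit matrix `db` returns `db[r][c]`, while the server only ever sees numbers it cannot tell apart without factoring `N`. A smaller matrix shortens both the query and the reply, which is the kind of saving the abstract attributes to compressing the matrix size.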
Keywords/Search Tags: data sharing, privacy protection, spatio temporal data, HDFS