
Research On Security And Privacy In The Big Data Era

Posted on: 2019-05-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Yao
Full Text: PDF
GTID: 1318330542472269
Subject: Computer Science and Technology
Abstract/Summary:
With the development of cloud storage, the rise of social networks, and the popularity of mobile devices, humanity has entered the big data era. Compared with traditional databases, big data is diverse in both structure and type. In terms of structure, big data can be divided into structured, semi-structured, and unstructured data. In terms of type, big data commonly includes text, numerical values, pictures, video, emoji, and audio. In addition, data are generated by various platforms, such as social networks, microblogging systems, mobile apps, wearable devices, and personal health record systems. However, data security and privacy concerns have been an obstacle to the development of big data. How to exploit the advantages of big data while guaranteeing data security and privacy has become an urgent issue. In this dissertation, we focus on information security and privacy in the big data era. The main contributions of this study are as follows:

(1) We studied security and privacy in the deduplication service of cloud storage systems. To support privilege-based duplicate checks and prevent privilege-based user profiling by the cloud server, we proposed a novel Hierarchical Privilege-Based Predicate Encryption (HPBPE) scheme. Furthermore, to support practical dynamic privilege changes (e.g., promotion and demotion), we proposed a Hierarchical Privilege-Based Predicate Encryption with Revocation (HPBPE-R) scheme. Rigorous security and performance analysis proved the security and efficacy of our schemes, and the experimental results showed that our schemes are effective and efficient. The scheme workflow is as follows. First, users split data files into multiple data blocks and compute a fingerprint for each block. Then, data users generate a trapdoor, which is outsourced to the deduplication provider. By matching the trapdoor against the encrypted fingerprints stored at the deduplication provider, the scheme determines whether the cloud storage already stores the same data block with the corresponding privilege. If the match succeeds, the deduplication provider requests the position of the data block and returns it to the data users. Otherwise, data users encrypt the data block and its fingerprint, and outsource them to the cloud storage and the deduplication provider, respectively.

(2) We studied security and privacy in the Cloud-Based Personal Health Record (CB-PHR) system. To support merging encrypted indexes with distinct keys, we proposed a novel multi-source order-preserving symmetric encryption (MOPSE) scheme. To satisfy the requirement of hierarchical authenticated queries, our scheme further assigns different privileges to data providers, so that data providers with a higher privilege can query the data outsourced by those with a lower privilege. To facilitate this function, we proposed an enhanced scheme, termed MOPSE+. Rigorous theoretical analysis proved that MOPSE and MOPSE+ are both secure and efficient, and the experimental results showed that our schemes meet practical requirements. The CB-PHR system consists of three entities: the data provider, the data owner, and the cloud. The main workflow is as follows. First, the data provider adopts a B-tree to build the index for the personal health records of each data owner. Then, the data owner authenticates distinct data providers, which encrypt his or her data and indexes with different keys and outsource them to the cloud. When receiving multiple encrypted indexes of the same data owner, the cloud can merge them without decrypting them, and split the merged index into two independent indexes for the query processing of data providers and data owners, respectively.

(3) We studied result verification for the case where a third-party social data provider may return untrusted results. To allow data consumers to verify the correctness and completeness of returned results, we proposed a verification scheme based on the Merkle Hash Tree (MHT). However, the number of signatures in this scheme depends on the number of users. To further reduce the computation overhead, we proposed an enhanced scheme in which the number of signatures depends on the number of unique attribute values. In addition, to reduce the memory overhead, we introduced the Bloom filter technique and proposed an advanced scheme. Although the advanced scheme is probabilistic, it can detect untrusted behavior, even minor modifications, with high probability. Our security and performance analysis proved that our schemes are secure and efficient. The experimental results on a real Twitter dataset demonstrated the efficiency of our schemes. The system model derives from real social data outsourcing services and includes the online social network, the social data provider, and the data consumer. The workflow of our system includes the following steps. First, the online social network generates auxiliary information for its social data and outsources it, together with the original social data, to the social data provider. On receiving a purchase request issued by a data consumer, the social data provider searches for the corresponding social data and returns the results to the data consumer. Meanwhile, the social data provider generates the verification object for the query results. Subsequently, the data consumer verifies the correctness and completeness of the query results using the verification object.

(4) We considered the problem of user location leakage in Venmo. We proposed a Multi-Layer Location Inference (MLLI) technique, which can infer the locations of Venmo users from their transaction records and the mandatory transaction notes. Intuitively, many Venmo transaction notes contain implicit location cues, and the types and temporal patterns of user transactions have strong ties to their location closeness. The experimental results with real Venmo transaction data show that MLLI can identify the top-1, top-3, and top-5 possible locations for a Venmo user with accuracy up to 50%, 80%, and 90%, respectively. The main workflow of our attack is as follows. First, we use text mining algorithms to obtain the keywords for each transaction note. Since distinct keywords have different location relevance, we further divide the keywords and the corresponding transaction records into four categories, where a lower-numbered category corresponds to higher location relevance. Second, we construct an undirected weighted transaction graph for each category, in which each edge connects two users with any transaction history in that category, and the edge weight depends on their transaction pattern. For example, more intense and consistent transactions should translate into higher edge weights than occasional ones. Third, we identify a small set of users as seeds whose locations can be directly obtained from their geotagged Venmo transaction notes or via external means. Then we propose an iterative multi-layer belief propagation scheme to propagate the location beliefs to non-seed users in each category. Finally, we perform a weighted combination of the location beliefs for each user in the four categories and assign the most probable home location to each user.
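The block-level duplicate check described in contribution (1) can be illustrated with a minimal plaintext sketch. It is not the HPBPE scheme: there is no predicate encryption and no trapdoor, privileges are matched by exact equality (an assumption made here for simplicity), and the class `DedupProvider`, the block size, and SHA-256 as the fingerprint function are all hypothetical choices for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to trace


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a file into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def fingerprint(block: bytes) -> str:
    """Fingerprint of one block; SHA-256 is an assumed choice."""
    return hashlib.sha256(block).hexdigest()


class DedupProvider:
    """Hypothetical plaintext stand-in for the deduplication provider and
    the cloud storage combined. The real scheme stores encrypted
    fingerprints and matches them against a trapdoor instead."""

    def __init__(self) -> None:
        self.index: dict[tuple[str, str], int] = {}  # (privilege, fingerprint) -> position
        self.storage: list[bytes] = []               # stored blocks

    def upload(self, data: bytes, privilege: str) -> list[int]:
        """Return the storage position of every block, storing only new ones."""
        positions = []
        for block in split_blocks(data):
            key = (privilege, fingerprint(block))
            if key not in self.index:        # no duplicate under this privilege
                self.storage.append(block)
                self.index[key] = len(self.storage) - 1
            positions.append(self.index[key])
        return positions


provider = DedupProvider()
first = provider.upload(b"aaaabbbbaaaa", "staff")   # third block repeats the first
second = provider.upload(b"bbbbcccc", "staff")      # "bbbb" is already stored
print(first, second, len(provider.storage))
```

In this toy model, uploading the same block under a different privilege stores it again, which mirrors only the "same data block with the corresponding privilege" condition and none of the hierarchy.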
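The Merkle Hash Tree that contribution (3) builds on can be sketched generically as follows. This is a textbook MHT, not the verification scheme of the dissertation: signatures, the verification object format, and the completeness check are not modeled, and the function names, the use of SHA-256, and the rule of duplicating the last node on odd-sized levels are assumptions made for this illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash function for tree nodes; SHA-256 is an assumed choice."""
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Build all tree levels bottom-up; levels[-1][0] is the root.
    Odd-sized levels are padded by duplicating their last node."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
            levels[-1] = level  # keep the padded version for proof generation
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect the sibling hashes on the path from leaf `index` to the root.
    Each entry is (sibling_hash, sibling_is_left)."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # the neighbor within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from a returned record and its proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


records = [b"post-0", b"post-1", b"post-2"]  # hypothetical social data records
levels = build_levels(records)
root = levels[-1][0]                 # the value a data owner would sign
proof = make_proof(levels, 2)
print(verify(b"post-2", proof, root))    # genuine record
print(verify(b"post-2x", proof, root))   # modified record
```

A data consumer holding a trusted root can detect any modification of a returned record this way; detecting omitted records requires additional structure that this sketch does not attempt.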
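The idea of spreading location beliefs from seed users over a weighted transaction graph, as in contribution (4), can be illustrated for a single category. This sketch swaps in a plain weighted label propagation (each non-seed user repeatedly takes the weight-averaged beliefs of its neighbors); it is not the multi-layer belief propagation scheme of MLLI, and the user names, cities, edge weights, and the `propagate` function are hypothetical.

```python
def propagate(edges, seeds, locations, rounds=10):
    """Weighted label propagation on an undirected graph.

    edges:     {(user_a, user_b): positive weight}
    seeds:     {user: known location}
    locations: list of candidate locations
    Returns {user: {location: belief}}.
    """
    neighbors = {}
    for (u, v), w in edges.items():
        neighbors.setdefault(u, []).append((v, w))
        neighbors.setdefault(v, []).append((u, w))

    # Seeds start with certainty; everyone else starts uniform.
    beliefs = {}
    for user in neighbors:
        if user in seeds:
            beliefs[user] = {loc: 1.0 if loc == seeds[user] else 0.0
                             for loc in locations}
        else:
            beliefs[user] = {loc: 1.0 / len(locations) for loc in locations}

    for _ in range(rounds):
        updated = {}
        for user, nbrs in neighbors.items():
            if user in seeds:
                updated[user] = beliefs[user]  # seed beliefs stay fixed
                continue
            total = sum(w for _, w in nbrs)
            updated[user] = {
                loc: sum(w * beliefs[n][loc] for n, w in nbrs) / total
                for loc in locations
            }
        beliefs = updated
    return beliefs


# Hypothetical data: bob transacts often with alice and rarely with carol.
edges = {("alice", "bob"): 5.0, ("bob", "carol"): 1.0}
seeds = {"alice": "New York", "carol": "Los Angeles"}
beliefs = propagate(edges, seeds, ["New York", "Los Angeles"])
print(max(beliefs["bob"], key=beliefs["bob"].get))
```

Because the heavier edge dominates the average, the non-seed user inherits the location of the partner with the more intense transaction history, which is the intuition behind weighting edges by transaction pattern.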
Keywords/Search Tags: Cloud Storage, Personal Health Record, Social Network, Mobile Payment, Authenticated Search, Authenticity Verification, Location Inference