Research On Computational Integrity Of Open Mass Data Processing Service

Posted on: 2015-09-26 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Y Ding | Full Text: PDF
GTID: 1108330479979531 | Subject: Computer Science and Technology
Abstract/Summary:
Open mass data processing services play an important role in Big Data processing. Since open services are threatened both by intentionally compromised service providers and by security flaws within the distributed computing framework, the computational integrity of such services becomes an important issue. Existing research on this issue focuses mainly on the interior of the mass data processing framework, using replication-based mechanisms to verify the computing results of worker nodes and thereby guarantee computational integrity. However, replication introduces large overhead, and these interior verification schemes over the participating nodes cannot effectively deal with integrity violations caused by intentional cheating of the service provider.

This dissertation studies the computational integrity of open mass data processing services from two aspects: integrity verification and integrity assurance. Integrity verification is a post-hoc check, from the user's standpoint, of the integrity of the computational procedures and results supplied by the service provider, while integrity assurance is an active protection by the service provider that exploits trusted computing resources to ensure computational integrity. We take MapReduce, the dominant mass data processing model, as the main research object and take the characteristics of mass data processing into account. Aiming at reducing the overhead of integrity verification and strengthening integrity assurance, this dissertation systematically investigates a series of research issues concerning the computational integrity of open mass data processing services, and strives to improve the usability and efficiency of the proposed schemes. The main contents and contributions of this thesis are as follows.

First, third-party based computational integrity verification is investigated. In the cloud computing model, constructing a controllable supervision system over the security and quality of cloud services is a major challenge in trusted cloud service research, and third-party auditing is one of the important methods. In the MapReduce model, the computation of the Map phase, which processes the original input, is a critical part. We propose a mechanism of trusted-sampling based computational integrity verification for the Map phase. A third-party verifier, on behalf of the user, performs trusted sampling checks on the intermediate results of MapReduce and verifies the computational integrity of the Map phase with minor overhead. Moreover, considering the possibility of intentional cheating by the service provider, the Merkle tree technique is used to organize the verified information and ensure the authenticity of sampling.
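The abstract does not give implementation details, but the following minimal Python sketch illustrates the general idea of organizing Map-phase intermediate results in a Merkle tree so that a third-party verifier can check sampled records against a previously committed root. The function names, hashing layout, and record format are assumptions made for illustration, not the dissertation's actual design.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Fold leaf hashes pairwise up to a single root hash."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:                       # duplicate last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Collect sibling hashes from a sampled leaf up to the root."""
    level = [h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sib = index ^ 1                          # sibling position at this level
        proof.append((level[sib], sib < index))  # (hash, sibling-is-on-the-left)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_leaf(leaf, proof, root):
    """Third-party check: recompute the path and compare with the committed root."""
    node = h(leaf)
    for sibling, is_left in proof:
        node = h(sibling + node) if is_left else h(node + sibling)
    return node == root

# Toy example: intermediate (key, value) pairs emitted by Map tasks.
records = [f"word{i}\t1".encode() for i in range(8)]
root = merkle_root(records)       # committed by the service provider
proof = merkle_proof(records, 5)  # record sampled by the third-party verifier
assert verify_leaf(records[5], proof, root)
```

Under these assumptions, a provider that alters or omits sampled intermediate records cannot produce a valid authentication path for the committed root, which is the property the trusted-sampling scheme relies on.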
Second, we further investigate verification performed by the users themselves. Before a cloud service supervision system is well established, self-verification by users, which is transparent to the service provider, is another effective approach. We study a method of computational integrity self-verification based on monitoring probes. According to the type of MapReduce task, a set of monitoring probes is constructed and injected into the input data set; by observing the results produced for the probe data, the computational integrity of the task can be concluded with a certain probability. Since the scheme depends on the type of computing task, our research focuses on modeling the method and studying its characteristics, and proposes probe construction methods for several typical MapReduce computation problems. The proposed scheme verifies the integrity of both the Map and Reduce phases, with no requirement for the service provider's support. Meanwhile, the sampling-based principle keeps the verification overhead acceptable.

Third, the construction of a trusted MapReduce system in an open environment is investigated. From the service provider's perspective, when open computing resources from various trust domains are exploited, the computing results of all nodes need to be verified before being accepted. Existing research adopts replication and voting to address this issue; as a defense against collusion attacks, however, these solutions lack effectiveness and efficiency. We propose a scheme for constructing an anti-collusion MapReduce system in an open environment, which requires no extra re-computation but instead uses the historical replication information of tasks to detect both collusive and non-collusive attackers. An Integrity Attestation Graph is designed to describe the verification relationships between worker nodes, and malicious attackers are then precisely identified through maximum clique analysis. Furthermore, a heuristic algorithm for selecting verification pairs based on the Integrity Attestation Graph is proposed to improve the efficiency of malicious node identification.
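As a purely illustrative sketch of the attestation-graph idea described above, the Python fragment below connects nodes whose replicated task results agree, takes the maximum clique as the mutually consistent (presumably honest) set, and flags the remaining nodes. The record format, the graph construction, and the brute-force clique search are assumptions; the dissertation's heuristic selection algorithm is not detailed in the abstract.

```python
from itertools import combinations

# Hypothetical verification records: (task_id, node, result_digest).
# Two nodes that produced the same digest for a shared task are consistent.
records = [
    ("t1", "A", "x1"), ("t1", "B", "x1"),
    ("t2", "B", "x2"), ("t2", "C", "x2"),
    ("t3", "A", "x3"), ("t3", "C", "x3"),
    ("t4", "A", "x4"), ("t4", "D", "BAD"),   # D disagrees with honest node A
    ("t5", "D", "BAD"), ("t5", "E", "BAD"),  # colluding pair D, E agree with each other
]

nodes = sorted({n for _, n, _ in records})
by_task = {}
for task, node, digest in records:
    by_task.setdefault(task, {})[node] = digest

def consistent(u, v):
    """Attestation-graph edge: u and v share a task and never disagree on one."""
    shared = [t for t, d in by_task.items() if u in d and v in d]
    return bool(shared) and all(by_task[t][u] == by_task[t][v] for t in shared)

edges = {frozenset(p) for p in combinations(nodes, 2) if consistent(*p)}

def max_clique():
    """Brute force is fine for a handful of nodes; the thesis uses a heuristic instead."""
    for k in range(len(nodes), 0, -1):
        for cand in combinations(nodes, k):
            if all(frozenset(p) in edges for p in combinations(cand, 2)):
                return set(cand)
    return set()

trusted = max_clique()
suspects = [n for n in nodes if n not in trusted]
print("consistent majority:", sorted(trusted), "suspects:", suspects)
```

In this toy run the colluding pair D and E only attest to each other, so they form a smaller clique than the honest majority {A, B, C} and are reported as suspects, which is the intuition behind using maximum clique analysis against collusion.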
Finally, trustworthiness assessment for computing nodes is investigated. Numerous nodes are involved in a mass data processing environment, so replication-based verification of all nodes would induce a giant overhead. If the trustworthiness of nodes can be evaluated in advance at minor cost, replication-based verification needs to be applied only to the less trustworthy nodes, and computing efficiency can be improved greatly. We propose a trustworthiness assessment method based on monitoring probes. The results of the probe data are used to determine whether the probes have been computed correctly, and their execution paths can also be uncovered based on the principle of the Shuffle phase in MapReduce; a reputation mechanism is then used to evaluate the trustworthiness of nodes. The proposed trustworthiness assessment scheme is implemented at the application level and needs no modification to the computing framework. Based on the assessment, only the less trustworthy nodes are verified with highly accurate replication-based tests, so the resource requirement is actually lowered.

In summary, this thesis investigates several important issues concerning the computational integrity of open mass data processing services, and proposes a series of solutions with high usability, high detection rates, and low overhead. Theoretical analysis and extensive simulations demonstrate the effectiveness and performance of the proposed methods. The work is thus of theoretical significance and hopefully advances the development of trusted open mass data processing services.

Keywords/Search Tags: open mass data processing service, computational integrity, MapReduce, third-party verification, probe injection, anti-collusion, node trustworthiness assessment