
Research On MapReduce Secure Data Exchange Based On Trusted Execution Environment Technology

Posted on: 2022-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Zhang
Full Text: PDF
GTID: 2518306602992969
Subject: Computer Science and Technology
Abstract/Summary:
Cloud computing is a resource-sharing model based on virtualization technology. It provides tenants with a low-cost, customized service for obtaining public cloud virtualization resources. The MapReduce distributed framework is widely used on public cloud platforms as a parallel programming model for large-scale data processing. Because traditional public cloud platforms lack visibility to users, sensitive data may leak during the execution of MapReduce programs, which seriously threatens the confidentiality of users' cloud data. This paper focuses on protecting data confidentiality under the MapReduce framework in public cloud services and proposes two MapReduce secure data exchange solutions based on trusted execution environment technology, which address the leakage of associated sensitive data between distributed computing nodes.

This paper first designs and implements a distributed cluster architecture based on trusted execution environment technology, which uses trusted hardware to provide a trusted execution area for users' sensitive data and programs. Based on this architecture, the paper studies in depth the data exchange solutions between MapReduce distributed computing nodes. Existing technology uses a trusted execution environment to ensure the confidentiality and integrity of data and programs within distributed nodes, but the confidentiality of data exchange (shuffle) between nodes still faces the threat of side-channel attacks. This paper examines the existing data exchange solutions (such as Shuffle-In-The-Middle and Shuffle&Balance) and identifies their shortcomings in security, applicability, and performance.

In view of the security threats and deficiencies in existing research, and based on the security requirements of different application scenarios, this paper first considers the case in which the upper limit of the key-value types in the MapReduce job is known. For this case it defines a channel-correlation hiding model between trusted nodes and proposes a distributed inter-node data shuffle solution based on an attribute threshold. This solution reduces the performance overhead by about 69% compared with the existing Shuffle-In-The-Middle solution and by about 58% compared with the Shuffle&Balance solution, and it incurs about 36.5% additional overhead compared with standard MapReduce distributed tasks.

In addition, for the application scenario in which the upper limit of the key-value types in the MapReduce job is unknown, this paper proposes an improved inter-node data shuffle solution based on a Bloom filter. It improves the generality of the approach and de-associates the channel information from the input data set during shuffling. The experimental results show that the improved data shuffle solution reduces the performance overhead by about 25.8%-37% compared with the existing basic shuffle solution and incurs about 102% additional overhead compared with the standard distributed inter-node shuffle solution.

Both inter-node data shuffle solutions target the threat of side-channel attacks between MapReduce nodes and protect the confidentiality of data exchange during distributed computing while keeping the performance overhead acceptable. This paper proves through theoretical analysis that the two data shuffle solutions satisfy the definition of the MapReduce security model. Experimental comparison and analysis show that the solutions guarantee the confidentiality of user data in distributed computing at a certain performance cost. The two solutions address the limitations and deficiencies of existing solutions in terms of security, applicability, and performance, and achieve the goal of resisting side-channel attacks between nodes.
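
To make the side-channel concern concrete, the following is a minimal sketch of one generic countermeasure: padding every mapper-to-reducer channel to the same fixed size so that an observer of traffic volume cannot infer the key distribution. It is not the attribute-threshold or Bloom-filter scheme of the thesis, whose details the abstract does not give. The names `partition`, `pad_buckets`, `run_job`, the `DUMMY` marker, and the `pad_to` bound are all assumptions made for illustration; in a real trusted execution environment the records would also be encrypted so that dummy records are indistinguishable from real ones.

```python
import zlib
from collections import defaultdict

# Hypothetical dummy record; a real system would encrypt all records
# inside the enclave so dummies cannot be told apart on the wire.
DUMMY = ("__dummy__", None)


def partition(records, num_reducers):
    """Assign each (key, value) record to a reducer by a deterministic hash."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in records:
        idx = zlib.crc32(key.encode("utf-8")) % num_reducers
        buckets[idx].append((key, value))
    return buckets


def pad_buckets(buckets, pad_to):
    """Pad every bucket with dummies so all channels carry pad_to records."""
    padded = []
    for bucket in buckets:
        if len(bucket) > pad_to:
            raise ValueError("bucket exceeds the assumed padding bound")
        padded.append(bucket + [DUMMY] * (pad_to - len(bucket)))
    return padded


def reduce_counts(received):
    """Sum values per key, discarding dummy records."""
    totals = defaultdict(int)
    for key, value in received:
        if (key, value) == DUMMY:
            continue
        totals[key] += value
    return dict(totals)


def run_job(mapper_outputs, num_reducers, pad_to):
    """Run a padded shuffle; return the result and the observable channel sizes."""
    channels = [
        pad_buckets(partition(out, num_reducers), pad_to)
        for out in mapper_outputs
    ]
    # What a network observer would see: one size per mapper-reducer channel.
    observed_sizes = [[len(bucket) for bucket in ch] for ch in channels]
    result = {}
    for r in range(num_reducers):
        received = [rec for ch in channels for rec in ch[r]]
        result.update(reduce_counts(received))
    return result, observed_sizes


if __name__ == "__main__":
    outputs = [[("a", 1), ("b", 1), ("a", 1)], [("c", 1)]]
    result, sizes = run_job(outputs, num_reducers=2, pad_to=4)
    print(result)
    print(sizes)
```

Uniform padding of this kind is simple but wasteful, since every channel pays for the worst case; the overhead figures quoted in the abstract suggest that the thesis's solutions aim to hide the same correlation at lower cost.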
Keywords/Search Tags: Public cloud, MapReduce, Trusted execution environment, Data confidentiality, Side-channel attack