Research On Data Security Safeguard Mechanism Based On Hadoop

Posted on: 2021-11-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Wu
GTID: 2518306473974519
Subject: Information security

Abstract/Summary:
As a typical open-source framework for distributed data storage and processing, Hadoop has become one of the tools used for commercial big data processing. The application and development of the Hadoop platform face many key issues, among which Hadoop security has become a focus of attention. The platform has security vulnerabilities in both distributed data storage and parallel processing. Malicious users can exploit these vulnerabilities to obtain data or attack the platform, threatening its sensitive data and personal privacy. This thesis analyzes the security threats and security mechanisms of the Hadoop platform. The main work is as follows:

(1) The existing security mechanisms and security components of Hadoop are analyzed in terms of four aspects: authentication, authorization, encryption, and audit. The network attacks to which the Hadoop platform may be subjected are also analyzed, and a targeted detection method is proposed for each.

(2) Based on the structural characteristics of the core component HDFS, the security threats that may exist when no security mechanism is applied are analyzed, and an optimization scheme is proposed for HDFS transparent encryption. The scheme focuses on securing the key management server (KMS) in order to protect both HDFS data and the KMS itself. It has three parts: a hybrid identity authentication mechanism is added to verify identity; interfaces are added at each end to reduce the key management load once the KMS uses HTTPS for secure transmission, thereby strengthening key protection; and an access control list (ACL) is configured to give users fine-grained access authorization. The required security functions are designed and implemented experimentally, and a security analysis shows that the scheme protects the KMS and HDFS data.

(3) Based on the structural characteristics and data processing mechanism of the core component MapReduce, the security threats that exist when no security mechanism is applied are analyzed, and a MapReduce parallel encryption scheme based on Salsa20 is proposed. In this scheme, the MapReduce task processes the data stored in HDFS and uses the stream cipher Salsa20 to encrypt and decrypt each data block before it reaches the DataNode for storage, rather than encrypting and decrypting the entire file. This exploits the parallelism of MapReduce to encrypt and decrypt blocks in parallel and so improves data processing performance. The parallel encryption system is implemented in code and consists of an encryption module, a decryption module, and an algorithm module. Salsa20 is implemented in the algorithm module, which is called by the encryption module or the decryption module. The system is not limited to Salsa20: other encryption algorithms can be added as needed, which makes it flexible and extensible. Experiments and data analysis show that the system secures MapReduce data while maintaining data processing efficiency.
Keywords/Search Tags: Hadoop, security threat, security mechanism, HDFS, MapReduce, encryption