Research On Data Security Protection Mechanism Based On HDFS

Posted on: 2017-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: L Q Sun
GTID: 2308330485484545
Subject: Computer system architecture
Abstract/Summary:
As data has become an important economic resource, people are increasingly concerned with its security, privacy, and proper use. However, the main big data security and privacy mechanisms currently in use at home and abroad, including authentication, access control, intrusion detection, and security information and event management, do not accurately track the lifecycle of data or user behavior. By recording the evolution of data products, provenance forms metadata that accurately expresses the characteristics of the data and contains the history of each object, providing guidance for complex data analysis and for interpreting data systems.

Taking database systems and the distributed file system HDFS as representative data-intensive systems, with database tables and HDFS files as the objects of study, this thesis proposes applying data provenance to big data security protection. After an in-depth analysis of provenance systems and models at home and abroad and an exploration of the issues and challenges of big data provenance, it designs and implements a big data security provenance system.

The system consists of four components: a provenance collection and standardization component, a provenance compression component, a provenance storage component, and a provenance visualization and analysis component.

The provenance collection and standardization component defines a compatible, standard collection format to cope with the diversity of source data systems, and implements secure remote data acquisition, data version control, and a complete mapping from audit records to provenance models.

The provenance compression component proposes an efficient semantics-based compression algorithm to address the problem of storing metadata efficiently. Through an in-depth analysis of the record features of the data systems, it saves storage space by identifying the characteristics of invalid data and filtering that data out. By identifying the basic characteristics of repeated or similar actions, it merges large numbers of similar operations that occur at the same time, which lightens the system load and reduces the complexity of data analysis, speeding up data visualization.

The provenance storage component provides an extensible scheme for storing metadata that supports access to different database systems, data backup, and migration between structured and unstructured data.

The provenance visualization and analysis component builds on graph-based storage of big data to provide data access by user, file, process, and other dimensions, as well as data mapping and data presentation in formats including DOT and XML, giving users comprehensive, friendly, and reusable results.

Experimental results show that the proposed data security safeguards can be combined effectively with the HDFS file system. They make full use of metadata to verify a document's life cycle, its data sources, and its complex historical behavior, thereby forming a complete information disclosure mechanism.
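The two steps of the compression component described above, filtering out invalid records and merging similar operations that occur close together in time, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the record fields, the set of operations treated as invalid, the 60-second merge window, and the names `AuditRecord`, `MergedEvent`, `compress`, and `to_dot` are all assumptions made for illustration. The last function shows one plausible way merged events might be rendered as DOT text for visualization.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class AuditRecord:
    """One audit entry (hypothetical shape, not the thesis's format)."""
    timestamp: int   # seconds
    user: str
    operation: str   # e.g. "open", "create", "delete"
    path: str


@dataclass
class MergedEvent:
    """A run of identical operations collapsed into one provenance event."""
    start: int
    end: int
    user: str
    operation: str
    path: str
    count: int


# Assumption: metadata lookups carry no provenance value and count as "invalid".
IGNORED_OPS: FrozenSet[str] = frozenset({"getfileinfo", "listStatus"})


def compress(records: Iterable[AuditRecord], window: int = 60,
             ignored_ops: FrozenSet[str] = IGNORED_OPS) -> List[MergedEvent]:
    """Filter ignored operations, then merge repeats within `window` seconds."""
    merged: List[MergedEvent] = []
    latest: Dict[Tuple[str, str, str], MergedEvent] = {}
    for rec in sorted(records, key=lambda r: r.timestamp):
        if rec.operation in ignored_ops:
            continue  # step 1: drop records considered invalid
        key = (rec.user, rec.operation, rec.path)
        event = latest.get(key)
        if event is not None and rec.timestamp - event.end <= window:
            # step 2: same user, operation, and path soon after: merge
            event.end = rec.timestamp
            event.count += 1
        else:
            event = MergedEvent(rec.timestamp, rec.timestamp, *key, 1)
            latest[key] = event
            merged.append(event)
    return merged


def to_dot(events: Iterable[MergedEvent]) -> str:
    """Render merged events as a DOT digraph: user -> file, labeled by operation."""
    lines = ["digraph provenance {"]
    for e in events:
        lines.append(f'  "{e.user}" -> "{e.path}" [label="{e.operation} x{e.count}"];')
    lines.append("}")
    return "\n".join(lines)


records = [
    AuditRecord(0, "alice", "open", "/data/a"),
    AuditRecord(10, "alice", "open", "/data/a"),
    AuditRecord(20, "alice", "getfileinfo", "/data/a"),
    AuditRecord(30, "bob", "create", "/data/b"),
    AuditRecord(200, "alice", "open", "/data/a"),
]
events = compress(records)
dot_text = to_dot(events)
```

In this sketch the two `open` calls ten seconds apart collapse into one event with a count of 2, the metadata lookup is dropped, and the later `open` starts a new event because it falls outside the window. A real system would need a more careful definition of "invalid" and "similar", since over-aggressive merging could hide security-relevant behavior.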
Keywords/Search Tags: Big data provenance, PROV model, Provenance compression, Visual analysis, Storage middleware