
Research On Mimic Architecture And Key Technologies Of Distributed Storage System

Posted on: 2020-12-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Guo
GTID: 1368330620953191
Subject: Information and Communication Engineering
Abstract/Summary:
To meet the growing demands of data applications in the era of big data, storage systems are evolving from the traditional centralized architecture to a distributed architecture. The original functional tasks are decoupled into a metadata service and a data service, which significantly improves the system's scale-out capability, parallel service capability, and disaster tolerance. However, under the centralized service mode represented by the cloud data center, the data security of distributed storage systems faces unprecedented challenges. The main causes are single points of failure, the difficulty of comprehensive defense, the inability to thoroughly check for ubiquitous vulnerabilities and backdoors, and the aggravated security risks brought by open sharing. In response, researchers have put forward a series of protection methods, such as introducing traditional defense means, designing new security architectures, developing self-controlled or trusted hardware platforms, and encrypting data. These methods have limitations, however, especially their inability to deal effectively with unknown threats arising from unknown vulnerabilities and backdoors.

Cyberspace mimic defense (CMD) is a proactive defense method proposed in China. Targeting the "gene defects" of current information systems, namely their static, similar, and single attributes, CMD introduces mechanisms of dynamicity, heterogeneity, and redundancy so that the reformed system can protect against vulnerabilities and backdoors. In recent years, the research, analysis, and application testing of CMD have achieved good results, and its effectiveness and feasibility have been verified at both the theoretical and engineering levels. Introducing the ideas and mechanisms of CMD into the distributed storage architecture to counter unknown vulnerabilities and backdoors can therefore make up for the limitations and shortcomings of existing protection methods and improve the current severe data security situation.

Based on these considerations, this dissertation relies on the National Natural Science Foundation's innovation research group project "Research on the basic theory of cyberspace mimic defense" and the general project "Research on the heterogeneous redundancy mechanism in mimic security on cyberspace" to study the mimic architecture and key technologies of the distributed storage system. Taking the Hadoop distributed file system (HDFS) of the Hadoop big data platform as an example, the dissertation first proposes and implements a mimic structure based on the principle of "defending strategic point", and then further explores the efficiency and robustness of the scheduling mechanism, the credibility of the arbitration mechanism, and the problem of differential placement for data replicas. The main research results are as follows:

1. To address single points of failure and the difficulty of comprehensive defense in the distributed storage system, the system's defense against vulnerabilities and backdoors is structurally enhanced by introducing the dynamic heterogeneous redundancy (DHR) model of mimic defense and its related security mechanisms. First, the dissertation analyzes the main threats and attack routes faced by the distributed storage system, locates its "core weak points," and proposes a feasible security construction method that weighs the cost of protection against its effectiveness. Second, with HDFS as the target object, a metadata service-oriented mimic structure is designed. It protects the pivotal information and functions of the system by building a DHR structure for the metadata service, and protects user data through the differential placement of replicas. Then, the security gain of the new architecture is assessed through theoretical analysis. Finally, tests on a prototype system verify the security improvement that CMD brings to the distributed storage system and evaluate the impact of its performance overhead.

2. For the scheduling mechanism of the DHR metadata service structure, a scheduling sequence control method based on a sliding window is proposed. First, the feedback scheduling process of the DHR structure is described and analyzed to give the corresponding threat model and the evaluation metrics of concern. Then, the sliding window mechanism from computer networking is introduced into scheduling sequence control. Time-driven and exception-driven events trigger the "sliding" action of the window (i.e., updating the scheduling control parameters), so that the dynamic internal operating state and the external attack environment are coordinated through continuous adjustment and adaptation. Finally, experiments under different conditions evaluate the necessity of research on scheduling sequence control, the effectiveness of the method, and its performance compared with existing methods. The results show that the method can effectively solve a series of problems faced in the scheduling sequence control of CMD. Through adaptive adjustment, it provides better security, operating efficiency, and robustness for the DHR structure under complex and changeable internal and external conditions.

3. For the arbitration mechanism of the DHR metadata service structure, the dissertation analyzes the problems of confidence skewing and confidence cheating that arise when confidence is calculated from historical performance, and proposes a correction method to improve the credibility of the arbitration mechanism. First, the dissertation examines the irrationality of evaluating confidence from monotone statistics in current mimic arbitration methods. Through two simple cases, it analyzes the phenomenon of confidence skewing and the corresponding attack mode that maliciously exploits it, confidence cheating. Then, a confidence correction method based on the logistic function is proposed to make the confidence calculation process more rational. The method considers how the influence of external attacks changes over time, classifies the decision results in different historical stages, and filters the noise of "overheated" abnormal output types. Experimental evaluation shows that the method can effectively alleviate the impact of confidence skewing and cheating, improving the credibility of the mimic arbitration mechanism.

4. To address data protection in the distributed storage system, the dissertation studies a data replica placement method based on heterogeneous clusters. First, the security threat model and the HDFS system model are described, and the evaluation metrics of the replica placement method are defined in terms of both security and business performance. Then, the dissertation proposes a programming model of data replica placement and a primary-objective greedy and random search algorithm to reduce the complexity of solving it. By modeling the differences between vulnerabilities and the performance of cluster nodes, the target search set is obtained, and the replicas are placed on nodes that are conducive to survival and processing balance, thus improving the integrity and availability of data. The experimental results show that the method can effectively reduce the data damage rate when an attack occurs. It also maintains high security gains when the external attack capability increases or when cluster heterogeneity is limited, and it performs well when processing parallel tasks.
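
The idea behind heterogeneity-based replica placement can be illustrated with a small sketch. It is not the dissertation's programming model or its greedy and random search algorithm: it shows only a plain greedy step, with no random search, under the assumption that each node carries a known set of vulnerability identifiers and a load value. The `Node` class, its fields, and `place_replicas` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    vulns: frozenset  # hypothetical: vulnerability identifiers of the node's platform
    load: float       # hypothetical: current load in [0, 1]


def place_replicas(nodes, replica_count):
    """Greedily pick nodes that share as few vulnerabilities as possible."""
    if replica_count > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")
    chosen = []
    covered = set()  # vulnerabilities already present among the chosen nodes
    candidates = list(nodes)
    for _ in range(replica_count):
        # Primary objective: least overlap with vulnerabilities already covered.
        # Secondary objective: lower load, to keep processing balanced.
        # The node name is a final tie-breaker so the result is deterministic.
        best = min(candidates,
                   key=lambda n: (len(n.vulns & covered), n.load, n.name))
        chosen.append(best)
        covered |= best.vulns
        candidates.remove(best)
    return chosen


# Illustrative data only; the node names and vulnerability labels are made up.
example_nodes = [
    Node("dn1", frozenset({"v1", "v2"}), 0.2),
    Node("dn2", frozenset({"v1", "v2"}), 0.1),
    Node("dn3", frozenset({"v3"}), 0.7),
    Node("dn4", frozenset({"v1", "v4"}), 0.5),
]
picked = place_replicas(example_nodes, 3)
print([n.name for n in picked])  # ['dn2', 'dn3', 'dn4']
```

In this example the second replica goes to `dn3` rather than the lightly loaded `dn1`, because `dn1` shares every vulnerability with the first choice `dn2`. A single exploit is then less likely to damage all replicas of a block at once.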
Keywords/Search Tags:Distributed storage system, mimic defense, mimic architecture on metadata-service, robust scheduling, confidence correction, heterogeneity-based replica placement