Research On Data Distribution Evolution And Related Key Technique For Mass Storage System

Posted on: 2011-11-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y D Wang
Full Text: PDF
GTID: 1118360305992260
Subject: Computer system architecture
Abstract/Summary:
Driven by the rapid development of information technology, storage systems are changing in both size and architecture, evolving toward large-scale and complex designs. At the same time, the I/O workloads they serve have become diverse, unbalanced, and dynamic. However, current mass storage systems often inherit their structure and operating mechanisms directly from traditional small-scale storage systems, so they struggle to meet the dynamic, parallel, and diverse requirements of large-scale I/O workloads. The physical and logical organization of existing storage systems is static, which makes it difficult for them to perceive the characteristics of external workload requests and dynamic changes within the system. As a result, such a system cannot adjust its storage organization to follow temporal and spatial changes in the I/O workload, and it cannot automatically and effectively improve its overall efficiency. To address these problems, a data distribution evolution mechanism for mass storage systems is designed. The characteristics of dynamic data access workloads are analyzed, and a heat model is used to predict future access trends from the history of data access records. Data of different heat is matched to resource groups of different performance levels through dynamic data migration and redistribution, in order to improve overall storage efficiency. The data distribution evolution process is fully automated: it is controlled by evolutionary rules and scheduled through a rule management system.

Evolution techniques for large-scale storage systems, which automatically adjust the storage organization model according to the current running environment, are described.
While running, the system can automatically select the organizational model best suited to the current I/O workload and system status, so as to satisfy the performance and reliability requirements of workloads in a multi-user environment. The study covers evolutionary methods for the physical structure of the system, for its logical structure, and for data distribution. A data distribution mechanism is designed specifically for system evolution. A heat computing model of data access is built to quantify and predict workload hot spots. Unlike general data heat research, which considers only the frequency of data access, the improved heat computation also takes the time series of accesses into account, so that it reflects the workload history more effectively and predicts future I/O workload trends more accurately. Heat is analyzed and defined separately for files and LUNs. The heat computing model is tested with real trace data, and the results of the heat equation are analyzed in depth against the actual data. The results show that data heat is positively correlated with the number and frequency of accesses and negatively correlated with the interval between accesses. The experimental results also show that the heat formula forecasts the trend of future access behavior well.

A data migration mechanism is designed for the data distribution evolution method. During data distribution evolution, the system must dynamically adjust the data distribution to follow changes in the system workload and improve overall efficiency. In conventional designs, storage resources are graded by RAID level or RAID group. In the evolutionary storage system, however, all storage resources are graded according to their performance and reliability characteristics.
Based on the locality principle of program access, hot data is matched to the appropriate grade of storage resources according to its behavioral characteristics and needs, so that the resources of different storage pools are used effectively and the overall efficiency of the evolutionary storage system improves significantly. The data migration strategy defines trigger conditions and migration costs, and a data replacement strategy is designed for the evolutionary storage system. In the experiments, a simulation system is used to validate the effect of tiered-storage data migration on performance.

An independent evolutionary rules management system is designed to automate the management of mass storage systems. In large-scale storage systems, physical system management and the logical organization and distribution of massive data are extremely complex and dynamic, so relying solely on manual management is not feasible, and a rule-based system is needed to manage and schedule the running system. In conventional systems, all rule parameters are hard-coded, which makes rules very difficult to define, change, and query. In the rules management system, rules are defined through terms and managed with imported decision tables and decision trees, which makes defining, querying, and changing rules flexible, clear, and fast; rule references are also recorded to support statistics and analysis of rule usage.

The design and construction of an evolutionary storage system that can adapt to its running environment is studied. New approaches are attempted in the data access characteristics for data distribution evolution, the data migration mechanism, and evolutionary rules administration. The experiments show good results.
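
The abstract gives neither the heat formula nor the rule schema, so the following Python sketch is only an illustration of how the two ideas could fit together, not the dissertation's method. It assumes a heat score in which every past access contributes a weight that decays exponentially with age (so heat rises with the number and frequency of accesses and falls as the interval since access grows), and a small decision table whose rows hold the migration parameters as data instead of hard-coding them. The names `heat`, `Rule`, `RuleTable`, the half-life, the thresholds, and the action strings are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Iterable, List


def heat(access_times: Iterable[float], now: float, half_life: float = 3600.0) -> float:
    """Hypothetical heat score: each past access adds a weight that halves every `half_life` seconds."""
    decay = math.log(2) / half_life
    # Accesses stamped later than `now` are ignored.
    return sum(math.exp(-decay * (now - t)) for t in access_times if t <= now)


@dataclass
class Rule:
    """One decision-table row; the fields are illustrative, not the dissertation's schema."""
    name: str
    from_tier: int    # tier the data currently occupies (0 = fastest)
    min_heat: float   # inclusive lower bound on heat
    max_heat: float   # exclusive upper bound on heat
    action: str       # what the scheduler should do when the row matches
    hits: int = 0     # how often the row fired, kept for usage statistics

    def matches(self, heat_value: float, tier: int) -> bool:
        return tier == self.from_tier and self.min_heat <= heat_value < self.max_heat


class RuleTable:
    """Evaluates rows in order and returns the action of the first match."""

    def __init__(self, rules: List[Rule]) -> None:
        self.rules = list(rules)

    def decide(self, heat_value: float, tier: int, default: str = "stay") -> str:
        for rule in self.rules:
            if rule.matches(heat_value, tier):
                rule.hits += 1
                return rule.action
        return default

    def usage(self) -> Dict[str, int]:
        return {rule.name: rule.hits for rule in self.rules}


if __name__ == "__main__":
    # Assumed thresholds: promote above heat 2.0, demote below heat 0.5.
    table = RuleTable([
        Rule("promote-hot", from_tier=1, min_heat=2.0, max_heat=math.inf, action="move_to_tier_0"),
        Rule("demote-cold", from_tier=0, min_heat=0.0, max_heat=0.5, action="move_to_tier_1"),
    ])
    score = heat([100.0, 2000.0, 3500.0], now=3600.0)
    print(table.decide(score, tier=1))
    print(table.usage())
```

Keeping the thresholds in table rows means a rule can be changed or queried without touching the scheduling code, and the `hits` counter gives a simple record of which rules are actually used.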
Keywords/Search Tags: Evolutionary storage system, Data distribution evolution, Hot spot, Data migration, Evolutionary rules