
Research And Application On Massive Data Storage And Management

Posted on: 2018-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
Full Text: PDF
GTID: 2348330518995316
Subject: Information and Communication Engineering
Abstract/Summary:
In the era of big data, people derive great value from massive data sets. However, as data grows exponentially, storing and managing it efficiently has become a serious problem for enterprises. A recently emerging strategy is to store data according to its access frequency. Cold data, which is accessed infrequently, usually accounts for over 80% of all data, and it is the focus of this research: we aim to develop a new system that stores and manages cold data at low cost. First, this thesis summarizes the background and research status of cold data storage, including cold data storage schemes and the characteristics of cold data storage. Second, the requirements of the system are introduced in detail and compared with existing solutions, showing how far those solutions fall short of our requirements. We then present the ideas behind the system design, the system architecture, and its workflow. Next, we describe the implementation: the features of the client program, the core modules of the system, and how fault tolerance and distribution are achieved. We then turn to Reed-Solomon codes, which the system uses to ensure data integrity. The principle and implementation of classical Reed-Solomon codes are introduced, followed by an improvement based on the Cauchy matrix and a further method that converts multiplication and division into XOR operations to speed up the algorithm. We also run experiments on all three schemes. Then the data management system, built on our internal requirements, is introduced. To be compatible with HDFS, the cold storage system implements the HDFS FileSystem API. Next, we present a performance test of the whole system, introducing the experimental environment and the design of the test program. We use charts and tables to show the I/O performance and power consumption of the system. Finally, we summarize our work on the system and outline directions for improvement.
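The Cauchy-matrix variant of Reed-Solomon coding mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the field GF(2^8), the polynomial 0x11d, the choice of Cauchy points, and all function names are assumptions made for illustration, and field multiplication here uses log/exp tables rather than the XOR conversion the thesis describes.

```python
# Illustrative Cauchy Reed-Solomon erasure coding over GF(2^8).
# Assumptions: primitive polynomial 0x11d, generator 2, k + m <= 256.
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements via log/exp tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    """Multiplicative inverse of a nonzero field element."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    return EXP[255 - LOG[a]]


def cauchy_matrix(m, k):
    """m x k Cauchy matrix with entries 1 / (x_i + y_j), x_i = i, y_j = m + j."""
    return [[gf_inv(i ^ (m + j)) for j in range(k)] for i in range(m)]


def combine(rows, shards):
    """Apply each coefficient row to the shards (byte-wise linear combination)."""
    n = len(shards[0])
    result = []
    for row in rows:
        out = bytearray(n)
        for coef, shard in zip(row, shards):
            for t in range(n):
                out[t] ^= gf_mul(coef, shard[t])
        result.append(bytes(out))
    return result


def encode(data, m):
    """Return m parity shards for a list of k equal-length data shards."""
    return combine(cauchy_matrix(m, len(data)), data)


def gf_matrix_invert(matrix):
    """Invert a square matrix over GF(2^8) by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [int(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = gf_inv(aug[col][col])
        aug[col] = [gf_mul(inv, v) for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [v ^ gf_mul(f, w) for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def decode(shards, k, m):
    """Rebuild the k data shards from any k survivors.

    shards maps shard index to bytes: 0..k-1 are data, k..k+m-1 are parity.
    """
    generator = [[int(i == j) for j in range(k)] for i in range(k)]
    generator += cauchy_matrix(m, k)
    idx = sorted(shards)[:k]
    inverse = gf_matrix_invert([generator[i] for i in idx])
    return combine(inverse, [shards[i] for i in idx])
```

With three data shards and two parity shards, any two shards can be lost: passing the three survivors to `decode` as an index-to-bytes mapping reconstructs the original data, because every square submatrix of a Cauchy matrix is invertible.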
Keywords/Search Tags: cold data, distributed storage, Reed-Solomon codes, low power consumption, massive data management