Research On Big Data Storage Mechanism For Column Database

Posted on: 2019-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: K F Li
GTID: 2348330542998833
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, with the rapid development of Internet technology, data volumes have grown explosively, continually challenging big data storage technology. As an important part of big data applications, data storage has attracted sustained attention from industry and academia. Distributed storage spreads big data across multiple server nodes. Compared with a traditional centralized database, it not only increases storage capacity and data processing capability but also offers higher fault tolerance and reliability. HDFS is currently one of the most popular distributed storage systems. It largely keeps the amount of data stored on each node balanced and provides high storage reliability. However, for column-oriented data, HDFS does not guarantee that the data of each field is balanced across nodes, and its replica placement scheme does not achieve the highest possible storage reliability. This thesis addresses these two problems by proposing a new distributed storage algorithm and implementing a corresponding balancing mechanism. The main work of this thesis is as follows:
(1) Because the HDFS distributed storage algorithm cannot guarantee that the data of each field in a column database is balanced across nodes, this thesis proposes a field-balanced distributed storage algorithm, under which the amount of data stored for each field differs between any two nodes by at most one data block.
(2) During distributed storage, the different replicas of the same data block are placed across switches, which improves storage reliability.
(3) For the case where new nodes join the cluster, this thesis proposes a load balancing algorithm that preserves cross-switch replica placement while bringing the cluster back to a field-balanced state. Experiments show that with this algorithm the amount of data stored for each field differs between nodes by no more than one data block.
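The abstract does not give the algorithm itself, so the following is only a minimal illustrative sketch of the two placement goals it states: per-field balance across nodes and cross-switch replica placement. The function name `place_blocks`, its parameters, and the greedy tie-breaking rule are all assumptions made for illustration and are not taken from the thesis. The greedy rule is also not guaranteed to hold the one-block bound in every topology (for example, when switches have very uneven node counts).

```python
from collections import defaultdict


def place_blocks(blocks_per_field, node_switch, replicas=2):
    """Greedy placement sketch (hypothetical, not the thesis algorithm).

    blocks_per_field: dict mapping field name -> number of data blocks
    node_switch:      dict mapping node name  -> switch name
    replicas:         number of copies stored for each block
    """
    if replicas > len(node_switch):
        raise ValueError("more replicas requested than nodes available")

    field_load = defaultdict(lambda: defaultdict(int))  # field -> node -> blocks
    total_load = defaultdict(int)                       # node -> blocks overall
    placement = {}                                      # (field, index) -> nodes

    for field, n_blocks in blocks_per_field.items():
        for index in range(n_blocks):
            chosen = []
            used_switches = set()
            for _ in range(replicas):
                free = [n for n in node_switch if n not in chosen]
                # Prefer nodes behind a switch this block has not used yet.
                fresh = [n for n in free if node_switch[n] not in used_switches]
                pool = fresh or free
                # Least loaded for this field first, then least loaded overall,
                # then by name so the result is deterministic.
                node = min(
                    pool,
                    key=lambda n: (field_load[field][n], total_load[n], n),
                )
                chosen.append(node)
                used_switches.add(node_switch[node])
                field_load[field][node] += 1
                total_load[node] += 1
            placement[(field, index)] = chosen

    return placement, field_load
```

Calling `place_blocks({"a": 5, "b": 3}, {"n1": "s1", "n2": "s1", "n3": "s2", "n4": "s2"})` on this small, symmetric assumed cluster places every block's two replicas behind different switches and keeps each field's per-node block counts within one block of each other.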
Keywords: Data storage, Field equalization, Reliability, Load balancing