
Load Balancing Technology Research Based On Network Coding In Cloud Storage System

Posted on: 2014-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Lv
GTID: 2308330482952240
Subject: Computer technology
Abstract/Summary:
We live in a digital information age in which Internet technology has seen unprecedented development and wide application. The IT industry's focus has shifted from devices and applications to information-centric data storage services. The main producers of data have gradually shifted from enterprises to individual users, and the vast majority of individual data consists of pictures, documents, video and other unstructured data. Limited in scalability, availability, fault tolerance and access latency, traditional IT storage technology struggles in the face of mass storage.
Cloud storage can expand capacity simply by adding unit nodes. Coupled with its high availability, reliability and secure storage, this has led a growing number of large companies to turn to distributed cloud storage to stay competitive. Well-known examples include Hadoop, Apache's open source implementation of GFS; Microsoft's Live SkyDrive; Dropbox; and Amazon's Dynamo, a highly available and scalable distributed data storage system. Thanks to its good features and open source implementation, HDFS (Hadoop Distributed File System) has gradually become a mainstream platform for supporting basic cloud storage applications.
HDFS achieves redundancy through multi-copy replication, at the cost of storage space and bandwidth, and this wastes transmission resources. As the data volume grows, resource consumption increases linearly, so in a file system that is operated frequently over the long run, the cost of later data cleanup and maintenance becomes a concern. This thesis is based on NC-HDFS, a system that adds a slicing and coding mechanism to HDFS. The mechanism greatly reduces redundancy and adapts better to storage expansion, and the (n, k) strategy can be changed to suit different slice-size requirements.
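The storage saving of an (n, k) strategy over multi-copy replication can be illustrated with a little arithmetic. The sketch below is not taken from the thesis: it assumes a file is cut into k slices and encoded into n coded blocks of the same size, any k of which suffice to rebuild the file, and the parameter values (3 replicas, which is the HDFS default replication factor, and a hypothetical (6, 4) code) are chosen only for illustration.

```python
def replication_overhead(replicas: int) -> float:
    """Bytes stored per byte of user data with full-copy replication."""
    if replicas < 1:
        raise ValueError("need at least one copy")
    return float(replicas)


def coded_overhead(n: int, k: int) -> float:
    """Bytes stored per byte of user data with an (n, k) code.

    Assumption: the file is split into k equal slices and encoded into
    n blocks of that same slice size, so n/k times the data is stored.
    """
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return n / k


def tolerated_losses(n: int, k: int) -> int:
    """Blocks that may be lost, assuming any k of the n blocks can decode."""
    return n - k


file_size_mb = 1024  # hypothetical 1 GiB file

replicated_mb = file_size_mb * replication_overhead(3)
coded_mb = file_size_mb * coded_overhead(6, 4)

print(f"3-way replication: {replicated_mb:.0f} MB stored, survives 2 lost copies")
print(f"(6, 4) coding:     {coded_mb:.0f} MB stored, "
      f"survives {tolerated_losses(6, 4)} lost blocks")
```

Under these assumptions both schemes survive the loss of two nodes, but the coded layout stores half as much data, which is the kind of redundancy reduction the abstract refers to.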
In addition, the coding provides a repair-based fault tolerance mechanism that improves system robustness. Because of these new features, the improved system needs a load balancing mechanism to optimize reads and writes. The main contributions of this thesis are the following three points:
(1) Based on the new features that network coding brings to HDFS as a distributed storage system, and on the shortcoming that HDFS monitoring collects only disk capacity parameters, a resource monitoring module is designed and implemented. It supports real-time collection and management of parameters such as CPU, memory and disk I/O utilization, and provides the decision-making basis for dynamic load balancing.
(2) The problem of updating resource status information in real time is solved. The heartbeat protocol is further optimized, together with the name node data structure that stores state and performance information about the HDFS data nodes, so that the name node can receive and process the newly added node metrics in time. Multi-dimensional information can be updated at the same time, which lets the name node stay aware of dynamic changes on the data nodes.
(3) For the read and write requirements of NC-HDFS, a dynamic request scheduling mechanism based on multi-attribute constraints is established. When a file is written, it selects the n nodes on which the data blocks are stored; when a file is read, it selects k of those n nodes. This effectively improves the efficiency of reading and writing files and balances resource usage within the system.
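The node selection in contribution (3) can be sketched as follows. The abstract does not give the actual scheduling algorithm, so this is only one plausible reading: each data node reports CPU, memory, disk I/O and disk usage (as the extended heartbeat would), a weighted sum turns them into a single load score, and the least loaded nodes are picked. The `NodeStatus` type, the `WEIGHTS` values and the sample cluster are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeStatus:
    """Metrics a data node might report in a heartbeat (hypothetical layout)."""
    name: str
    cpu: float        # CPU utilization, 0.0 to 1.0
    memory: float     # memory utilization, 0.0 to 1.0
    disk_io: float    # disk I/O utilization, 0.0 to 1.0
    disk_used: float  # fraction of disk capacity in use, 0.0 to 1.0


# Assumed weights; the thesis's real multi-attribute constraints are not given.
WEIGHTS = {"cpu": 0.3, "memory": 0.2, "disk_io": 0.3, "disk_used": 0.2}


def load_score(node: NodeStatus) -> float:
    """Weighted sum of the node's metrics; lower means less loaded."""
    return sum(weight * getattr(node, attr) for attr, weight in WEIGHTS.items())


def pick_least_loaded(nodes: list[NodeStatus], count: int) -> list[NodeStatus]:
    """Return the `count` nodes with the lowest load score."""
    if count > len(nodes):
        raise ValueError("not enough nodes to choose from")
    return sorted(nodes, key=load_score)[:count]


def choose_write_nodes(cluster: list[NodeStatus], n: int) -> list[NodeStatus]:
    """Write path: pick n nodes to hold the n coded blocks of a file."""
    return pick_least_loaded(cluster, n)


def choose_read_nodes(holders: list[NodeStatus], k: int) -> list[NodeStatus]:
    """Read path: pick k of the n block holders to serve the read."""
    return pick_least_loaded(holders, k)


cluster = [
    NodeStatus("dn1", cpu=0.9, memory=0.8, disk_io=0.7, disk_used=0.5),
    NodeStatus("dn2", cpu=0.2, memory=0.3, disk_io=0.1, disk_used=0.4),
    NodeStatus("dn3", cpu=0.5, memory=0.5, disk_io=0.5, disk_used=0.5),
    NodeStatus("dn4", cpu=0.1, memory=0.2, disk_io=0.3, disk_used=0.9),
    NodeStatus("dn5", cpu=0.6, memory=0.4, disk_io=0.2, disk_used=0.3),
    NodeStatus("dn6", cpu=0.3, memory=0.6, disk_io=0.8, disk_used=0.2),
]

write_nodes = choose_write_nodes(cluster, n=4)
read_nodes = choose_read_nodes(write_nodes, k=2)
print("write to:", [node.name for node in write_nodes])
print("read from:", [node.name for node in read_nodes])
```

In a running system the scores would be recomputed from fresh heartbeat data at read time, so the k nodes chosen for a read need not be the ones that looked least loaded when the file was written.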
Keywords/Search Tags:Cloud Storage, HDFS, Resource Monitoring, Load Balancing