Research Of Massive Data Storage And Management In Cluster Environment

Posted on:2011-09-11

Degree:Master

Type:Thesis

Country:China

Candidate:X Q Hu

GTID:2178360305478424

Subject:Computer software and theory

Abstract/Summary:

In fields such as oil exploration and remote sensing, individual data files frequently reach the terabyte (TB) scale. A single storage device often lacks the capacity to hold such a file, and the usual remedy is to add more storage devices. Many current technologies can present multiple disk arrays as one virtual disk to meet TB-level storage needs. This does not, however, eliminate the "edge" data problem in multi-device storage systems, in which the remaining capacity of one disk system can hold only part of a seismic data set. In addition, different kinds of storage devices (such as tape drives) use different storage methods, so massive data cannot be stored in a unified and efficient way; instead it has to be transcribed from one device to another, which lowers storage efficiency and hurts the efficiency of the enterprise. In a cluster environment, massive data storage also depends on efficient task scheduling between nodes: the more evenly resources are used, the shorter the response time of a job. A suitable inter-node task scheduling algorithm is therefore important for shortening the average response time, using node resources efficiently, and thereby improving massive data storage performance.

For these reasons, the relevant fields need a model for managing massive data in a cluster environment. In this model, storage devices of different media are accessed through unified storage, massive data is stored across disks and across media, and an efficient task scheduling algorithm reduces the average job response time and raises storage efficiency. This thesis proposes the corresponding across-disk storage method and a test scheme for the scheduling algorithm, and implements a prototype. The main research contents are as follows:

(1) Based on a study of specific massive data storage applications in oil companies, the thesis analyzes the storage characteristics of tape drives and other storage media. Using pipeline technology, the process mechanism, and the underlying I/O system calls, it shields the heterogeneity of the storage devices and proposes two sets of unified storage access interfaces, achieving unified storage across multiple kinds of storage media. The two solutions are compared in terms of data security and buffer size.

(2) To address the problem, encountered in oil companies, that a massive data file cannot be stored because a single storage device lacks capacity, the thesis reviews the current state of research on across-disk storage of massive data. Building on the underlying file I/O access interface, it proposes storage and access mechanisms for massive data across disks, including a set of low-level across-disk file read and write interfaces and a corresponding prototype system for across-disk configuration and operation. The prototype achieves across-disk storage of massive data, and the corresponding access interfaces are tested.

(3) Shortening the average job response time plays an important role in improving enterprise efficiency. The thesis analyzes the basic dynamic load-balancing algorithms and, combining them with Weighted Round-Robin, proposes a load-balancing scheduling algorithm. Experimental comparison shows that the algorithm has a short response time, requires few load-balancing operations, and incurs low overhead, ultimately improving working efficiency.
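
The abstract names Weighted Round-Robin as one ingredient of the proposed scheduling algorithm but does not give the algorithm itself. As a point of reference only, the following minimal Python sketch shows a standard "smooth" weighted round-robin selector, not the thesis's method. The node names, the weights, and the identifiers Node and pick_node are hypothetical; in a dynamic scheme the weights would presumably be recomputed from measured node load, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """A cluster node with a static weight (hypothetical illustration)."""
    name: str
    weight: int       # higher weight = larger share of tasks
    current: int = 0  # running score used by the smooth WRR selection


def pick_node(nodes: List[Node]) -> str:
    """Return the name of the node that should receive the next task.

    Smooth weighted round-robin: every node's score grows by its weight,
    the node with the highest score is chosen, and the chosen node's score
    is reduced by the sum of all weights. Over one full cycle each node is
    chosen exactly `weight` times, with the choices spread out in time.
    """
    total = sum(n.weight for n in nodes)
    for n in nodes:
        n.current += n.weight
    best = max(nodes, key=lambda n: n.current)  # ties go to the earliest node
    best.current -= total
    return best.name


if __name__ == "__main__":
    # Assumed weights: node "a" can take five tasks for each task on "b" or "c".
    cluster = [Node("a", 5), Node("b", 1), Node("c", 1)]
    print([pick_node(cluster) for _ in range(7)])
```

With the assumed weights 5, 1 and 1, one full cycle of seven tasks sends five tasks to node a and one each to b and c, interleaved rather than issued in a burst. A load-aware variant could adjust each weight between cycles, but how the thesis does so is not stated in the abstract.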

Keywords/Search Tags:

Mass storage, File operation, Pipeline, System call, Scheduling, Load Balancing

Related items

1	Research On Load Balancing Technology In Distributed File System
2	Research On Techniques Of Load Balancing In P2P File Storage System
3	Design And Implementation Of Multi-Tenancy File Storage System For SaaS Applications
4	Research On Small File Storage Technology For WEB Application
5	The Research And Implementation Of Load Balancing And Replica Consistency Technology Of KYLIN TIANJI Storage System
6	Research On Agricultural Data Cloud Storage Based On Distributed File System
7	The Storage Of Small Files In Distributed File System
8	Research On The Key Techniques For Parallel File Storage System
9	Based On The Hadoop Mass File Storage System Analysis And Design
10	Research And Implementation Of Cloud Storage System Based On Moose File System