Research On Data Storage Management Technology Of Science And Technology Cloud Platform

Posted on:2017-03-31

Degree:Master

Type:Thesis

Country:China

Candidate:Q Li

Full Text:PDF

GTID:2308330482488694

Subject:Software engineering

Abstract/Summary:

In recent years, China has continued to promote the cloud computing industry and its integration with specific sectors. Hadoop, one of the best-known open source cloud computing frameworks, has found particular favor, and many enterprises are developing on it. The national science and technology management system will also be built on cloud computing technology, to ensure highly available data storage and convenient elastic expansion of storage space and computing capacity in the future. We undertook the design and development of the similarity check system for science and technology data. It uses MapReduce on the Hadoop platform to parallelize the full-text comparison of projects, and all file data is stored on the Hadoop Distributed File System (HDFS). The cluster combines legacy devices with newly purchased ones in order to make full use of the old hardware, and these devices differ considerably in storage performance, computing performance, and I/O performance. In actual operation, we found that uneven distribution of data blocks slows down MapReduce jobs, which in turn slows the response of the Hadoop cluster. Because the default rack-aware storage policy of HDFS does not consider differences in node performance, frequently accessed data may be stored on low-performance nodes while infrequently accessed data is more likely to be stored on high-performance nodes, which lengthens cluster response time and reduces resource utilization. To solve these problems, our team proposes a hierarchical storage scheduling mechanism.
Building on the HDFS rack-aware scheduling policy, the mechanism first divides nodes into high-configuration and low-configuration nodes according to inherent hardware characteristics such as CPU, memory size, disk size, and disk I/O. Second, it establishes a node performance evaluation model from dynamic factors such as CPU usage, memory usage, network bandwidth usage, and disk usage, and defines three performance levels, p1, p2, and p3 (from high to low), to rate each node. Scheduling then integrates node configuration, performance level, network location, and other factors. While the cluster is running, the distribution of data blocks is adjusted dynamically according to data access frequency: high-frequency data is stored on high-configuration nodes with high processing performance to improve the cluster's response rate, and low-frequency data is removed from high-configuration nodes to save space. Applied to replica placement in the similarity check system for science and technology data, which is used to find duplicated content, the improved hierarchical storage scheduling strategy improved computation time by 6%.
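
The scheduling idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Node` fields, the hardware thresholds, the load weights, the p1/p2/p3 cut-offs, and the hot-data threshold are all hypothetical values chosen for illustration, and the rack rule is a simplified stand-in for the HDFS rack-aware policy.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    rack: str
    cpu_cores: int        # static hardware factors
    memory_gb: int
    cpu_usage: float      # dynamic factors, each in the range 0..1
    memory_usage: float
    network_usage: float
    disk_usage: float

# All constants below are assumptions, not values from the thesis.
HIGH_CONFIG_MIN_CORES = 16
HIGH_CONFIG_MIN_MEMORY_GB = 64
HOT_ACCESS_THRESHOLD = 100  # accesses per observation window

def is_high_config(node):
    """Step 1: split nodes by inherent hardware configuration."""
    return (node.cpu_cores >= HIGH_CONFIG_MIN_CORES
            and node.memory_gb >= HIGH_CONFIG_MIN_MEMORY_GB)

def load_score(node):
    """Step 2: weighted load from dynamic factors (lower is better)."""
    return (0.4 * node.cpu_usage + 0.3 * node.memory_usage
            + 0.2 * node.network_usage + 0.1 * node.disk_usage)

def performance_level(node):
    """Map the load score to the levels p1 (best), p2, p3 (worst)."""
    score = load_score(node)
    if score < 0.4:
        return "p1"
    if score < 0.7:
        return "p2"
    return "p3"

def is_hot(access_count):
    """Classify a block as high-frequency data."""
    return access_count >= HOT_ACCESS_THRESHOLD

def rank_nodes(nodes, hot):
    """Hot blocks prefer high-configuration nodes, cold blocks the opposite;
    ties are broken by performance level and then by load score."""
    level_rank = {"p1": 0, "p2": 1, "p3": 2}
    def key(node):
        config_rank = 0 if is_high_config(node) == hot else 1
        return (config_rank, level_rank[performance_level(node)], load_score(node))
    return sorted(nodes, key=key)

def choose_replica_nodes(nodes, hot, replicas=3):
    """Pick the best node, then the best node on another rack, then fill up."""
    ranked = rank_nodes(nodes, hot)
    if not ranked:
        return []
    first = ranked[0]
    off_rack = [n for n in ranked if n.rack != first.rack]
    chosen = [first] + off_rack[:1]
    for node in ranked:
        if len(chosen) >= replicas:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen[:replicas]
```

Calling `choose_replica_nodes(nodes, hot=is_hot(count))` again as access counts change gives a simple form of the dynamic adjustment described above: a block that becomes hot is ranked toward high-configuration p1 nodes, and a block that cools down is ranked toward low-configuration nodes.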

Keywords/Search Tags:

Cloud storage, HDFS, Heterogeneous cluster, Hierarchical storage, Storage scheduling

Related items

1	The Design And Implementation Of Cloud Storage System Based On HDFS
2	The Implementation And Optimization Of Cloud Storage System Based On HDFS
3	Research And Design Of Cloud Storage System For User Data Security
4	Research Of Data Storage And Management On Huatu Online Library System Based On HDFS
5	Key Technologies Research In Cloud Storage
6	Research On Key Technology Of Cloud Storage Based On HDFS
7	Based On HDFS Application Of The Enterprise Information System In Cloud Storage Platform
8	A Study On Secure Cloud Storage Model Based On HDFS
9	Research Of Data Synchronization On Storage Technology Based On Cloud Storage And P2P
10	Research And Design On Hadoop-based Cloud Storage Platform Of New Campus