Research And Implementation Of Storage Policy Of Hybrid Distributed Storage System

Posted on: 2018-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: C X Wu
GTID: 2348330542468907
Subject: Computer Science and Technology
Abstract/Summary:
With the continuous development of the space industry, the number of artificial satellite launches has increased dramatically. Artificial satellites continuously detect and monitor the Earth and the space environment, generating massive amounts of data. Existing satellite data storage systems need to send satellite data back to a ground station for storage and processing. This approach has low communication efficiency and poor real-time performance, and it cannot meet growing data storage and processing needs. Because satellite data is huge in volume and diverse in form and structure, multiple storage systems have been required to store different data separately, which makes operation very complex. Therefore, it is necessary to establish a hybrid distributed data store on a geostationary Earth orbit satellite cluster to store data of various types and structures and to provide unified data services for applications on the satellites.

Based on research into existing storage systems, a hybrid distributed storage system is proposed that combines several forms of storage (relational database, non-relational database and distributed file system) and provides unified operations for structured, semi-structured and unstructured data. A hybrid storage management model is designed that uses a data type management service to store data types with different data structures and provides unified client interfaces through which upper applications perform data operations. The model acts as an intermediate layer between the storage systems and upper applications: it converts between the client interfaces and the actual interfaces of the storage systems, so that it provides data services equivalent to those of the multiple underlying storage systems.

Distributed file systems use replicas for redundant data backup, which improves fault tolerance and balances load. According to many studies, files vary in query frequency, so there are hot files and cold files. Existing replication strategies either fail to make good use of file heat or occupy too much storage space, which makes them unsuitable for the satellite environment. In this thesis, a dynamic replication strategy is designed that determines the heat of a file from its query frequency during an observation period and dynamically adjusts the number of replicas of the file according to that heat. Since communication between satellite sites is slower than communication within a satellite site, a replica location selection strategy is designed that considers both the distance between nodes and the load on each node.

A prototype of the hybrid distributed storage system is designed and implemented on top of the relational database MySQL, the non-relational database MongoDB and the distributed file system HDFS. The equivalence between the prototype and each underlying storage system is verified, and functional tests are performed. The dynamic replication strategy is implemented on HDFS. Experiments in a real environment and in simulation verify that, compared with the static strategy of HDFS and the dynamic strategy Scarlett, the dynamic strategy proposed in this thesis improves file query speed and reduces storage space occupation.
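
The abstract does not give the thresholds or formulas behind the dynamic replication strategy, so the following Python sketch only illustrates the general idea: file heat is taken as the query count in one observation period, the replica count follows the heat, and replica locations are ranked by a cost that combines distance and node load. Every name and number here (`target_replicas`, `placement_cost`, `choose_nodes`, the hot and cold thresholds, the replica counts and the weights) is a hypothetical choice for illustration, not the thesis's actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    site: str    # satellite site the node belongs to
    load: float  # current load, assumed normalised to [0, 1]


def target_replicas(queries: int, hot: int = 100, cold: int = 10,
                    min_r: int = 2, base_r: int = 3, max_r: int = 5) -> int:
    """Map the query count of one observation period to a replica count.

    The thresholds and replica counts are illustrative assumptions.
    """
    if queries >= hot:
        return max_r   # hot file: more replicas to spread read load
    if queries <= cold:
        return min_r   # cold file: fewer replicas to save storage space
    return base_r      # otherwise keep a default replica count


def placement_cost(node: Node, reader_site: str,
                   inter_site_penalty: float = 1.0,
                   load_weight: float = 1.0) -> float:
    """Cost of placing a replica on `node` (lower is better).

    Distance is modelled crudely: zero within the reader's site, a fixed
    penalty across sites, reflecting slower inter-satellite links.
    """
    distance = 0.0 if node.site == reader_site else inter_site_penalty
    return distance + load_weight * node.load


def choose_nodes(nodes: list[Node], reader_site: str, count: int) -> list[Node]:
    """Pick the `count` cheapest nodes; ties are broken by node name."""
    ranked = sorted(nodes, key=lambda n: (placement_cost(n, reader_site), n.name))
    return ranked[:count]


if __name__ == "__main__":
    cluster = [
        Node("a", "sat-1", 0.9),
        Node("b", "sat-1", 0.2),
        Node("c", "sat-2", 0.1),
        Node("d", "sat-2", 0.95),
    ]
    wanted = target_replicas(queries=500)
    chosen = choose_nodes(cluster, "sat-1", min(wanted, len(cluster)))
    print(wanted, [n.name for n in chosen])
```

In this sketch a lightly loaded node on a remote site can still lose to a busier local node, because the fixed inter-site penalty dominates; how the two factors are actually weighted in the thesis is not stated in the abstract.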
Keywords/Search Tags: hybrid distributed storage, unified interface, dynamic replication strategy