| The Meteorological Cloud Platform is currently one of the most active research topics in meteorology, although it remains at an exploratory stage. It aims to process, store, query, analyze, and compute statistics over massive meteorological data in real time, and to provide intelligent, linearly scalable data-processing capacity through distributed storage and parallel computing. Meteorological data is an important resource for meteorological services and scientific research, so guaranteeing data availability and reliability is a major research topic in meteorological cloud storage. Data replication is an important technique for fault tolerance in meteorological cloud storage: it not only prevents access failures and data loss, but also reduces network bandwidth consumption and improves access efficiency. This work studies the replica management strategy of HDFS, identifies its shortcomings, and proposes new replica management strategies for the initial placement stage and the adjustment stage. The details and results are as follows. (1) A load-based replica placement strategy. First, the number of replicas is determined from the time encoded in the meteorological file name; then the DataNodes on which to place the replicas are chosen according to DataNode load. The load is given by an evaluation function that combines several factors affecting DataNode load balancing, namely I/O access rate, CPU utilization, memory utilization, failure rate, and capacity utilization, with the weight of each factor determined by the Analytic Hierarchy Process. Compared with the replica management strategy of HDFS, this strategy achieves better load balancing. (2) An access-prediction-based replica adjustment strategy. First, the work describes a method for collecting file popularity statistics, and then analyzes the access pattern of meteorological files, which is periodic.
Next, a back-propagation (BP) neural network is used to forecast accesses to hot meteorological files. Finally, highly popular files are identified according to their access frequency, and the strategy decides whether a file needs additional replicas and, according to DataNode load, how many to add. Compared with the replica management strategy of HDFS, this strategy dynamically adjusts the number and placement of replicas as file access patterns change in the cloud environment. It reduces access latency, improves data access efficiency, and maintains system load balance. |
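
The load evaluation function in strategy (1) can be illustrated with a minimal Python sketch. It assumes the function is a plain weighted sum of the five factors, each normalized to [0, 1], and that replicas go to the least-loaded DataNodes. The abstract gives neither the functional form nor the AHP-derived weights, so the weight values, the `DataNode` class, and the `load_score` and `choose_datanodes` helpers are all hypothetical. Rack awareness, which stock HDFS placement takes into account, is ignored here.

```python
from dataclasses import dataclass

# Illustrative weights only: the abstract says the weights come from the
# Analytic Hierarchy Process but does not list them. These sum to 1.
WEIGHTS = {
    "io_rate": 0.30,
    "cpu_used": 0.25,
    "mem_used": 0.20,
    "fail_rate": 0.15,
    "capacity_used": 0.10,
}


@dataclass(frozen=True)
class DataNode:
    """Hypothetical snapshot of one DataNode; all metrics are in [0, 1]."""
    name: str
    io_rate: float
    cpu_used: float
    mem_used: float
    fail_rate: float
    capacity_used: float


def load_score(node, weights=WEIGHTS):
    """Weighted sum of the five load factors; lower means less loaded."""
    return sum(w * getattr(node, metric) for metric, w in weights.items())


def choose_datanodes(nodes, replica_count):
    """Return the replica_count least-loaded nodes (ties broken by name)."""
    if replica_count > len(nodes):
        raise ValueError("not enough DataNodes for the requested replicas")
    ranked = sorted(nodes, key=lambda n: (load_score(n), n.name))
    return ranked[:replica_count]


if __name__ == "__main__":
    cluster = [
        DataNode("dn1", 0.9, 0.8, 0.7, 0.1, 0.6),
        DataNode("dn2", 0.2, 0.3, 0.4, 0.0, 0.5),
        DataNode("dn3", 0.5, 0.5, 0.5, 0.2, 0.2),
        DataNode("dn4", 0.1, 0.2, 0.1, 0.0, 0.9),
    ]
    for node in choose_datanodes(cluster, 3):
        print(node.name, round(load_score(node), 3))
```

The same score could plausibly serve the adjustment stage in strategy (2), where the load of candidate DataNodes limits how many extra replicas a hot file receives, but the abstract does not specify that rule.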