Research Of Metadata Replica Technology Based On Access Heat Classification

Posted on: 2017-05-12 | Degree: Master | Type: Thesis
Country: China | Candidate: J Li | Full Text: PDF
GTID: 2348330503989812 | Subject: Computer Science and Technology
Abstract/Summary:
In a distributed file system, the metadata server is the core of the whole system, and replication is one of the main ways to improve system availability and performance. Existing metadata replica technologies simply copy and store the metadata. They ignore the fact that metadata access exhibits features such as locality and correlation, and they can no longer meet the low-latency requirements of the big data era. It is therefore important to propose a new metadata replica technology that builds on the existing ones.

This paper analyzes and summarizes the access behavior of metadata and proposes a metadata replica generation scheme based on access heat classification. First, the k-means clustering algorithm groups together metadata with similar access heat. The metadata server then generates a metadata replica from each group and sends the replicas to the other metadata servers. When a client sends a request to the metadata server again, the server can return the replica that contains the requested metadata. For subsequent lookups, the client does not need to interact with the metadata server and can find the metadata directly in the replica. This greatly reduces the response time of the metadata server and improves system performance. In addition, the metadata server continually adjusts and updates the replicas according to their hit rate.

Comparison tests of the metadata replica generation scheme lead to the following conclusions. The metadata replica technology based on access heat classification significantly improves the metadata read rate. As the file directory grows, a system that uses the clustering algorithm responds faster than one that does not. A system with two metadata servers responds faster than a system with only one metadata replica server.
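The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes that access heat is simply a per-path access count, and the names `kmeans_1d` and `build_replicas` are hypothetical. The abstract does not say which features or initialisation the author used, so the sketch uses plain one-dimensional k-means with a deterministic start.

```python
from typing import Dict, List, Optional


def kmeans_1d(values: List[float], k: int, iters: int = 50) -> List[int]:
    """Cluster scalar values with Lloyd's k-means; return one label per value."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation (an assumption): spread the starting
    # centroids evenly across the sorted values.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels: Optional[List[int]] = None
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clustering converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels if labels is not None else [0] * len(values)


def build_replicas(access_counts: Dict[str, int], k: int) -> List[List[str]]:
    """Group metadata paths with similar access heat; hottest group first."""
    paths = list(access_counts)
    labels = kmeans_1d([float(access_counts[p]) for p in paths], k)
    groups: List[List[str]] = [[] for _ in range(k)]
    for path, label in zip(paths, labels):
        groups[label].append(path)
    groups = [g for g in groups if g]  # drop clusters that ended up empty
    groups.sort(key=lambda g: -sum(access_counts[p] for p in g) / len(g))
    return groups


# Hypothetical access counts for six metadata paths.
counts = {"/a": 100, "/b": 95, "/c": 3, "/d": 5, "/e": 98, "/f": 1}
replicas = build_replicas(counts, k=2)
print(replicas)
```

Each returned group stands for the contents of one candidate replica. In a scheme like the one described, the server could then track the hit rate per replica and recluster when it drops.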
Keywords/Search Tags: distributed file system, metadata replica, response time, classification