Research On Metadata Access Technology Of Distributed File System

Posted on: 2017-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: L X Xie
GTID: 2348330485481660
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and changing consumer habits, Internet applications have penetrated people's daily lives. For Internet companies this has brought not only a surge in users and interest but also the problems of highly concurrent file access and the need to store large numbers of small files. Distributed file systems provide a good platform for storing large numbers of files, but most mainstream distributed file systems adopt a centralized metadata storage structure and are designed mainly for large files. At present, many websites need to store large numbers of image files, most of them smaller than 1 MB, and mainstream distributed file systems are not good at storing small files or at searching large arrays quickly. Large websites also receive a very high volume of file requests, and a centralized structure struggles to meet the demands of highly concurrent resource requests. Distributed file systems therefore have two shortcomings when used in large websites: (1) small files are stored and searched inefficiently; (2) the centralized structure has difficulty supporting highly concurrent access. Current research on optimizing small-file storage focuses mainly on two directions: optimizing metadata storage and optimizing data file storage. Because the high-concurrency problem is related to the metadata storage structure, this paper focuses on metadata storage. The main research contents of this paper are as follows:

(1) An ordered hash table is proposed for storage. A directory in a distributed file system can contain tens of thousands of subdirectories, which makes retrieval slow, so this paper constructs an ordered hash table to solve the problem. The experimental results show that the ordered hash table improves retrieval speed over a dynamic array by 99.93%.

(2) A distributed tree structure is proposed. The directory system is built on a distributed directory tree that can be stored across several different servers while keeping all the data in one logical tree structure. Small-file storage consumes too much memory on the metadata server, whose memory is limited. The distributed tree structure increases storage capacity by adding metadata servers, so more small files can be stored. Compared with HDFS, the distributed directory tree uses 16.4% more storage space, but retrieval speed increases by 73.21%.

(3) HTTP is proposed as the service protocol. Most distributed file systems use RPC as the interface between servers, so a browser can reach the service only through certain specific servers. The experiments show that HTTP makes better use of the browser cache and reduces the number of requests. In this paper, multiple copies of the metadata are created and saved to different servers. This not only improves the reliability of the system but also lets the browser access the replica servers, which reduces the number of requests to the origin data server and lowers its load. The results show that the number of requests decreased significantly in some cases after adopting HTTP, with the reduction reaching 35.5%.
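The abstract does not describe how the ordered hash table is built, so the following is only a minimal sketch of one plausible reading of contribution (1): a hash index for fast lookup by name, paired with a sorted list of names so that entries can still be listed in order. The class `OrderedHashDirectory`, its methods, and the baseline `linear_lookup` are hypothetical names chosen for illustration and are not taken from the thesis.

```python
import bisect


class OrderedHashDirectory:
    """Hypothetical directory index: a dict for lookup by name plus a
    sorted list of names for ordered listing. An assumption for
    illustration, not the thesis's actual data structure."""

    def __init__(self):
        self._index = {}   # subdirectory name -> metadata
        self._names = []   # subdirectory names, kept sorted

    def add(self, name, meta):
        # Insert the name into the sorted list only if it is new.
        if name not in self._index:
            bisect.insort(self._names, name)
        self._index[name] = meta

    def lookup(self, name):
        # Average-case constant-time lookup through the hash index.
        return self._index.get(name)

    def remove(self, name):
        if name in self._index:
            del self._index[name]
            pos = bisect.bisect_left(self._names, name)
            del self._names[pos]

    def list_range(self, start, end):
        # Names n with start <= n < end, found by binary search.
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_left(self._names, end)
        return self._names[lo:hi]


def linear_lookup(entries, name):
    """Baseline: scan a dynamic array of (name, meta) pairs."""
    for entry_name, meta in entries:
        if entry_name == name:
            return meta
    return None
```

With tens of thousands of subdirectories, `lookup` avoids the full scan that `linear_lookup` performs, which is the kind of gap the reported comparison against a dynamic array suggests. The figures in the abstract come from the thesis's own experiments and are not reproduced by this sketch.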
Keywords/Search Tags:small file storage, high concurrent access, ordered hash table, HTTP, distributed directory tree