| In recent years, with the rapid development of the Internet, cloud computing, and big data, the explosive growth of global data has posed major challenges for storage systems. Distributed file systems offer a way to store such massive data, and they generally perform well on large files. For the growing number of small files, however, performance is poor because of the low throughput of the metadata servers and low network bandwidth utilization. The Data Storage and Application Laboratory has independently developed a distributed file system called Cappella. A detailed analysis of the small-file access process in Cappella shows that the most time-consuming step is disk access on the object storage server. Based on the server-cluster architecture of the distributed file system, an optimization strategy for massive numbers of small files is proposed. First, the scheme simplifies the complex access procedure for small files, which improves small-file access performance. Second, small-file data and metadata are stored together on the metadata server, which reduces the number of disk accesses needed to access a file. Third, the I/O path for small-file access is shortened, and data is flushed back to the servers in batches, which also improves small-file write performance. Finally, caching and prefetching are used in the Cappella client to improve performance on large numbers of small files, and a callback mechanism ensures consistency under concurrent access by multiple clients. mdtest is used to measure the metadata IOPS of the distributed file systems Cappella and Lustre, PostMark is used to test small-file performance, and IOzone is used to test large-file performance. The results show that metadata IOPS improves by nearly a factor of two. When the test files are all small files, small-file read performance improves by about 36.26% to 100.80%, and write performance improves by about 36.03% to 100.93%. Compared with Lustre, read and write performance for small files is also improved, as is performance for large files. |
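
The second and third optimizations (keeping small-file data with the metadata, and flushing writes in batches) can be illustrated with a minimal sketch. It is not Cappella's implementation: the class names, the 64 KiB cutoff, the batch size, and the simplification of routing large-file reads through the metadata server are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

SMALL_FILE_LIMIT = 64 * 1024  # hypothetical cutoff, not taken from the paper


@dataclass
class Inode:
    size: int
    inline_data: Optional[bytes] = None  # payload kept with the metadata (small files)
    object_id: Optional[int] = None      # pointer into the object store (large files)


class ObjectStore:
    """Stand-in for an object storage server."""

    def __init__(self) -> None:
        self.objects: Dict[int, bytes] = {}

    def put(self, data: bytes) -> int:
        object_id = len(self.objects)
        self.objects[object_id] = data
        return object_id

    def get(self, object_id: int) -> bytes:
        return self.objects[object_id]


class MetadataServer:
    """Stand-in for a metadata server that inlines small-file data."""

    def __init__(self, store: ObjectStore) -> None:
        self.store = store
        self.inodes: Dict[str, Inode] = {}
        self.requests = 0  # client requests served so far

    def write_batch(self, files: List[Tuple[str, bytes]]) -> None:
        self.requests += 1  # the whole batch arrives as one request
        for path, data in files:
            if len(data) <= SMALL_FILE_LIMIT:
                self.inodes[path] = Inode(len(data), inline_data=data)
            else:
                self.inodes[path] = Inode(len(data), object_id=self.store.put(data))

    def read(self, path: str) -> Tuple[bytes, int]:
        """Return (data, number of servers involved in the read)."""
        self.requests += 1
        inode = self.inodes[path]
        if inode.inline_data is not None:
            return inode.inline_data, 1  # served from the metadata record alone
        return self.store.get(inode.object_id), 2  # metadata lookup plus object fetch


class BatchingClient:
    """Buffers writes and flushes them to the server in batches."""

    def __init__(self, mds: MetadataServer, batch_size: int = 4) -> None:
        self.mds = mds
        self.batch_size = batch_size
        self.pending: List[Tuple[str, bytes]] = []

    def write(self, path: str, data: bytes) -> None:
        self.pending.append((path, data))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.pending:
            self.mds.write_batch(self.pending)
            self.pending = []
```

In this sketch, writing four small files through `BatchingClient` costs one server request instead of four, and reading any of them back involves only the metadata server, which mirrors the shortened I/O path described above.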