With the development of computer technology, cloud computing has gradually become a focus of attention. Cloud storage depends largely on Hadoop, a distributed platform developed by Apache that shows great advantages in cloud computing and distributed storage, and large enterprises have done extensive research on applying Hadoop to cloud storage. HDFS, the distributed file system of the Hadoop platform, has powerful data storage capacity and scalability. However, because of the diversity of business requirements and differences in data formats, HDFS has some design defects and needs to be optimized before it can be applied to specific scenarios. The main work of this paper is to build a general cloud storage system and to solve the problem that HDFS, within such a system, consumes a large amount of memory and is inefficient when storing large numbers of small files. When an HDFS-based cloud storage system stores small files, it distributes them across different data nodes, and each file occupies a data block of its own. This strategy leads to serious memory consumption and low access efficiency. After comparing and analyzing a variety of small-file storage methods, this paper proposes a dedicated processing scheme for small-file storage: small files are first merged, and their index information is saved as key-value pairs, which reduces memory consumption and improves access efficiency. A Web program is developed and the cloud storage system is built on top of Hadoop. The improved small-file storage method is then compared with the original one in a large number of tests. The test results show that the improved cloud storage system is more efficient when storing small files. The system designed in this paper is based on HDFS and improves the storage mode of small files; as a result, it improves NameNode memory utilization and file access efficiency. These results provide reference data for further research on cloud storage systems based on HDFS.
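
The merge-and-index idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the in-memory byte buffer standing in for a merged HDFS file, and the (offset, length) index layout are all assumptions chosen for illustration. It only shows how many small files can share one stored object while a key-value index keeps each of them individually retrievable.

```python
import io
from typing import Dict, Tuple

# Hypothetical index layout: file name -> (offset, length) inside the merged blob.
Index = Dict[str, Tuple[int, int]]


def merge_small_files(files: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate small files into one blob and record where each one starts."""
    buffer = io.BytesIO()
    index: Index = {}
    for name, data in files.items():
        # The current write position is the offset of this file in the blob.
        index[name] = (buffer.tell(), len(data))
        buffer.write(data)
    return buffer.getvalue(), index


def read_small_file(merged: bytes, index: Index, name: str) -> bytes:
    """Look up a file in the key-value index and slice it out of the blob."""
    offset, length = index[name]
    return merged[offset:offset + length]


# Three small files become one stored object plus three index entries.
small_files = {"a.txt": b"hello", "b.txt": b"cloud", "c.txt": b"storage"}
merged, index = merge_small_files(small_files)
restored = read_small_file(merged, index, "b.txt")
```

In this sketch the storage layer would track a single merged object instead of three separate ones, and a read costs one index lookup followed by one ranged read. Where the index is kept and how merged files are sized are design decisions the sketch does not address.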