The Design And Implementation Of Small Object Storage System With Multi Read And Multi Write

Posted on: 2022-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: X B Wu
Full Text: PDF
GTID: 2518306524493414
Subject: Computer technology
Abstract/Summary:
As AI technology develops, its demand for data keeps growing. In scenarios with massive numbers of small files, however, existing distributed storage systems cannot meet the needs of AI training. The GPUs used for training are expensive, and if the storage system cannot deliver matching I/O throughput, their computing power goes underused and valuable computing resources are wasted. This thesis therefore designs and implements a multi-read, multi-write small object storage system for the AI training process, targeting the massive-small-file scenario. The specific research contents are as follows:

(1) To address the write performance of small objects and the need for range queries, this thesis builds on the existing key/value-separated LSM-tree scheme to improve small-object write performance, and redesigns the garbage collection process so that data stays ordered and range queries remain possible. It also defines high and low water levels for garbage collection, which measure space utilization and are used to decide when to collect. This improves the payoff of each collection, delays unnecessary collection, and avoids disrupting the system's normal service.

(2) To address the data distribution of small objects, this thesis designs an automatic hash slot method based on the existing hash slot method, removing the need to allocate hash slots manually. The method performs automatic initial scheduling at startup and rebalancing when the load becomes unbalanced or a node goes down.

(3) To address the read performance and metadata storage of small objects, this thesis uses the automatic hash slot method to spread small-object metadata across all nodes. This avoids the scalability limit that comes from managing all small-object metadata on a single node, and the metadata is used to make data reads more efficient.

(4) To address the problems of using a cloud storage platform in this system, this thesis designs a delayed confirmation mechanism that improves the efficiency of data storage and of client data transmission. In addition, by relying on data cached at the client, it removes the need for internal nodes to migrate data when a partition migrates, which simplifies the partition migration process.

(5) To address the single point of failure in the storage system, this thesis adopts an architecture with one master and multiple slaves. The master node manages resources and schedules tasks, and the slave nodes act as standbys for the master. To reduce the load on the master node, the master synchronizes key metadata to the slave nodes in real time through the cloud storage platform, so that the slave nodes take over the distribution of key metadata and divert client traffic away from the master.
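
The abstract does not give the exact rule behind the high and low water levels in item (1), so the following Python sketch is only one plausible reading. It assumes that space utilization is used bytes divided by capacity, that no collection runs below the low water level, that only segments with a high garbage ratio are collected between the two levels, and that every segment containing garbage is collected above the high water level. The names `Segment`, `pick_gc_segments`, and the three threshold constants are hypothetical and do not come from the thesis. The sketch covers only the trigger decision, not the redesigned collection process that keeps data ordered.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical value-log segment: its size and the bytes no longer referenced."""
    total_bytes: int
    garbage_bytes: int

    @property
    def garbage_ratio(self) -> float:
        return self.garbage_bytes / self.total_bytes if self.total_bytes else 0.0


# Assumed thresholds; the thesis does not state concrete values.
LOW_WATER = 0.60          # below this utilization, garbage collection is postponed
HIGH_WATER = 0.85         # at or above this utilization, space must be reclaimed
MIN_GARBAGE_RATIO = 0.50  # between the levels, collect only high-payoff segments


def pick_gc_segments(segments, used_bytes, capacity_bytes):
    """Return the segments to collect, ordered by descending garbage ratio."""
    utilization = used_bytes / capacity_bytes
    if utilization < LOW_WATER:
        # Plenty of free space: delay collection to protect foreground I/O.
        return []
    by_gain = sorted(segments, key=lambda s: s.garbage_ratio, reverse=True)
    if utilization >= HIGH_WATER:
        # Space is tight: collect every segment that holds any garbage.
        return [s for s in by_gain if s.garbage_bytes > 0]
    # Between the levels: collect only where the reclaimed share is large.
    return [s for s in by_gain if s.garbage_ratio >= MIN_GARBAGE_RATIO]
```

With thresholds like these, a mostly empty store never pays for collection, while a nearly full one reclaims space even from segments with a low payoff.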
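
The automatic hash slot method in item (2) can be illustrated in the same hedged way. The sketch below assumes a fixed slot count of 1024, CRC32 as the key hash, round-robin initial assignment, and a "move each orphaned slot to the least-loaded survivor" rule when a node goes down. None of these choices is stated in the abstract, and the function names are hypothetical.

```python
import zlib
from collections import Counter

NUM_SLOTS = 1024  # assumed slot count; the thesis does not give one


def slot_for_key(key: str) -> int:
    """Map an object key to a hash slot (CRC32 is an assumed hash function)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SLOTS


def initial_assignment(nodes):
    """Spread all slots over the nodes round-robin, with no manual allocation."""
    return {slot: nodes[slot % len(nodes)] for slot in range(NUM_SLOTS)}


def reassign_after_failure(assignment, failed_node):
    """Move the failed node's slots to whichever survivor currently owns the fewest.

    Survivors are taken to be the nodes that already own at least one slot.
    """
    load = Counter(n for n in assignment.values() if n != failed_node)
    if not load:
        raise RuntimeError("no surviving node owns any slot")
    new_assignment = dict(assignment)
    for slot, node in assignment.items():
        if node == failed_node:
            # Ties are broken by node name so the result is deterministic.
            target = min(load, key=lambda n: (load[n], n))
            new_assignment[slot] = target
            load[target] += 1
    return new_assignment
```

A client would call `slot_for_key` and look the slot up in the current assignment to find the node holding an object and, per item (3), its metadata.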
Keywords/Search Tags: small objects, distributed storage, LSM-Tree, high availability