Research on Distributed Storage of Massive Air Logistics Data Based on NoSQL

Posted on: 2018-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: F G Zheng
GTID: 2348330533960204
Subject: Computer Science and Technology
Abstract/Summary:
The aviation logistics information system is an important system in civil aviation. It handles frequent data exchange and data storage, and most of its data exists as massive numbers of small files. Because of their variety and scale, storing these small files has become a difficult problem in industry. Most companies currently store data in the Hadoop Distributed File System (HDFS), but HDFS is designed for large files. When storing small files, it suffers from low storage efficiency, a large memory footprint, a single point of failure, and low disk space utilization. It is therefore important to study distributed storage methods for massive small files in aviation logistics. First, after analyzing the problems HDFS has with massive small files and the characteristics of key aviation logistics data, this paper proposes an optimized storage method for small aviation logistics files. The method comprises a small-file merging process, a prefetching mechanism, and a persistence method. It uses the correlation between files to improve storage and access efficiency, and the temporary nature of files to improve disk space utilization. Second, by combining these optimized storage methods, a distributed multi-level storage architecture for massive aviation logistics data is proposed based on NoSQL. Redis serves as the cache level, and HDFS serves as the active and permanent levels. Writing, access, and persistence are independent of one another, which reduces coupling between modules. Redis is deployed on the HDFS DataNodes to form a cluster, which relieves single-point memory pressure on the HDFS NameNode, improves concurrency and index retrieval speed, and keeps the metadata service scalable. Finally, distributed storage experiments on massive aviation logistics data show that small-file access efficiency improves markedly: under sequential access, the access time is about 160 milliseconds, and disk space occupancy is reduced by up to 79%.
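The merging and prefetching ideas summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes that correlated small files are packed back to back into one merged block, that an index maps each file name to an offset and length, and that a cache miss prefetches every file in the same block. The plain dictionaries stand in for the Redis index and cache, the bytearray stands in for a large HDFS file, and the class names and file names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MergedBlock:
    """One large container holding many small files back to back.

    The bytearray is a stand-in for a large HDFS file (assumption).
    """
    data: bytearray = field(default_factory=bytearray)
    # file name -> (offset, length); stand-in for an index kept in Redis
    index: dict = field(default_factory=dict)

    def append(self, name: str, payload: bytes) -> None:
        # Record where the payload starts before appending it.
        self.index[name] = (len(self.data), len(payload))
        self.data.extend(payload)

    def read(self, name: str) -> bytes:
        offset, length = self.index[name]
        return bytes(self.data[offset:offset + length])


class CachedStore:
    """Cache level in front of a merged block, with block-wide prefetching."""

    def __init__(self, block: MergedBlock) -> None:
        self.block = block
        self.cache: dict = {}   # stand-in for the Redis cache level
        self.block_reads = 0    # how often the merged block was opened

    def get(self, name: str) -> bytes:
        if name not in self.cache:
            # Cache miss: open the block once and prefetch every file in it,
            # assuming files merged together are likely to be read together.
            self.block_reads += 1
            for other in self.block.index:
                self.cache[other] = self.block.read(other)
        return self.cache[name]


# Hypothetical small files belonging to the same shipment.
block = MergedBlock()
block.append("awb-001/manifest.xml", b"<manifest/>")
block.append("awb-001/status.xml", b"<status/>")

store = CachedStore(block)
first = store.get("awb-001/manifest.xml")   # miss: reads and prefetches the block
second = store.get("awb-001/status.xml")    # hit: served from the cache
```

In this sketch the second read never touches the merged block, which is the effect a prefetching mechanism aims for under sequential access to correlated files.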
Keywords/Search Tags: Aviation Logistics, Massive Small Files, NoSQL, Redis, Multi-level Storage, Distributed Storage