
The Research And Implementation Of Storing Massive Small Files In Cloud Storage

Posted on: 2016-09-13
Degree: Master
Type: Thesis
Country: China
Candidate: C Qi
Full Text: PDF
GTID: 2308330461957112
Subject: Control Engineering

Abstract/Summary:
The arrival of the big-data era has caused an explosive growth in the information and data humans generate, especially massive numbers of small files such as pictures, e-mails, and electronic documents. Traditional storage technology, designed around large files, can no longer meet these demands, so efficiently storing massive small files has become a pressing problem in the field. Big data and cloud computing go hand in hand: cloud storage, a development and extension of the cloud computing concept, offers fast response, efficient management, and a flexible architecture, and has become an important worldwide solution to explosive data growth. Processing massive small files in the cloud is therefore an important research topic. This thesis analyzes how the Hadoop Distributed File System (HDFS), currently the most popular open-source cloud storage platform, works, focusing on its advantages of high fault tolerance, scalability, and relatively low cost. In light of current application environments and requirements, it examines the problem of storing massive small files on this platform: because Hadoop uses a master-slave architecture in which the NameNode keeps all metadata in memory, accessing massive numbers of small files leads to both long access times and excessive NameNode memory consumption. To solve this problem, existing small-file handling methods were studied and their advantages and disadvantages analyzed, and a general solution for small files in a cloud environment is proposed: an independent small-file processing module for the distributed file system.
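The NameNode memory pressure described above can be illustrated with a back-of-the-envelope calculation. The sketch below uses the commonly cited Hadoop rule of thumb of roughly 150 bytes of NameNode heap per file or block object; the exact figure varies by version and is an assumption here, not a number from the thesis.

```python
# Rough estimate of NameNode heap consumed by file metadata.
# Assumes the widely quoted rule of thumb of ~150 bytes of heap per
# file or block object; actual usage depends on the Hadoop version.

BYTES_PER_OBJECT = 150  # rule-of-thumb figure, not an exact constant


def namenode_heap_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Approximate heap used by one metadata object per file plus
    one per block."""
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


# 100 million small files (one block each) versus the same data merged
# into 100 thousand large files of 1000 blocks each:
small = namenode_heap_bytes(100_000_000, blocks_per_file=1)
merged = namenode_heap_bytes(100_000, blocks_per_file=1000)
print(small // 2**30, "GiB vs", merged // 2**30, "GiB")  # → 27 GiB vs 13 GiB
```

Even under these coarse assumptions, merging small files cuts the metadata footprint roughly in half and, more importantly, removes the per-file object count as the dominant term.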
This architecture places a small-file processing module in front of the existing distributed file system to separate, merge, and cache small files before they are handed to standard HDFS. It neither changes the original HDFS structure nor interferes with the combined handling of large and small files, thereby improving the efficiency of small-file access across the whole system. The thesis further proposes metadata types and structures adapted to this design, together with grouping, merging, search, and caching algorithms; the relevant interface functions are modified, and new small-file read and write workflows are implemented. Finally, a simulation system compares the new approach against stock HDFS and confirms that the improvement greatly reduces NameNode memory usage and metadata access time, raising the overall performance of small-file storage.
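The merge-plus-index idea at the core of the module can be sketched as follows. This is an illustrative model only, assuming in-memory buffers in place of real HDFS blocks; the class and method names (`SmallFilePacker`, `append`, `read`) are hypothetical and not taken from the thesis.

```python
import io

# Sketch of merging small files into one container file while keeping a
# per-file (offset, length) index, so each small file remains
# individually readable. The container stands in for one large HDFS file.


class SmallFilePacker:
    def __init__(self) -> None:
        self.container = io.BytesIO()  # placeholder for a large HDFS file
        self.index: dict[str, tuple[int, int]] = {}  # name -> (offset, length)

    def append(self, name: str, data: bytes) -> None:
        """Merge a small file into the container and record its location."""
        offset = self.container.tell()
        self.container.write(data)
        self.index[name] = (offset, len(data))

    def read(self, name: str) -> bytes:
        """Look up a small file in the index and read only its bytes."""
        offset, length = self.index[name]
        self.container.seek(offset)
        return self.container.read(length)


packer = SmallFilePacker()
packer.append("a.jpg", b"\x01\x02\x03")
packer.append("b.txt", b"hello")
print(packer.read("b.txt"))  # → b'hello'
```

Only the container file and the compact index reach the NameNode, so the per-file metadata objects that would otherwise crowd its memory are replaced by a single entry plus a lightweight lookup structure.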
Keywords/Search Tags: Cloud storage, massive small files, Hadoop platform, HDFS