Design And Analysis Of The Mass Image Storage Model Based On Hadoop

Posted on: 2012-02-16; Degree: Master; Type: Thesis
Country: China; Candidate: L Li; Full Text: PDF
GTID: 2248330395962372; Subject: Computer application technology
Abstract/Summary:
With the development and widespread application of Internet technology, large web portals and e-commerce sites such as Sina, Tencent, and Taobao have multiplied, and the volume of images these sites store has grown explosively. Because expanding commercial storage is expensive, building a cheap, efficient image storage management system that holds up under high concurrency has become one of the biggest headaches for software architects. Cloud storage broadens the range of options: our research and analysis indicate that distributed storage can address these problems. Drawing on an analysis of existing distributed storage systems in China and abroad, a study of Hadoop's HDFS and MapReduce, a business requirements analysis for image storage, and an assessment of the available hardware and software, this paper proposes a mass image storage model based on Hadoop. The model is implemented on HDFS, Hadoop's distributed file system, and runs on a cluster of ordinary Linux machines. Internally, monitoring provides high fault tolerance, fast response, and load balancing; externally, the system offers services that meet high-concurrency and high-reliability requirements. An HA (high availability) structure and smooth expansion ensure the availability and scalability of the whole file system. The model also uses a flat data organization and abandons the directory structure of traditional file systems.
The filename maps directly to the file's physical address, which simplifies file access and provides good read and write performance.

This paper's main research contents and innovations are as follows:

Firstly, by summarizing the demands that today's Internet places on image storage, the paper analyses the disadvantages of traditional commercial storage, reviews the development of distributed storage in China and abroad, and proposes a mass image storage model based on Hadoop.

Secondly, following the requirements analysis for image storage and building on Hadoop's MapReduce, the system optimizes its programming, establishes the image storage model, and achieves high reliability for image storage under highly concurrent access. The system uses a Master/Slave architecture: under the Master's management, it achieves high scalability and high fault tolerance even when deployed on cheap PCs. In addition, load balancing and a caching system optimize the use of each storage node and keep the storage system stable.

Thirdly, HBase, which is built on Hadoop's distributed infrastructure, stores the image metadata. Through the design of image filenames and index optimization, images of the same type are stored as physically close together as possible, which improves query efficiency over the mass image data.

Finally, the paper sets up a test cluster and, through a series of experimental data and charts, analyses the feasibility of the model and verifies the practicality and effectiveness of the proposed method.

The distinguishing feature of this work is that it designs a storage model for a specific image storage business. The system meets the design requirements of high scalability, high reliability, high fault tolerance, and low cost. The model uses the latest distributed technology, is deployed on a Linux cluster, and is evaluated in a feasibility experiment. The experimental data confirm that the mass image storage model based on Hadoop is reasonable.
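
The abstract describes two naming ideas without giving their format: a flat filename that maps to a file's physical address, and a filename design that keeps images of the same type close together in the metadata store. The Python sketch below illustrates one way such a name could work. It is not the thesis's actual scheme: the field layout, the `ImageLocation` type, and the `encode_name` and `decode_name` helpers are all assumptions made for illustration. It relies only on the fact that HBase keeps rows sorted lexicographically by row key, so names sharing a type prefix end up adjacent.

```python
from typing import NamedTuple, Tuple


class ImageLocation(NamedTuple):
    """Hypothetical physical address of an image inside a large block file."""
    block_id: int
    offset: int
    length: int


def encode_name(image_type: str, loc: ImageLocation) -> str:
    """Build a flat filename that carries its own physical address.

    The type prefix comes first so that names of the same type sort together.
    Fixed-width hex fields keep lexicographic order equal to numeric order.
    (Assumed layout, not taken from the thesis.)
    """
    if "_" in image_type:
        raise ValueError("image_type must not contain '_'")
    return f"{image_type}_{loc.block_id:08x}_{loc.offset:010x}_{loc.length:08x}"


def decode_name(name: str) -> Tuple[str, ImageLocation]:
    """Recover the type and physical address from the name alone, no directory walk."""
    image_type, block_hex, offset_hex, length_hex = name.split("_")
    loc = ImageLocation(int(block_hex, 16), int(offset_hex, 16), int(length_hex, 16))
    return image_type, loc


if __name__ == "__main__":
    name = encode_name("avatar", ImageLocation(block_id=7, offset=4096, length=20480))
    print(name)
    print(decode_name(name))
```

Because the address is recoverable from the name itself, a read needs no directory traversal, which is the simplification the abstract attributes to the flat layout. Used as a row key, the same name would also cluster rows by type.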
Keywords/Search Tags: Hadoop, distributed, cloud computing, image storage model