The Design And Implementation Of Massive Small File Management System Based On HBase

Posted on: 2018-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: D B Chen
GTID: 2348330518497004
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet applications and the rise of cloud computing, data is produced faster and faster, and data centers generate, analyze, and return large numbers of small files every second. Processing and using massive data has become a major problem for the information technology field. Traditional relational databases cannot meet the demands of efficient large-scale storage and highly concurrent reads and writes. NoSQL databases make up for the shortcomings of relational databases in handling highly concurrent requests and in scalability, can greatly reduce development and maintenance costs, and offer more advantages for storing large amounts of data. Hadoop is one of the hottest topics in cloud computing, and HBase, which belongs to the same camp, provides a solution for real-time reading and writing of massive amounts of data. HBase is an important component of the Hadoop ecosystem: using the Hadoop Distributed File System (HDFS) as its underlying storage platform, it is a distributed, non-relational database with high reliability, high performance, and strong scalability. HBase was designed from the outset to leave out many SQL operations and to concentrate on high performance, high capacity, high reliability, and scalability, and it has become one of the most popular databases.
Based on the NoSQL database HBase, this thesis designs and implements a management system that supports massive data storage and real-time queries over large numbers of files. In addition to preserving the scalability, availability, fault tolerance, and other characteristics of the original HBase system, it adds secondary indexing to achieve efficient data queries and provides basic system management functions. The thesis analyzes the background and current state of the field and introduces in detail the concepts of HBase, its technical architecture, its data model, and its coprocessor framework. On this basis, it carefully analyzes the system requirements and defines the system's functions. It focuses on the design models and internal business process implementation of the service request processing subsystem, the data storage subsystem, and the system management subsystem. Finally, the thesis presents the system's test results, summarizes the work as a whole along with the author's study and work experience during the postgraduate period, and looks ahead to the future development of massive file management systems based on HBase.
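The secondary indexing mentioned in the abstract can be pictured with a small sketch. The abstract does not describe the index layout, so what follows is an assumption: a separate index table whose row key is the indexed value, a separator, and the data row key, which is a common pattern on sorted key-value stores such as HBase. The class names, the `owner` attribute, and the in-memory `SortedTable` stand-in are all hypothetical and are not taken from the thesis; no real HBase API is used.

```python
from bisect import bisect_left, insort

SEP = "\x00"  # assumed separator; must not occur inside indexed values


class SortedTable:
    """Toy stand-in for an HBase table: rows kept sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys in sorted order
        self._rows = {}   # row key -> value

    def put(self, row_key, value):
        if row_key not in self._rows:
            insort(self._keys, row_key)
        self._rows[row_key] = value

    def get(self, row_key):
        return self._rows.get(row_key)

    def delete(self, row_key):
        if row_key in self._rows:
            del self._rows[row_key]
            self._keys.pop(bisect_left(self._keys, row_key))

    def scan_prefix(self, prefix):
        # Keys sharing a prefix are contiguous in sorted order,
        # so start at the first key >= prefix and stop at the first mismatch.
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._rows[self._keys[i]]
            i += 1


class IndexedFileStore:
    """Data table keyed by file id, plus an index table keyed by owner."""

    def __init__(self):
        self.data = SortedTable()
        self.index = SortedTable()

    def put_file(self, file_id, owner, content):
        old = self.data.get(file_id)
        if old is not None and old["owner"] != owner:
            # Remove the stale index entry when the indexed value changes.
            self.index.delete(old["owner"] + SEP + file_id)
        self.data.put(file_id, {"owner": owner, "content": content})
        self.index.put(owner + SEP + file_id, b"")

    def find_by_owner(self, owner):
        # A prefix scan on the index replaces a full scan of the data table.
        prefix = owner + SEP
        return [key.split(SEP, 1)[1] for key, _ in self.index.scan_prefix(prefix)]


store = IndexedFileStore()
store.put_file("f2", "alice", b"second")
store.put_file("f1", "alice", b"first")
store.put_file("f3", "bob", b"third")
alice_files = store.find_by_owner("alice")
```

In this sketch the index is maintained by the same call that writes the data row. In a real HBase deployment that maintenance would more likely sit on the server side, for example in the coprocessor framework the abstract refers to, but how the thesis actually does it is not stated here.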
Keywords/Search Tags: file management, HBase, secondary index, small file storage