
Research On High Availability Management Method Of Metadata For HDFS File System

Posted on: 2014-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: L L Yi
Full Text: PDF
GTID: 2268330401959157
Subject: Software engineering
Abstract/Summary:
With the rapid spread of the Internet and the fast development of information technology in all fields, the volume of information data grew by orders of magnitude over the first 12 years of this century, and traditional storage methods clearly cannot meet the growing demands on capacity and management. Distributed storage systems make up for the shortcomings of traditional storage in both capacity and storage management, and have therefore become a hotspot in the field of computer information storage.

The distributed file system is the core technology of a distributed storage system, so research on distributed file systems is of great practical significance.

HDFS is an open-source distributed file system that has been widely applied in distributed storage systems. However, it generally stores the metadata on a single namenode, which adversely affects the I/O performance of the distributed storage system. Furthermore, this architecture makes the namenode a single point of failure, which makes it very difficult to guarantee the availability and stability of metadata management.

To solve these problems, this thesis conducts an in-depth study of high-availability metadata management for the HDFS file system. The main methods and achievements are as follows:

1. It presents a highly available metadata management system, based on an analysis of the metadata management principles of HDFS.
2. It proposes an improved communication method between datanode and namenode. Managing metadata and business data separately is one of the key characteristics of a distributed storage system, and a change in the namenode structure changes the communication mechanism of the datanode. Improving the communication mechanism between datanode and namenode is therefore very important.
3. It designs a two-node metadata management system based on HDFS, which consists of two parallel metadata processing nodes. The two nodes process client requests simultaneously and check consistency with each other. Therefore, when one node fails, the entire system can still provide external service without transferring data.

Finally, this thesis applies the high-availability metadata management system to the project “key technology of field test of user behavior analysis engine in integrated platform” and conducts a set of comparison tests against the original HDFS. The test results show that the improved two-node metadata management system makes great progress in stability and enhances the availability and stability of the entire HDFS file system.
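
The two-node design described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation, and none of the names below are HDFS classes or APIs: `MetadataNode`, `TwoNodeMetadataService`, the in-memory `namespace` dictionary, and the SHA-256 digest used as the consistency check are all assumptions made for illustration. The sketch only shows the general idea that both nodes apply every request, compare state with each other, and keep serving when one of them fails.

```python
import hashlib


class MetadataNode:
    """Hypothetical in-memory metadata node (illustrative, not an HDFS class)."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.namespace = {}  # file path -> list of block ids

    def apply(self, path, blocks):
        # Record which blocks make up the file at `path`.
        self.namespace[path] = list(blocks)

    def lookup(self, path):
        return self.namespace.get(path)

    def checksum(self):
        # Digest over the sorted namespace, used to compare the two nodes.
        digest = hashlib.sha256()
        for path in sorted(self.namespace):
            digest.update(repr((path, self.namespace[path])).encode("utf-8"))
        return digest.hexdigest()


class TwoNodeMetadataService:
    """Hypothetical front end that keeps two peer metadata nodes in step."""

    def __init__(self):
        self.nodes = [MetadataNode("meta-a"), MetadataNode("meta-b")]

    def _live_nodes(self):
        live = [node for node in self.nodes if node.alive]
        if not live:
            raise RuntimeError("no metadata node available")
        return live

    def write(self, path, blocks):
        # Every live node applies the same update.
        for node in self._live_nodes():
            node.apply(path, blocks)

    def read(self, path):
        # Any live node can answer; no data transfer is needed on failure.
        return self._live_nodes()[0].lookup(path)

    def consistent(self):
        # Mutual consistency check: all live nodes must report the same digest.
        return len({node.checksum() for node in self._live_nodes()}) == 1


if __name__ == "__main__":
    service = TwoNodeMetadataService()
    service.write("/logs/a.txt", ["blk_1", "blk_2"])
    print(service.consistent())
    service.nodes[0].alive = False  # simulate the failure of one node
    print(service.read("/logs/a.txt"))
```

In this sketch a failed node simply stops receiving updates; how a recovered node would catch up, and how datanodes would report to both nodes, are left out because the abstract does not specify them.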
Keywords/Search Tags:distributed storage system, HDFS, file system, metadata, high availability