
Study On Bigtable Distributed Storage System

Posted on: 2015-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: X L Shi
Full Text: PDF
GTID: 2298330431465780
Subject: Communication and Information System
Abstract/Summary:
Cloud storage is a critical part of cloud computing, and it has a huge market. As Google’s third key technology used in cloud computing, Bigtable is a distributed storage system for managing structured data, designed to scale to a very large size. Bigtable has successfully provided a flexible, scalable, high-performance solution for Google’s products, and it serves as the standard model for distributed storage systems in the era of cloud computing. However, there is still considerable room for technical improvement.

The implementation and architecture of Bigtable are detailed in this paper, and problems concerning bulk insertion, Master failure recovery, the compression mechanism of SSTable, TabletServer restart, low access efficiency in the time dimension, and read latency are presented. After analyzing the causes of these problems, several possible solutions are given. Each solution is evaluated by simulation testing or performance analysis to demonstrate its advantage over the original mechanism.

To solve the problem of tablets splitting frequently during bulk insertion under Bigtable’s original method, a new bulk insert scheme based on a forecast period is proposed. To address the time and resource consumption caused by the complexity of the operation process during Master failure recovery, a recovery mechanism is designed by applying checkpointing to the Master. To relax the demanding requirement on period selection, a new quantity-based compression method is proposed. To eliminate the unnecessary network communication and data transfer that arise because SSTables on the local disk are not used effectively, a new TabletServer initialization scheme is given. To handle the low access efficiency in the time dimension, a novel SSTable format with a time index is designed. To reduce read latency in Bigtable, a new approach is proposed in which the TabletServer reads from GFS while communicating with the client.
Keywords/Search Tags: Bigtable, Bulk insert, Compression, Master, TabletServer