A Thesis Submitted To University Of Science And Technology Liaoning

Posted on:2016-03-16

Degree:Master

Type:Thesis

Country:China

Candidate:Z H Qin

Full Text:PDF

GTID:2308330470980894

Subject:Software engineering

Abstract/Summary:

As people's demands on the Internet grow, major Internet companies consider every aspect of their products, from functionality and usage down to users' habits, so that their services, from product interfaces to user experience, keep improving. For example, Amazon recommends books, Google recommends related websites, Taobao knows our favorite products, QQ can guess whom we know, and the recently popular WeChat can add friends through the contact list and QQ friend recommendations. Some software can even predict stock market performance from data on the social network Twitter. All of this depends on analyzing large amounts of data, and as data volumes have grown from megabytes and gigabytes to today's petabytes, storing that data remains a problem. Traditional databases cannot store such massive, diverse and decentralized data. Industry giants at home and abroad, such as Google, Microsoft and Amazon overseas and BAT (Baidu, Alibaba, Tencent) in China, treat massive data processing as a core back-end technology. Providing services with higher stability and availability has become a bottleneck for these enterprises, and data loss, corruption and delay are problems that urgently need solving.

This thesis designs a clustered storage system based on LevelDB that is suitable for enterprise-class data. By combining the efficiency and stability of LevelDB with ZooKeeper and Twemproxy, it realizes an enterprise data storage system for massive data with high reliability and availability. A master-slave deployment is adopted to avoid a single point of failure under highly concurrent request load. The system uses a proxy server to split one large database into many sub-databases, each stored on a different server, so that if one sub-database goes down, only part of the data becomes unavailable. Each sub-cluster is deployed as one master and two slaves, and redundant storage keeps three copies of every piece of data to guard against loss. In other words, even if two servers in a sub-cluster fail, the remaining one can still provide the complete data set, which keeps the whole cluster working. The system uses ZooKeeper, an efficient and reliable coordination service based on a Paxos-style consensus protocol, to maintain configuration information and elect a leader, which guarantees write consistency in the distributed environment.

Tests in the online environment of a well-known Internet enterprise, together with analysis of the resulting data, show that the throughput and stability of the system meet expectations. There is a trade-off, however: Twemproxy improves high availability, but it also costs some LevelDB performance, because the proxy itself consumes hardware resources. Although the experimental results meet expectations, further research is needed to reduce this performance loss and improve the system as a whole.
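
The sharding and replication scheme described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the class names (`Replica`, `Shard`, `Proxy`), the CRC32-modulo routing, and the use of an in-memory dict in place of a LevelDB instance are all assumptions made for illustration, and ZooKeeper-based leader election is left out.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One storage server; a dict stands in for its local LevelDB instance."""
    name: str
    alive: bool = True
    data: dict = field(default_factory=dict)


class Shard:
    """A sub-cluster of one master plus two slaves holding the same data."""

    def __init__(self, shard_id: int):
        self.master = Replica(f"shard{shard_id}-master")
        self.slaves = [Replica(f"shard{shard_id}-slave{i}") for i in (1, 2)]

    def replicas(self):
        return [self.master, *self.slaves]

    def put(self, key, value):
        # Writes go through the master and are copied to every live slave
        # (synchronously, which is a simplification).
        if not self.master.alive:
            raise RuntimeError("master down; a new leader would have to be elected")
        for replica in self.replicas():
            if replica.alive:
                replica.data[key] = value

    def get(self, key):
        # Any surviving replica can serve the read.
        for replica in self.replicas():
            if replica.alive:
                return replica.data.get(key)
        raise RuntimeError("all three replicas of this shard are down")


class Proxy:
    """Routes each key to one shard, loosely in the role the proxy server plays."""

    def __init__(self, shard_count: int):
        self.shards = [Shard(i) for i in range(shard_count)]

    def shard_for(self, key: str) -> Shard:
        # Hypothetical routing rule: CRC32 of the key modulo the shard count.
        return self.shards[zlib.crc32(key.encode("utf-8")) % len(self.shards)]

    def put(self, key, value):
        self.shard_for(key).put(key, value)

    def get(self, key):
        return self.shard_for(key).get(key)
```

In this sketch, a value written through `Proxy.put` while all servers are up can still be read with `Proxy.get` after the master and one slave of its shard are marked as not alive, which mirrors the claim that two failed servers in a sub-cluster do not make the data unavailable. Losing all three replicas of one shard affects only the keys routed to that shard.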

Keywords/Search Tags:

Distributed storage, LevelDB, Twemproxy, Clusters

Related items

1	Design And Implementation Of The Agent System Of Redis Cluster
2	The Design And Implementation Of Distributed Storage Subsystem Based On The Two Level Mapping
3	The Research On Distributed Storage Of Massive Datas Of Air Logistics Based On NoSQL
4	Research Of Endurance-Aware Data Layout For SSD Storage Clusters
5	Design And Implementation Of Multi-level Cache On Distributed Databases
6	Optimization Of Image Storage And Transmission For Virtual Clusters
7	Design And Implementation Of Distributed Key-value Storage System
8	Research On Optimization Model Of Distributed Storage And Cache
9	Design And Implementation Of Ciphertext Distributed Cloud Storage Scheme Based On Blind Storage
10	The Research On Large-Scale Distributed Storage Technology