The Design And Implementation Of A Distributed Structured Storage Scheduling Server

Posted on:2014-01-03

Degree:Master

Type:Thesis

Country:China

Candidate:W Du

Full Text:PDF

GTID:2268330401465569

Subject:Computer technology

Abstract/Summary:

With the rapid popularization of the Internet and the growth in the number of users, more and more enterprises are gradually losing the ability to manage their data. Traditional database systems cannot guarantee performance, reliability, and scalability when the scale of data grows this fast. The emergence of cloud computing points to a new direction for the development of traditional databases.

Since Google first formally proposed the concept of cloud computing in 2006, and after 6 years of development, cloud computing has gradually been recognized by academia and industry and has become a new trend in IT. The combination of cloud computing and traditional distributed storage technology has produced a new data service model, the Cloud Database, which gradually overcomes the scalability and management shortcomings of traditional databases. Building a new high-performance, highly scalable, and easy-to-manage Cloud Database system from traditional database technology, distributed storage technologies, and cloud computing has become one of the hot research topics.

By discussing existing Cloud Database systems, the thesis first proposes a Cloud Relational Database framework (CRDB), which is based on a shared-nothing architecture and supports relational operations. CRDB is composed of a separate metadata server, an index server, a scheduling server, a front-end server that supports a variety of popular database protocols, and a lightweight relational database engine. Then, based on the actual work in the project, the thesis describes in detail the design and implementation of the core features of the scheduling server. The main features of the scheduling server are as follows:

1. Support for a subset of standard SQL. Based on the project requirements, the thesis uses Lex and Yacc to build an SQL parser that supports basic SQL operations.

2. A parallel distributed query plan. Following the MapReduce idea, SQL semantics, and the data distribution, the thesis designs a distributed query plan that is independent of the query language and the storage engine. Starting from the SQL syntax tree, an SQL statement is easily converted into the distributed query plan.

3. Support for highly concurrent network communication. Based on single-threaded epoll, non-blocking I/O, and an asynchronous callback mechanism, the thesis designs a high-performance network communication framework. All tasks of the scheduling server are driven by this framework. In addition, a pipeline-based timer is provided to ensure that every task terminates within a limited time.

The thesis constructs a parser for an SQL subset through detailed analysis of SQL syntax, proposes a representation for distributed query plans, and finally implements a distributed query framework on top of a high-performance network communication framework. Testing shows that the framework supports basic SQL queries, but its performance still needs to be improved.
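
The MapReduce-style distributed query plan mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual plan format, so everything below is an assumption made for illustration: the `QueryPlan` structure, its field names, the two-stage split, and the sample rows are all hypothetical. The sketch shows a grouped COUNT/SUM query evaluated as a per-shard map stage (filter and pre-aggregate next to the data) followed by a reduce stage that merges the partial results.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Tuple

Row = Dict[str, Any]
Partial = Dict[Any, Tuple[int, Any]]  # group value -> (count, sum)


@dataclass(frozen=True)
class QueryPlan:
    """Hypothetical two-stage plan; not the plan format used in the thesis."""
    predicate: Callable[[Row], bool]  # WHERE clause, evaluated on each shard
    group_key: str                    # GROUP BY column
    sum_column: str                   # column fed to SUM()


def map_stage(plan: QueryPlan, shard_rows: Iterable[Row]) -> Partial:
    """Runs on one shard: filter rows, then pre-aggregate (count, sum) per group."""
    partial: Partial = defaultdict(lambda: (0, 0))
    for row in shard_rows:
        if plan.predicate(row):
            key = row[plan.group_key]
            count, total = partial[key]
            partial[key] = (count + 1, total + row[plan.sum_column])
    return dict(partial)


def reduce_stage(partials: Iterable[Partial]) -> Partial:
    """Runs on the coordinator: merge the partial aggregates from all shards."""
    merged: Partial = defaultdict(lambda: (0, 0))
    for partial in partials:
        for key, (count, total) in partial.items():
            merged_count, merged_total = merged[key]
            merged[key] = (merged_count + count, merged_total + total)
    return dict(merged)


def execute(plan: QueryPlan, shards: List[List[Row]]) -> Partial:
    """Sequential stand-in for dispatching the map stage to every shard."""
    return reduce_stage(map_stage(plan, rows) for rows in shards)


if __name__ == "__main__":
    # Roughly: SELECT dept, COUNT(*), SUM(salary) FROM emp
    #          WHERE salary > 90 GROUP BY dept
    demo_plan = QueryPlan(
        predicate=lambda row: row["salary"] > 90,
        group_key="dept",
        sum_column="salary",
    )
    demo_shards = [
        [{"dept": "a", "salary": 100}, {"dept": "b", "salary": 300}],
        [{"dept": "a", "salary": 250}, {"dept": "b", "salary": 50}],
    ]
    print(execute(demo_plan, demo_shards))  # {'a': (2, 350), 'b': (1, 300)}
```

The plan here is plain data plus a predicate, so nothing in it depends on SQL text or on how a shard stores its rows, which is the kind of independence the abstract describes. A real scheduler would send the map stage to the storage nodes over the network rather than loop over in-memory lists.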

Keywords/Search Tags:

Cloud Database, SQL parser, Distributed, Query plan

Related items

1	A Research Of Generation And Optimization Of Query Plan Based On Graph Database
2	Research On Data Query Processing And Optimization In Distributed Database
3	Query System Over Distributed Memory Cloud
4	Improvement And Application On Query Algorithm Of Distributed Database
5	Distributed Database For Read-only Application Of The Model Structure And Query Optimization
6	Design And Implementation Of Query Optimization Module For Distributed Column Database Based On Memory
7	Relational Database Query Optimization Technology To Achieve
8	Research On The Collaboration Query Processing For Cloud Data
9	Distributed Joins And Optimization For BIG Table Based On Database OceanBase
10	Research On Query Task Scheduling Method Of Distributed Database