The Design And Implementation Of A Distributed Structured Storage Scheduling Server

Posted on: 2014-01-03, Degree: Master, Type: Thesis
Country: China, Candidate: W Du, Full Text: PDF
GTID: 2268330401465569, Subject: Computer technology
Abstract/Summary:
With the rapid popularization of the Internet and the growth in the number of users, more and more enterprises are gradually losing the ability to manage their data. Traditional database systems cannot guarantee performance, reliability, and scalability when the scale of data grows this fast. Cloud computing points to a new direction for the development of traditional databases. Since Google first formally proposed the concept of cloud computing in 2006, six years of development have seen it gradually recognized by academia and industry, and it has become a new trend in IT. Combining cloud computing with traditional distributed storage technology produces a new data service model, the Cloud Database, which gradually overcomes the scalability and management shortcomings of traditional databases. Constructing a new high-performance, highly scalable, and easy-to-manage Cloud Database system from traditional database technology, distributed storage technology, and cloud computing has become a hot research topic.

After discussing existing Cloud Database systems, the thesis first proposes a Cloud Relational Database framework (CRDB), which is based on a shared-nothing architecture and supports relational operations. CRDB is composed of a separate metadata server, an index server, a scheduling server, a front-end server that supports a variety of other popular database protocols, and a lightweight relational database engine. Then, based on the actual project work, the thesis describes in detail the design and implementation of the core features of the scheduling server. The main features of the scheduling server are as follows:
1. Support for a standard SQL subset. Based on the project requirements, the thesis uses Lex and Yacc to build an SQL parser that supports basic SQL operations.
2. A parallel distributed query plan. Following the MapReduce approach, SQL semantics, and the data distribution, the thesis designs a distributed query plan that is independent of the query language and the storage engine. Starting from the SQL syntax tree, an SQL statement is easily converted into the distributed query plan.
3. Support for high-concurrency network communication. Based on single-threaded Epoll, non-blocking I/O, and an asynchronous callback mechanism, the thesis designs a high-performance network communication framework. All tasks of the scheduling server are driven by this framework. In addition, a pipeline-based timer ensures that every task terminates within a limited time.

The thesis constructs a parser for an SQL subset through detailed analysis of SQL syntax, proposes a representation for distributed query plans, and finally implements a distributed query framework on top of a high-performance network communication framework. Testing shows that the framework supports basic SQL queries, but its performance still needs to be improved.
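
The MapReduce-style query plan mentioned in feature 2 can be illustrated with a minimal sketch. This is not the thesis's actual plan representation, which the abstract does not specify. It only shows the general idea of splitting a grouped aggregate into a per-shard partial step and a merge step on the scheduler. The table layout, the shard contents, and the function names `map_partial`, `reduce_merge`, and `run_plan` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical data: each inner list is the set of (dept, salary) rows
# held by one storage node in a shared-nothing layout.
SHARDS = [
    [("eng", 100), ("ops", 80)],
    [("eng", 120), ("sales", 90)],
    [("ops", 70), ("eng", 110)],
]


def map_partial(rows):
    """Map step, run once per shard: partial COUNT and SUM per group key."""
    partial = defaultdict(lambda: [0, 0])
    for dept, salary in rows:
        partial[dept][0] += 1
        partial[dept][1] += salary
    return dict(partial)


def reduce_merge(partials):
    """Reduce step, run on the scheduler: merge the per-shard partials."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for dept, (count, total) in partial.items():
            merged[dept][0] += count
            merged[dept][1] += total
    return {dept: (count, total) for dept, (count, total) in merged.items()}


def run_plan(shards):
    # Corresponds to:
    #   SELECT dept, COUNT(*), SUM(salary) FROM emp GROUP BY dept
    return reduce_merge(map_partial(rows) for rows in shards)


if __name__ == "__main__":
    for dept, (count, total) in sorted(run_plan(SHARDS).items()):
        print(dept, count, total)
```

Because COUNT and SUM can be merged by addition, each shard only sends one small partial result per group instead of its raw rows. An aggregate such as AVG would need to be derived from the merged COUNT and SUM rather than averaged per shard.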
Keywords/Search Tags: Cloud Database, SQL parser, Distributed, Query plan