With the rapid development of the information age, the Internet has gradually penetrated every aspect of people's lives. New applications emerge one after another and produce huge amounts of data every day. Because both the volume of stored content and the number of access requests are growing rapidly, storage systems based on a single-node server can no longer satisfy user needs. How to provide a highly reliable, high-performance, and highly scalable distributed storage service for users and upper-layer applications has therefore become a hot research issue.

Distributed structured storage is a system that builds its underlying storage in a distributed manner while giving users an experience as efficient as operating in a single-node environment. By studying current distributed storage technology, we have developed a multi-node distributed storage system for structured data.

By analyzing existing storage technology and related cloud database systems, this thesis proposes a multi-node storage architecture named DRDS. The system is based on database sharing and supports the related operations. The design ideas and implementation methods are presented in the following parts.

1. The thesis designs a distributed storage system that builds the whole storage architecture across multiple nodes in a distributed manner. It divides each data operation into several independent sub-operations, each of which is completed by the corresponding server node in the system.

2. The thesis studies SQL grammar in detail and builds an SQL parser that analyzes the distribution information of data tables in order to create more efficient distributed queries. We also provide a detailed analysis of distributed transactions and design a scheme to support them that fits the actual situation.

3. The thesis realizes an architecture for high-concurrency network communication.
We design and implement this architecture on top of epoll, the I/O multiplexing mechanism provided by Linux, as a high-quality network communication layer for the lower levels. The architecture uses non-blocking I/O, supports highly concurrent access, and cleanly separates the lower layers from the upper layers.

Experimental results show that the storage system is fully functional and performs well: it can store large-scale data in a reasonable time.
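
The division of one data operation into independent per-node sub-operations, as described in item 1, can be illustrated with a minimal sketch. It is not the DRDS implementation: the node names, the `Cluster` class, the helper functions, and the use of hash partitioning on a key column are all assumptions made for illustration, since the abstract does not state how rows are assigned to nodes.

```python
import zlib
from collections import defaultdict

# Hypothetical node names; the source does not name its server nodes.
NODES = ["node-0", "node-1", "node-2"]


def node_for(key):
    """Map a row key to a node with a stable hash (assumed partitioning rule)."""
    return NODES[zlib.crc32(str(key).encode("utf-8")) % len(NODES)]


def split_insert(rows, key_column):
    """Split one multi-row insert into independent per-node sub-operations."""
    sub_ops = defaultdict(list)
    for row in rows:
        sub_ops[node_for(row[key_column])].append(row)
    return dict(sub_ops)


class Cluster:
    """In-memory stand-in for the server nodes (illustration only)."""

    def __init__(self):
        self.storage = {name: [] for name in NODES}

    def insert(self, rows, key_column="id"):
        # Each node receives and stores only its own part of the operation.
        for node, part in split_insert(rows, key_column).items():
            self.storage[node].extend(part)

    def select_all(self):
        # Gather the partial results from every node into one result set.
        merged = []
        for name in NODES:
            merged.extend(self.storage[name])
        return merged
```

A caller would use `Cluster().insert(rows)` and `select_all()` as if working with a single store, while each row is held by exactly one node. A real system would send the sub-operations over the network and run them in parallel.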