
Research On Distributed Storage Technology Based On Mass Data

Posted on: 2014-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: C C Li
GTID: 2248330398470742
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the rapid development of information technology, the volume of data on the Internet has been growing at an extraordinary rate. Internet business, the number of Internet users, and the amount of online storage space all continue to increase. However, storage performance tends to decline as storage capacity grows. Traditional centralized databases can hardly handle such huge amounts of data and fail to meet the growing demands for abundant information and high system performance. Mass data storage has therefore become a key research topic, and parallel-processing distributed databases based on the MPP (Massively Parallel Processing) architecture are one of the related research directions. Based on the project "Research on key technologies of safety trusted telecom-level operation supporting architecture on reproductive health services", this paper focuses on mass data storage technology. It aims to provide a storage solution with high concurrency, high availability, and high scalability.

The present study addresses the following:
1. It summarizes mass data storage and the corresponding applications of new technology, covering massive data storage technology, relational and NoSQL data models, distributed database storage, and the theory of the MPP architecture-based parallel processing mode.
2. It analyzes the characteristics of mass data storage technology, compares the advantages and disadvantages of the distributed mass data storage technologies commonly used at home and abroad, and designs a distributed mass data storage model. The system is composed of four modules: an SQL parsing module, a sharding module, a parallel query module, and a result summarizing module.
3. Building on existing distributed database design methods, it independently develops the "DB Mapping" storage system based on the MPP architecture, which offers good scalability and highly efficient processing.

The primary contributions of this paper are summarized as follows.
We propose a mass data storage solution based on MPP parallel processing and describe the complete data storage process from the client request to the database. By integrating the MapReduce approach, the system can operate across distributed data nodes and meet the demands of high scalability, high availability, and high concurrency. The feasibility of this solution was verified by a simulation test.
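The abstract names sharding, parallel query, and result summarizing modules but gives no implementation details. The Python sketch below is only an illustration of that general scatter-gather pattern, not the thesis's "DB Mapping" system. It assumes hash-based sharding, in-memory lists standing in for database nodes, and a fixed aggregate query in place of the SQL parsing module; the names `shard_for`, `ShardedStore`, and `count_and_sum` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


def shard_for(key: str, num_shards: int) -> int:
    """Sharding step: map a row key to a shard index with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


class ShardedStore:
    """Toy stand-in for a set of independent database nodes (hypothetical)."""

    def __init__(self, num_shards: int = 4):
        # Each inner list plays the role of one data node's table.
        self.shards = [[] for _ in range(num_shards)]

    def insert(self, user_id: str, amount: int) -> None:
        index = shard_for(user_id, len(self.shards))
        self.shards[index].append((user_id, amount))

    @staticmethod
    def _partial(shard, min_amount):
        # "Map" step: each node filters and aggregates only its own rows.
        amounts = [amount for _, amount in shard if amount >= min_amount]
        return len(amounts), sum(amounts)

    def count_and_sum(self, min_amount: int = 0):
        # Parallel query step: send the same sub-query to every shard.
        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = list(
                pool.map(lambda s: self._partial(s, min_amount), self.shards)
            )
        # Result summarizing ("reduce") step: merge the partial results.
        return sum(c for c, _ in partials), sum(t for _, t in partials)


store = ShardedStore(num_shards=4)
for i in range(10):
    store.insert(f"user{i}", i * 10)

print(store.count_and_sum())               # (10, 450)
print(store.count_and_sum(min_amount=50))  # (5, 350)
```

Counts and sums can be merged from per-shard partial results without moving raw rows, which is why this kind of aggregate suits a MapReduce-style design. An aggregate such as an average would need each shard to return both a count and a sum, with the division done only in the summarizing step.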
Keywords/Search Tags: Mass data storage, distributed database system, MPP architecture parallel processing