The Research Of Big Data Storage Technology In Cloud Computing

Posted on: 2014-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: P J Wang
GTID: 2248330398472036
Subject: Computer Science and Technology
Abstract/Summary:
We are entering the era of big data. First, the popularity of SNS and other Web 2.0 applications, the deployment and development of the Internet of Things, and the spread of mobile communication and the mobile Internet continuously create new data sources. Second, high-speed Internet makes data replication and transmission easier. Finally, more and more historical data needs to be stored and processed. According to IDC statistics, the global data volume was about 0.18 ZB in 2006 and will reach 35 ZB in 2020. This big data consists of both structured and unstructured data.
Current data storage technology can be divided into two categories: relational database technology and NoSQL database technology. Relational database technology has been researched for a long time; it is relatively mature, widely used, and typically stores structured data. Relational databases provide strong atomicity and consistency, but their complexity grows with the scale of the stored data. NoSQL is a data storage solution for storing and processing big data. It requires no fixed schema and discourages SQL JOIN operations, which gives it high scalability. Both storage technologies have their own advantages and drawbacks, and how to make full use of both in a single system is a question worth studying.
Based on this analysis, this thesis presents a possible solution that combines the advantages of relational and NoSQL databases: the design and implementation of a middleware built on both. The relational database and the NoSQL database together form the data storage layer, which serves as the storage medium. The middleware serves as the data operation interface, maintains the associations between HBase and MySQL data, and carries out data migration between the two databases.
The middleware consists of a configuration module, data association module, switch module, search module, read module, write module, delete module, and data migration module. Through the interfaces the middleware provides, a user program can freely define data schemas and create relations between data in order to store structured data in the relational database, and it can store unstructured data in the NoSQL database. Data can also be stored in the relational database and the NoSQL database at the same time, in which case the middleware applies customized rules to preserve the data relations and keep the associated data consistent. In addition, the middleware implements an interface for migrating data between the relational database and the NoSQL database. Unlike Sqoop, this migration requires no data type judgment or conversion: the data is written into the relational database according to the data type obtained directly from the column name in the NoSQL database, which makes the process more efficient. To validate the function and performance of the storage middleware, we carried out a series of experiments. The experimental results demonstrate the flexibility and effectiveness of the middleware system.
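The idea of reading the data type from the NoSQL column name can be illustrated with a minimal sketch. It is not the thesis's implementation. The abstract does not state the naming scheme, so the `<name>_<type>` suffix convention, the type tags, and the helper names `split_column` and `migrate` are all assumptions made for illustration. A plain dictionary stands in for HBase rows, and Python's built-in `sqlite3` stands in for MySQL so that the example is self-contained.

```python
import sqlite3

# Assumed convention (not from the thesis): a column qualifier is "<name>_<type>".
SQL_TYPES = {"int": "INTEGER", "float": "REAL", "str": "TEXT"}
CASTS = {"int": int, "float": float, "str": str}


def split_column(qualifier):
    """Split e.g. 'age_int' into ('age', 'int') at the last underscore."""
    name, _, type_tag = qualifier.rpartition("_")
    if not name.isidentifier() or type_tag not in SQL_TYPES:
        raise ValueError(f"column {qualifier!r} carries no known type tag")
    return name, type_tag


def migrate(rows, conn, table):
    """Copy rows shaped {row_key: {qualifier: bytes}} into a relational table."""
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")

    # Derive the schema from column names alone; cell values are not inspected.
    columns = {}
    for cells in rows.values():
        for qualifier in cells:
            name, type_tag = split_column(qualifier)
            columns[name] = type_tag

    defs = ["row_key TEXT PRIMARY KEY"]
    defs += [f'"{name}" {SQL_TYPES[tag]}' for name, tag in columns.items()]
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({", ".join(defs)})')

    for row_key, cells in rows.items():
        record = {"row_key": row_key}
        for qualifier, raw in cells.items():
            name, type_tag = split_column(qualifier)
            # Cells are stored as bytes; decode, then cast by the tag.
            record[name] = CASTS[type_tag](raw.decode("utf-8"))
        names = ", ".join(f'"{n}"' for n in record)
        marks = ", ".join("?" for _ in record)
        conn.execute(
            f'INSERT INTO "{table}" ({names}) VALUES ({marks})',
            list(record.values()),
        )
    conn.commit()


# Example: two sparse rows with different column sets.
example_rows = {
    "user1": {"name_str": b"Wang", "age_int": b"31"},
    "user2": {"name_str": b"Li", "score_float": b"4.5"},
}
example_conn = sqlite3.connect(":memory:")
migrate(example_rows, example_conn, "users")
```

In this sketch the relational schema comes entirely from the column names, so no value needs to be sampled to guess its type. Columns missing from a sparse row are simply left as NULL in the relational table.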
Keywords/Search Tags: MySQL, NoSQL, HBase, Middleware