
Research Of A Parallel Data Incremental Processing Mechanism Based On NoSQL Database

Posted on: 2014-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: W Liu
GTID: 2268330422463475
Subject: Computer system architecture
Abstract/Summary:
Cloud computing brings new opportunities and challenges for data processing. In the era of big data, traditional RDBMSs cannot meet the requirements of high availability and reliability. Distributed NoSQL databases, which offer high availability and high reliability, can satisfy the requirements of big data applications. However, the price of this high performance is the loss of the data processing ability that SQL provides. How to enhance the data processing ability of NoSQL has therefore become an important issue.

The data processing ability of NoSQL can be improved on both the off-line and the on-line side. On the off-line side, the high availability and high reliability of NoSQL are preserved, and batch data processing is enhanced by integrating the NoSQL database with Hadoop, the open-source MapReduce framework. A Hadoop job configuration module, a data split module, and data input and output modules are built so that Hadoop can take advantage of accessing data on the local database node and can process data stored in the NoSQL database. On the on-line side, we first implemented multi-row transactions on top of the single-row transactions provided by NoSQL. Furthermore, a trigger-like mechanism called notification is implemented by adding redundant columns and registering hook functions for system calls. With the multi-row transaction algorithm and the notification mechanism, users can apply an incremental data processing mechanism to meet the requirements of on-line data processing.

A data set of 4,200,000 records is used as test data for all tests. Experiments show that the MapReduce-based data insertion approach is 300% faster than the traditional method. For data processing, the performance of count, sort, and group operations is 30%~50% higher than Pig.
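The notification idea described above (a redundant column plus registered hook functions that drive incremental processing) can be illustrated with a small sketch. This is not the thesis implementation: the in-memory `ToyTable`, the `_notify:` column prefix, the hook signature, and the `GroupCount` observer are all hypothetical names chosen for illustration, and the sketch ignores distribution, persistence, and multi-row transactions.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

# Hypothetical hook signature: (row_key, old_value, new_value) -> None
Hook = Callable[[str, Optional[object], object], None]

NOTIFY_PREFIX = "_notify:"  # assumed naming scheme for the redundant column


class ToyTable:
    """In-memory stand-in for a NoSQL table: row key -> column -> value."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, object]] = defaultdict(dict)
        self.hooks: Dict[str, List[Hook]] = {}

    def register_hook(self, column: str, hook: Hook) -> None:
        """Observe writes to `column`; `hook` runs after each such write."""
        self.hooks.setdefault(column, []).append(hook)

    def put(self, row_key: str, column: str, value: object) -> None:
        row = self.rows[row_key]
        old = row.get(column)
        row[column] = value
        observers = self.hooks.get(column, [])
        if not observers:
            return
        # Mark the row as changed in a redundant column, run the hooks,
        # then clear the mark once the change has been processed.
        flag = NOTIFY_PREFIX + column
        row[flag] = True
        for hook in observers:
            hook(row_key, old, value)
        row[flag] = False


class GroupCount:
    """Keeps a per-value count up to date without rescanning the table."""

    def __init__(self) -> None:
        self.counts: Dict[object, int] = defaultdict(int)

    def __call__(self, row_key: str, old: Optional[object], new: object) -> None:
        if old is not None:
            self.counts[old] -= 1  # the row leaves its previous group
        self.counts[new] += 1      # and joins the new one


table = ToyTable()
by_city = GroupCount()
table.register_hook("city", by_city)
table.put("u1", "city", "Wuhan")
table.put("u2", "city", "Wuhan")
table.put("u1", "city", "Beijing")  # only the affected groups are adjusted
print(dict(by_city.counts))
```

The point of the sketch is that each write adjusts only the groups it touches, so an aggregate such as a group count stays current without a batch rescan. In a real distributed store the hook would typically run asynchronously and the redundant column would let a worker find rows with pending changes.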
Keywords/Search Tags: NoSQL database, Incremental processing, Multi-row transaction