The Index Management System For Massive Heterogeneous Historical Data Querying

Posted on: 2014-10-30
Degree: Master
Type: Thesis
Country: China
Candidate: B Xu
Full Text: PDF
GTID: 2268330422451991
Subject: Software engineering
Abstract/Summary:
With the arrival of the big data era, huge amounts of data emerge and grow rapidly all the time, and in a wide variety of formats. As time passes, this massive heterogeneous data becomes historical data and accumulates quickly. In many industries, the volume of business that depends on querying historical data is growing. As a result, building query platforms for massive heterogeneous historical data has become an emerging development goal for major software companies. In such a hybrid query platform, an index management system can manage and maintain the various types of indexes used for querying and thereby improve the efficiency of queries over massive heterogeneous historical data.

This work comes from a hybrid data query platform development project on which I worked as an intern. The project's objectives focus on functional extensions to the company's database products and on support for querying massive heterogeneous data. During the internship, I proposed a solution that uses indexing to improve the efficiency of querying massive heterogeneous historical data: the index management system.

The project mainly contains the following parts.

Principle analysis of the index management system. Indexing technology, index types and their usage in the DB2 and MongoDB databases, the hybrid architecture, and other related technologies are introduced and analyzed. The JSONVal() function, which is used for hybrid querying, is then demonstrated. On this basis, index building and maintenance in the management system are analyzed as a whole.

Design and development of the index management system. The system mainly consists of five functional modules: the SQL parsing module, the index analysis module, the index building module, the Queryset management module, and the index maintenance module. Through interaction and cooperation between these modules, the system supports parsing user queries, analyzing the types of indexes, building different types of indexes in different databases, managing Querysets, and maintaining existing indexes. The system is implemented in Java and uses the DB2 and MongoDB databases for heterogeneous data persistence. In addition, users can operate all functions of the index system through a web console interface.

Scenario design and development for the index management system. A doctor workstation application is developed by combining the index management system with a healthcare scenario. In this scenario, users can query data with query statements that carry business meaning, and can manage or maintain indexes through SQL and SQL-like interfaces.

Performance testing of the index management system. In the healthcare scenario, the index system's performance is tested using the specified business operations and related data. The test results show that the index management system can significantly improve data query efficiency, which verifies the validity of the index management system.

The index management system was developed at IBM China Development Lab and comes from a real project with far-reaching significance. The scenario verification and the significant performance test results demonstrate the validity of the index management system, which has also been recognized by my internship company.
Keywords/Search Tags: index management system, massive data querying, heterogeneous data querying, historical data querying, query efficiency
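
As a rough illustration of how the SQL parsing, index analysis, and index building modules described in the abstract might fit together, the following Java sketch extracts the columns used in a query's WHERE clause and produces one index definition for a relational store and one for MongoDB. It is not the thesis implementation: the class IndexSketch, its method names, the naive regular expression, the idx_ naming scheme, and the patients table are all hypothetical, and the sketch only builds statement strings without connecting to DB2 or MongoDB.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: parse a query, find candidate index columns,
// and emit index definitions for two different backends.
class IndexSketch {
    // Assumption: predicates look like "<column> <comparison operator> ...".
    // A real SQL parser would be needed for anything beyond simple queries.
    private static final Pattern PREDICATE =
        Pattern.compile("([A-Za-z_][A-Za-z0-9_]*)\\s*(=|<>|<=|>=|<|>)");

    // "SQL parsing" + "index analysis": collect the distinct columns that
    // appear on the left side of a comparison in the WHERE clause.
    static List<String> predicateColumns(String sql) {
        Set<String> columns = new LinkedHashSet<>();
        int where = sql.toUpperCase().indexOf(" WHERE ");
        if (where < 0) {
            return new ArrayList<>(columns); // no predicates, nothing to index
        }
        Matcher m = PREDICATE.matcher(sql.substring(where + " WHERE ".length()));
        while (m.find()) {
            columns.add(m.group(1));
        }
        return new ArrayList<>(columns);
    }

    // "Index building" for a relational backend: a standard CREATE INDEX statement.
    static String relationalIndexDdl(String table, String column) {
        return "CREATE INDEX idx_" + table + "_" + column
            + " ON " + table + " (" + column + ")";
    }

    // "Index building" for MongoDB: an ascending single-field index,
    // written as a mongo shell command string.
    static String mongoIndexCommand(String collection, String field) {
        return "db." + collection + ".createIndex({ " + field + ": 1 })";
    }

    public static void main(String[] args) {
        String sql = "SELECT name FROM patients WHERE age > 60 AND ward = 'B'";
        for (String column : predicateColumns(sql)) {
            System.out.println(relationalIndexDdl("patients", column));
            System.out.println(mongoIndexCommand("patients", column));
        }
    }
}
```

Keeping the analysis step (which columns deserve an index) separate from the backend-specific building step mirrors the module split named in the abstract and lets one analysis result drive index creation in more than one database.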