Research Of Massive Historical Database System

Posted on:2013-09-02

Degree:Master

Type:Thesis

Country:China

Candidate:Z Wang

Full Text:PDF

GTID:2248330392956136

Subject:Communication and Information System

Abstract/Summary:

As socio-economic development accelerates, the state has made long-term plans in all fields of social production. In the industrial field, the government has proposed a goal of speeding up the integration of informatization and industrialization, hoping to use information technology to promote the sustainable development of basic industry. As a major information technology, the real-time historical database has been widely used in electricity, communications, industrial control, and other areas. Enterprises have to deal with more and more data as their scale of production grows rapidly.

The growth of data places higher demands on real-time databases in both storage management and data access. In real-time information collection and monitoring, a historical database has to handle millions of data points every day. Facing this volume, the real-time database urgently needs to solve the problems of data processing and storage: on the one hand, it must ensure that data is processed within its time constraints; on the other hand, it must provide flexible storage space for massive data. Supporting massive data is the key issue in historical database research, and existing large-scale storage solutions such as distributed databases and clusters cannot meet the requirements of historical data storage. Based on an analysis of current cloud computing technology for large-scale data processing, this paper designs a historical database system on the Hadoop platform.

Aiming at the three key problems in implementing a historical database system (distributed data storage, distributed data indexing, and parallel data query), this paper analyzes current solutions, including distributed data storage and distributed indexing technology, as well as data storage, data indexing, and query in cloud computing frameworks. According to the real-time data processing scenario, this paper proposes solutions to the three key problems that combine the merits of current technologies. Data is stored in HDFS by dividing it into blocks, and a multi-level index structure that can dynamically adapt to changes in HDFS is established for searching the historical data of tags. Finally, the paper proposes a scheduling strategy that meets transaction deadlines and optimizes the scheduler for historical-data query transactions using the multi-level index.

After solving these key technical problems, this paper presents the general structure of the massive historical database system. It also designs the other modules needed to compose an integrated system (metadata management, data collection, data storage management, index management, and transaction management) and then clarifies the functional composition and system structure of the massive real-time database system.
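The block-based storage, tag index, and deadline-aware query scheduling described in the abstract can be illustrated with a small in-memory sketch. This is not the thesis's implementation: the abstract does not give the index layout or name the scheduling policy, so the two-level index (tag, then block start time) and the earliest-deadline-first ordering below are assumptions chosen for illustration, and the names `TagIndex`, `add_block`, `blocks_for`, and `edf_order` are hypothetical. Block ids stand in for HDFS block or file paths.

```python
import bisect
import heapq
from collections import defaultdict


class TagIndex:
    """Hypothetical two-level index: tag -> sorted block start times -> block id."""

    def __init__(self):
        self._starts = defaultdict(list)  # tag -> sorted block start times
        self._blocks = defaultdict(list)  # tag -> block ids, parallel to _starts

    def add_block(self, tag, start_time, block_id):
        """Register a data block holding samples of `tag` from `start_time` on."""
        pos = bisect.bisect_left(self._starts[tag], start_time)
        self._starts[tag].insert(pos, start_time)
        self._blocks[tag].insert(pos, block_id)

    def blocks_for(self, tag, t_from, t_to):
        """Return ids of blocks that may hold samples of `tag` in [t_from, t_to]."""
        starts = self._starts.get(tag, [])
        if not starts:
            return []
        # The block starting at or before t_from may still cover t_from.
        lo = max(bisect.bisect_right(starts, t_from) - 1, 0)
        # Blocks starting after t_to cannot contain matching samples.
        hi = bisect.bisect_right(starts, t_to)
        return self._blocks[tag][lo:hi]


def edf_order(queries):
    """Order (deadline, name) query pairs earliest-deadline-first (assumed policy)."""
    heap = list(queries)
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


index = TagIndex()
index.add_block("boiler.temp", 0, "block-0")
index.add_block("boiler.temp", 100, "block-1")
index.add_block("boiler.temp", 200, "block-2")

hits = index.blocks_for("boiler.temp", 150, 250)
order = edf_order([(30, "q-trend"), (5, "q-alarm"), (12, "q-report")])
```

Here a query for one tag over a time range touches only the blocks whose time span can overlap the range, and pending queries are served in order of their deadlines. A real system would persist the index and map block ids to HDFS locations.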

Keywords/Search Tags:

Historical Database, Cloud Computing, Data Storage, Distributed Index, Transaction Scheduling

Related items

1	Cloud Computing Environment Index Of GML Spatial Data Storage Mechanism Research
2	Research Of High Performance Data Storage And Retrieval Of Distributed Real Time Database Based On Cloud Computing Technology
3	Research On Multidimensional Data Index In Cloud Computing System
4	Research On Data Computing Offloading And Distributed Cloud Storage Management Issues In Mobile Cloud Environments
5	Research On Optimization Of MapReduce For Interactive Analysis On Big Data
6	Distributed Database Cluster System Zd-ddb Design And Implementation
7	Research On Efficient Data Storage And Intelligent Computing In Cloud Environment
8	The System Upgrade Program Of Financial Information Database Based On Cloud Computing
9	Design And Implementation Of Distributed Graph Database Storage Engine Acceleration And Transaction Management
10	Research On The Storage Model And Scheduling Algorithm For High Resolution Data Based On Cloud Computing Platform