Research Of Massive Historical Database System

Posted on: 2013-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: Z Wang
Full Text: PDF
GTID: 2248330392956136
Subject: Communication and Information System
Abstract/Summary:
As social and economic development accelerates, the state has made long-term plans for all fields of social production. In the industrial field, the government has proposed a goal of speeding up the integration of informatization and industrialization, hoping to use information technology to promote the sustainable development of basic industry. As a major information technology, the real-time historical database has been widely used in electricity, communications, industrial control and other areas. Enterprises have to deal with more and more data as their scale of production grows rapidly.

The growth of data places higher requirements on real-time databases in both data storage management and data access. In real-time information collection and monitoring, a historical database has to deal with millions of data items every day. Facing this volume of data, a real-time database urgently needs to solve the problems of data processing and storage: on the one hand, it must ensure that data is processed within its time constraints; on the other hand, it must provide flexible storage space for massive data. Supporting massive data is the key point in historical database research, and existing large-scale data storage solutions such as distributed databases and clusters cannot meet the storage requirements of historical data. Based on an analysis of current cloud computing technology for large-scale data processing, this paper designs a historical database system on the Hadoop platform.

Aiming at the three key problems in realizing a historical database system (distributed data storage, distributed data indexing, and parallel data query), this paper analyzes the current solutions, including distributed data storage and distributed indexing technology, as well as data storage, data indexing and query in cloud computing frameworks. According to the real-time data processing scenario, this paper proposes solutions to the three key problems that combine the merits of current technologies. Data is stored in HDFS by dividing it into blocks, and a multi-level index structure, which can dynamically adapt to changes in HDFS, is established for searching the historical data of tags. Finally, the paper proposes a scheduling strategy that meets transaction deadlines and optimizes the scheduler for historical data query transactions using the multi-level index.

After solving the key technical problems, this paper presents the general structure of the massive historical database system and designs the other modules needed to compose an integrated system: metadata management, data collection, data storage management, index management and transaction management. It then clarifies the functional composition and system structure of the massive real-time database system.
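
The abstract does not give the layout of the multi-level index, so the following is only a minimal illustrative sketch of how a tag-based lookup over block-partitioned history data might work. It assumes a two-level structure (tag name, then time-sorted block entries); the class name `TagBlockIndex`, its methods, and the block identifiers are hypothetical and are not taken from the thesis, and the in-memory lists stand in for whatever index storage the actual system uses.

```python
from bisect import bisect_right
from collections import defaultdict


class TagBlockIndex:
    """Hypothetical two-level index: tag name -> time-sorted block entries."""

    def __init__(self):
        # Level 1 maps a tag name to its entries; level 2 keeps those
        # entries sorted by block start time so they can be bisected.
        self._starts = defaultdict(list)  # tag -> [start_ts, ...]
        self._blocks = defaultdict(list)  # tag -> [(start_ts, end_ts, block_id), ...]

    def add_block(self, tag, start_ts, end_ts, block_id):
        """Register a block holding samples of `tag` in [start_ts, end_ts]."""
        pos = bisect_right(self._starts[tag], start_ts)
        self._starts[tag].insert(pos, start_ts)
        self._blocks[tag].insert(pos, (start_ts, end_ts, block_id))

    def lookup(self, tag, query_start, query_end):
        """Return ids of blocks whose range overlaps [query_start, query_end]."""
        if tag not in self._starts:
            return []
        # Blocks that start after query_end cannot overlap the query.
        hi = bisect_right(self._starts[tag], query_end)
        # Simplification: a linear filter on the end time of the candidates.
        return [bid for (_, end, bid) in self._blocks[tag][:hi]
                if end >= query_start]


index = TagBlockIndex()
index.add_block("pump1.temp", 0, 99, "blk-0")
index.add_block("pump1.temp", 200, 299, "blk-2")
index.add_block("pump1.temp", 100, 199, "blk-1")
print(index.lookup("pump1.temp", 150, 250))
```

In this sketch a query for one tag touches only the blocks whose time range overlaps the request, which is the property a tag-oriented history index is meant to provide; how the real index tracks changes in HDFS is not covered here.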
Keywords/Search Tags: Historical Database, Cloud Computing, Data Storage, Distributed Index, Transaction Scheduling