Research On Key Technologies Of Data Management For Data-Intensive Applications

Posted on: 2014-10-27  Degree: Master  Type: Thesis
Country: China  Candidate: D H Yu  Full Text: PDF
GTID: 2268330401482674  Subject: Computer application technology
Abstract/Summary:
Contemporary data are large in volume, heterogeneous, and often semi-structured or unstructured. Web services built on mining, analyzing, and processing massive data over the network have become a trend in the development of the information society, and data-intensive applications that handle massive information have attracted widespread attention. Data management for such applications faces many difficulties, including effective storage of big data, real-time updates of dynamic and heterogeneous data, views and indexes over big data, and parallel search, query, and analysis. This thesis therefore studies key technologies for data storage, query, and data services in data management for data-intensive applications.

First, this thesis builds a scalable, efficient data management model for data-intensive applications (DIA-DM). On top of this model, it creates services for node architecture (Double NameNode, DNN), data layout (Hybrid Data Layout, HDL), data indexing, data compression, and data access.

Second, this thesis designs a query mechanism based on DIA-DM (QueryM), which is a SQL-to-MapReduce translator. QueryM uses a series of rules to translate a complex SQL query into as few MapReduce jobs as possible. Its core is a merging algorithm for MapReduce jobs (MR-JM), which merges jobs according to the merging rules defined in QueryM.

Finally, this thesis designs and implements a typical data-intensive application, a traffic data application platform, which uses vSphere for the hardware architecture, HDL for data storage, and QueryM for data query. The platform is divided into three main modules: data management, information query, and data statistics. Through application analysis, the thesis demonstrates the practical availability and query efficiency of the platform.
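As a rough illustration of the job-merging idea behind QueryM and MR-JM, the sketch below collapses MapReduce jobs that scan the same table and partition on the same key into a single job. The abstract does not state the actual merging rules, so this rule is an assumption chosen for illustration. The `Job` class, the `merge_jobs` function, and the table and column names (`traffic`, `roads`, `road_id`, `speed`) are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A hypothetical MapReduce job description (not the thesis's data structure)."""
    name: str
    input_table: str
    partition_key: str
    operations: list = field(default_factory=list)


def merge_jobs(jobs):
    """Merge jobs that scan the same table and partition on the same key.

    Assumed rule, for illustration only: such jobs can share one map scan
    and one shuffle, so they are collapsed into a single job.
    """
    merged = {}  # (input_table, partition_key) -> merged Job
    for job in jobs:
        group = (job.input_table, job.partition_key)
        if group in merged:
            target = merged[group]
            target.name += "+" + job.name
            target.operations.extend(job.operations)
        else:
            # Copy the job so the caller's Job objects are not mutated.
            merged[group] = Job(job.name, job.input_table,
                                job.partition_key, list(job.operations))
    return list(merged.values())


# Hypothetical plan that a SQL query with two aggregates and a join might yield.
plan = [
    Job("agg_count", "traffic", "road_id", ["COUNT(*)"]),
    Job("agg_speed", "traffic", "road_id", ["AVG(speed)"]),
    Job("join_road", "roads", "road_id", ["JOIN"]),
]
merged_plan = merge_jobs(plan)
```

Here the two aggregation jobs over `traffic` become one job, while the job over `roads` stays separate, so three jobs are reduced to two. A real translator would also have to check data dependencies between jobs before merging them, which this sketch leaves out.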
Keywords/Search Tags: data-intensive, heterogeneous data, HDL, query mechanism, parallel search