Research on Storage Allocation Strategy in Parallel Data Processing Middleware

Posted on: 2009-07-08 | Degree: Master | Type: Thesis
Country: China | Candidate: X Z Jiao | Full Text: PDF
GTID: 2178360272979847 | Subject: Computer application technology
Abstract/Summary:
Parallel data processing is an important computing technology that plays a significant role in many fields. This thesis develops a middleware for parallel data processing that connects the databases located on the nodes of a cluster-based parallel computer system. The nodes work in parallel, and the system can achieve performance close to that of a parallel database management system (PDBMS). In a PDBMS based on the shared-nothing (SN) structure, a query is executed by more than one node. In this setting, data declustering is key to increasing parallelism, reducing data skew, and improving system performance.

Data storage allocation is the foundation of the parallelism available to query processing and an important research direction in parallel data processing. By studying existing data storage allocation strategies, the thesis proposes the strategy best suited to the middleware system in order to improve its performance.

The thesis first introduces the structure and working principle of the parallel data processing middleware. Within this setting, it studies how to partition a relation and presents an algorithm that selects the partition key based on join cost. The Range and Hash algorithms are modified, and an R-H algorithm is proposed that fits the system better than the alternatives. The R-H algorithm also helps achieve load balancing and avoids producing data skew early on.

To handle the data skew that appears after the system has run for a long time, the thesis presents a data redistribution strategy for the middleware. It studies how to identify overloaded nodes, relations, and data blocks when data skew occurs, and presents an algorithm for migrating hot data blocks. Tests confirm that the redistribution algorithm resolves the data skew problem efficiently.

The thesis also introduces methods for handling the placement of small tables, the creation of indexes, and the join operator. Together, these make the results of the thesis practical to apply.
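The abstract does not spell out how the R-H algorithm combines Range and Hash partitioning or how overloaded nodes are detected, so the following is only an illustrative sketch under stated assumptions, not the thesis's method. It assumes one plausible hybrid (choose a key range first, then hash within a group of nodes assigned to that range) and a simple mean-based overload test. The names `range_hash_node`, `overloaded_nodes`, `boundaries`, `nodes_per_range`, and the `threshold` value are all hypothetical.

```python
import bisect
import zlib
from collections import Counter


def range_hash_node(key, boundaries, nodes_per_range):
    """Assumed hybrid placement: range step first, then hash step.

    boundaries is a sorted list of split points; a key equal to a split
    point belongs to the range on its right.
    """
    # Range step: index of the key range that contains the key.
    range_index = bisect.bisect_right(boundaries, key)
    # Hash step: a stable hash spreads one range over its group of nodes.
    offset = zlib.crc32(str(key).encode("utf-8")) % nodes_per_range
    return range_index * nodes_per_range + offset


def overloaded_nodes(load, threshold=1.5):
    """Return the nodes whose load exceeds threshold times the mean load.

    load maps node id -> tuple count; only nodes present in load are
    counted when computing the mean.
    """
    mean = sum(load.values()) / len(load)
    return sorted(node for node, count in load.items() if count > threshold * mean)


# Three key ranges (<1000, 1000-1999, >=2000), two nodes per range.
boundaries = [1000, 2000]
nodes_per_range = 2
load = Counter(range_hash_node(k, boundaries, nodes_per_range) for k in range(3000))
```

In this sketch, range queries touch only the node group of the matching range, while the hash step keeps a single hot range from landing on one node. A redistribution step could then use `overloaded_nodes(load)` to choose the nodes from which hot data blocks are migrated.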
Keywords/Search Tags: Parallel Data Processing, Middleware, Data Declustering, Storage Allocation