Font Size: a A A

Research Of Data Grouping Parallel Algorithms Based On Share Memory

Posted on:2016-10-27Degree:MasterType:Thesis
Country:ChinaCandidate:J B LaiFull Text:PDF
GTID:2308330479494830Subject:Software engineering
Abstract/Summary:PDF Full Text Request
Group-by is a common operation in data processing.According to statistics,More than43% queries relevant to data grouping.Therefore, It is important to improve the efficiency ofdata grouping so as to improve the efficiency of query processings in database system.In this work,we study deeply in data grouping on shared memory multi-processor systemafter an investigation of the related research both at home and abroad.We optimize thegrouping strategies based on merge sort and hash table,introduce a new grouping algorithmbased on thd idea of counting sort,and the relevant test results are given.The main contributions of this paper include:1.Optimize merge sort grouping through improving and parallelizing the algorithm.Theoptimized performance 4 times higher than quicksort grouping around;2.Propose a grouping algorithm based on the idea of counting sort, called the SegmentalCounting Sort,and proposes parallelization scheme for the algorithm. Compared with theqsort grouping on performance, segmental counting sort is about 6 times higher, and itsparallelization implement with 12 threads is as high as 35 times.3.Research on grouping algorithms based on hash table,and implement the algorithm in ashared table way and in a independent table way respectively;By optimizing the cache accessbehavior to improve the performance of independent table grouping algorithm;Implement thehash-table-based algothrims in the MIC platform,and optimize them from memory access andvector. The experimental results show that, the performance of data grouping based onindependent hash table in parallel is 3 times that of the serial implementation on MICplatform, is 1.9 times of the parallel implementation under the 6 cores system.
Keywords/Search Tags:group-by, parallel algorithm, MIC, merge sort, multistage counting, hash table
PDF Full Text Request
Related items