Font Size: a A A

Research On Main Memory Database Indexing Based On GPU

Posted on:2014-09-11Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y LiuFull Text:PDF
GTID:1268330425476726Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
As memory database is more efficient than disk-based database at query speed andconcurrency, the main memory database (MMDB) has been applied in amounts of financialfields such as bank, on-line shopping, and stock exchange and so on, which all contain vastvolume of data and need high availability of real-time operations.The indexing can effectivelyreduce the search space of the data, and improve the query efficiency of MMDB, however itfaces the challenges of performance and efficiency.General purposed computing on graphics processing unit (GPGPU), which is also a hotarea of research, has significant research value and prospect for application.. Indexing ongraphics processing unit (GPU) has got several related scientific research achievements, yetthere are many problems to deal with such as: underutilization of hardware resources, the lowparallel degree of the algorithm, lack of scalability and not effectively addressing index dataupdate and so on. Therefore, how to take full use of GPU hardware resources and maximizememory database index operation performance are the main research contents. Above all, onthe basis of the relevant research, the main works of this paper are as follows:1. The current studies of indexing technology of MMDB have been summarized. Thehardware characteristic and the programming technology of GPU have been reviewed.2. A parallel processing solution of T-tree based on GPU has been proposed, whichcreates the maximized T-tree parallelism on GPU platform by analyzing the father-sonrelationship among the internal nodes of T-tree. The dynamic array has been designed so thatthe T-tree indexing can be expanded and contracted on GPU platform to solve the problemthat the display memory on GPU space cannot be dynamically allocated. Through analyzingdifferent T-tree building programs theoretically and empirically, the proposed solution has anobvious advantage of operational efficiency and space utilization over traditional T-treebuilding program. In order to eliminate the bottleneck of GPU program data transmission, themethod of page locking has been used to promote the data transfer rate between CPU andGPU. To meet the needs of hardware development in the future, the scalability of thealgorithm has been studied. The T-tree traversal algorithm based on GPU T-tree has beenproposed to verify the solution. To verify the validity of the parallel program, the empiricalstudy has been done.3. In order to speed up the performance of multi-dimensional data manipulation, a multi-dimensional linear hashing processing solution has been proposed based on GPU. Thesolution expands the traditional data structure of hash table and uses two-layer data structure to scale arbitrarily on GPU, and solve the problem that multi-dimensional data cannot updateefficiently. In the hash table with the algorithm of bulk insertion of records, processing speedcan be accelerated by parallel splitting hash table and promote the efficiency of insertion. Aflexible overflow bucket management mechanism has been designed to improve the hashindex storage space utilization on GPU. Bulk records insertion program has been analyzed atalgorithm time and space complexity. To verify performance of the bulk record insertionmultidimensional linear hash index on a variety of hardware platforms, bulk deletion andparallel query operating, the related empirical study has been done.4. Cache sensitive CSB~+-tree index unlock parallel program based on GPU has beenproposed, and it fulfills the CSB~+-tree structure automatic update on GPU by improvingtraditional CSB~+-tree structure. Parallel unlock algorithm based on the tree layer and node keyhave been proposed and the latter can realize the maximum parallel degree construct. Byadding stuffing bits to CSB~+-tree internal nodes to reduce the branch thread of GPU threadblock, and thus improve the CSB~+-tree query performance. By analyzing the feasibility ofCSB~+-tree query algorithm with shared memory, it is pointed out that the thought oftraditional cache sensitive tree technology is not suitable in complex GPU memory frame. Toverify the feasibility of proposed program, the related empirical study has been done ondifferent platforms.5. BD-tree index parallel computing program on GPU fulfills the parallel processing byimproving traditional hash function and fulfills the update of BD-tree index data on GPU byimproving traditional BD-tree structure. By analyzing the tree structure of BD-tree, it canparallel build the internal structure of BD-tree based on node key. By increasing the spaceoverhead to reduce the GPU atomic number of function calls and improve the BD-tree datainsertion efficiency of the hash table significantly. By analyzing the spatial complexity of BD-tree parallel building algorithm, the space utilization of proposed algorithm is significantlyimproved compared with traditional algorithm. To verify the effectiveness of proposedprogram, the related empirical study has been done.
Keywords/Search Tags:Graphic Processing Unit (GPU), Main Memory Database (MMDB) Indexing, T-Tree, Multi-Dimensional Linear Hashing, CSB~+-Tree, BD-Tree
PDF Full Text Request
Related items