
Efficient Cache Optimization Mechanism of GPU Index Data Structure

Posted on: 2023-11-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Li
Full Text: PDF
GTID: 2568307097994849
Subject: Computer technology
Abstract/Summary:
The concurrent index data structure is the core of efficient data retrieval and processing, and the key to big-data storage is using an index structure to retrieve and process data efficiently. Traversing the index effectively to locate the requested data is therefore essential. In recent years, driven by growing data volumes and rising application demands on data-processing performance, more and more researchers have turned to highly parallel GPUs to accelerate index-structure operations. To match the thread organization of the GPU, many studies have optimized traditional parallel index data structures so that they support dynamic updates on the GPU and deliver good query performance. However, most of these optimization schemes apply only to specific index data structures and lack generality. Meanwhile, as index data volumes grow, the limited capacity of GPU memory has become the main obstacle to further improving index operation performance.

This thesis proposes an efficient cache optimization mechanism for GPU index data structures. The mechanism targets a CPU-GPU heterogeneous processing system: the main data resides in CPU memory, while the index operations (build, query, update, and delete) are offloaded to the GPU. To address the problem that the limited GPU memory cannot fully accommodate the index data structure, a cache table in GPU shared memory is designed and implemented. Selected historical access data is retained in the cache table and can be accessed at high speed, avoiding the extra overhead of traversing the index path and reducing CPU-GPU data transfer when GPU memory is insufficient. The main contributions of this thesis are as follows:

(1) For the high-throughput GPU B-tree and hash table produced by recent research, the operation mechanisms of the two index data structures are analyzed to identify the key factors limiting index operation performance. Based on this analysis, a general GPU cache optimization scheme that is independent of the index construction and search algorithms is proposed.

(2) Using GPU shared memory as the cache space, a cache table supporting multi-threaded parallel access is built, and the original key objects are encapsulated so that the cache table can store different data types. To achieve a high hit rate, a cache replacement mechanism based on the idea of the LRU algorithm is designed and implemented. In addition, a data consistency mechanism keeps data synchronized among the host side, the device side, and the cache table. The three basic cache-table operations (query, insert/update, and delete) are encapsulated and exposed as an open API for different index data structures; a sketch of such a cache table is given after this abstract.

(3) The effectiveness and generality of the cache optimization mechanism are verified on the GPU B-tree and the slab hash index structures. Comparing against the batch-query performance of the original index structures along two dimensions, data size and cache size, the experimental results show that the cache improves performance by up to 30% on small and medium query scales. By controlling the proportion of index data resident in GPU memory, the experiments further verify that the cache reduces data transfer to a certain extent; when GPU memory capacity is small, the cache cuts data transfer by nearly 50%.
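The abstract describes the cache table only at a high level. The CUDA sketch below illustrates one plausible shape of the mechanism from contribution (2): a per-block, set-associative cache table in shared memory, probed before falling back to a full index traversal, with an approximate LRU policy based on per-way access stamps. All names (CacheSet, cache_query, batch_query, CACHE_SLOTS, WAYS) and the layout are illustrative assumptions, not the thesis's actual implementation; note also that shared memory is per-block and per-launch, so this sketch rebuilds the table at the start of each kernel.

```cuda
// Hedged sketch of a shared-memory cache table with LRU-style replacement.
#include <cstdio>
#include <cuda_runtime.h>

constexpr int CACHE_SLOTS = 128;            // per-block slots (assumption)
constexpr int WAYS = 4;                     // 4-way set-associative
constexpr int SETS = CACHE_SLOTS / WAYS;

// Each slot packs (key << 32 | value) into one aligned 64-bit word, so a
// reader never observes a key paired with another key's value; the stamps
// may race, which only degrades eviction quality, not correctness.
struct CacheSet {
    unsigned long long slot[WAYS];
    unsigned int stamp[WAYS];               // last-access "time" per way
};

__device__ unsigned int hash32(unsigned int k) {
    k ^= k >> 16; k *= 0x45d9f3bu; k ^= k >> 16;
    return k;
}

// Stand-in for a full index traversal (GPU B-tree or slab hash lookup).
__device__ unsigned int index_traverse(const unsigned int* index, int n,
                                       unsigned int key) {
    return index[key % n];
}

// Probe the cache; on a miss, traverse the index and install the result by
// evicting the way with the oldest stamp (approximate LRU).
__device__ unsigned int cache_query(CacheSet* cache, unsigned int* clk,
                                    const unsigned int* index, int n,
                                    unsigned int key) {
    CacheSet& s = cache[hash32(key) % SETS];
    unsigned int now = atomicAdd(clk, 1u);
    for (int w = 0; w < WAYS; ++w) {
        unsigned long long packed = s.slot[w];   // aligned 64-bit load
        if ((unsigned int)(packed >> 32) == key) {
            s.stamp[w] = now;                    // refresh on hit
            return (unsigned int)packed;
        }
    }
    unsigned int v = index_traverse(index, n, key);
    int victim = 0;
    for (int w = 1; w < WAYS; ++w)
        if (s.stamp[w] < s.stamp[victim]) victim = w;
    s.slot[victim] = ((unsigned long long)key << 32) | v;
    s.stamp[victim] = now;
    return v;
}

__global__ void batch_query(const unsigned int* index, int n,
                            const unsigned int* keys, unsigned int* out,
                            int m) {
    __shared__ CacheSet cache[SETS];
    __shared__ unsigned int clk;
    for (int i = threadIdx.x; i < SETS; i += blockDim.x)
        for (int w = 0; w < WAYS; ++w) { cache[i].slot[w] = ~0ull; cache[i].stamp[w] = 0; }
    if (threadIdx.x == 0) clk = 0;
    __syncthreads();
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < m;
         i += gridDim.x * blockDim.x)
        out[i] = cache_query(cache, &clk, index, n, keys[i]);
}

int main() {
    const int n = 1 << 16, m = 1 << 20;
    unsigned int *index, *keys, *out;
    cudaMallocManaged(&index, n * sizeof(unsigned int));
    cudaMallocManaged(&keys, m * sizeof(unsigned int));
    cudaMallocManaged(&out, m * sizeof(unsigned int));
    for (int i = 0; i < n; ++i) index[i] = 2u * i;
    for (int i = 0; i < m; ++i) keys[i] = i % 4096;   // skewed, cache-friendly
    batch_query<<<64, 256>>>(index, n, keys, out, m);
    cudaDeviceSynchronize();
    printf("out[42] = %u\n", out[42]);
    return 0;
}
```

The design choice the sketch highlights is the one the abstract implies: a cache hit returns in a few shared-memory accesses, whereas a miss pays the full index-traversal cost (and, in the real system, possibly a CPU-GPU transfer), so the replacement policy only needs to be good enough to keep hot keys resident.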
Keywords/Search Tags: Parallel index structure, CPU-GPU heterogeneity, Cache algorithm