Font Size: a A A

Accelerating Exact Inner Product Retrieval By CPU-GPU System

Posted on:2020-04-20Degree:MasterType:Thesis
Country:ChinaCandidate:L XiangFull Text:PDF
GTID:2428330590995221Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
The inner product retrieval(IPR)is widely used in many applications,e.g.,recom-mendation system,information retrieval and data mining.Inner product retrieval is given a query vector q and a matrix P,find the k vectors in P,for which the inner product with query vector q are the largest in q~T·P.The IPR problem can be decomposed into two fundamental operations:(i)calculate the product of the query vector and matrix;(ii)retrieve the results to find the largest inner product value and corresponding vectors in ma-trix P.In the traditional CPU-based solutions,the performance bottleneck is the product computation phase,which takes up more than 96%of the total computing time.Com-pared with CPU,GPU has more computing cores and larger memory bandwidth,which is especially suitable for solving computationally intensive computing tasks.Exploiting Graphics Processing Units(GPUs)to accelerate the computation-intensive workloads is the gold standard in data mining and machine learning communities.However,it is not trivial to apply CPU-GPU systems to boost the performance of IPR solutions due to the natural complex of the IPR problem.This paper has been studied for how to using CPU-GPU systems accelerating exact IPR problem.The main research content and contribution of this paper are as follows:(1)Analyze the time cost of each phase in CPU-based IPR solutions.In order to solve the problem that the CPU cannot calculate matrix multiplication efficiently,this paper proposes to use CPU-GPU system to accelerate the calculation process of matrix multiplication in IPR.In order to make full use of the hardware resources of the CPU-GPU system,this paper systematically studies the characteristics of CPU-GPU system.And investigated the current calculation methods of CPU and GPU matrix multiplication.(2)This paper proposed an IPR algorithm using GPU to calculation matrix multi-plication.Due to the limited size of video memory,this paper proposed matrix partition strategy fully utilizing GPU to calculation matrix multiplication.Based on matrix partition strategy proposed GPU-based IPR algorithm,which solves the performance bottleneck of the CPU-based algorithm.(3)This paper proposes two IPR optimization algorithms based on GPU matrix multiplication.Firstly proposed an IPR optimization algorithm based on GPU sorting.It improves the efficiency of retrieving the k largest element in result matrix and reduce the overhead of transferring data between video memory and main memory.Secondly,improve the efficiency of by avoiding unnecessary computation cost with early termination technique.The core idea is exploiting Cauchy-Schwarz inequality for early termination.This paper demonstrates the efficiency of our proposal on four standard real datasets.The experiment results show that the algorithm proposed in this paper can effectively solve the performance bottleneck of CPU-based IPR algorithm.Compared with the current CPU-based IPR algorithm with the highest efficiency,the IPR algorithm proposed in this paper based on CPU-GPU system achieves a performance improvement from 15.88to 138.78 times.
Keywords/Search Tags:inner product retrieval, CPU-GPU system, Top-k retrieval, recommender systems
PDF Full Text Request
Related items