Font Size: a A A

Research On Integrated Algorithm Of Locality Sensitive Hashing And Matrix Factorization On GPU Platform

Posted on:2021-07-20Degree:MasterType:Thesis
Country:ChinaCandidate:Z X LiFull Text:PDF
GTID:2518306122464094Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Matrix factorization(MF)can extract low-rank features and integrate information of data manifold distribution in high-dimension data,which can consider nonlinear neighborhood information.Thus,it has drawn wide attentions on low-rank analysis for sparse big data,i.e.,Collaborative Filtering(CF)recommender systems,social networks,and Quality of Service(QoS),etc.However,due to the huge computational overhead for the construction of Graph Similarity Matrix(GSM)and huge memory overhead for the intermediate GSM.GSM-based MF,e.g.,kernel MF,graph regularized MF,etc,cannot be directly applied to low-rank analysis for sparse big data on cloud and edge platforms.This research proposes Locality Sensitive Hashing(LSH)aggregated nonlinear MF(LSH-MF)which has the following advantages:(1)LSH-MF integrates a nonlinear neighborhood model that focuses on strong correlation information and a matrix decomposition model that focuses on weak correlation information,and has a more powerful ability to extract features.Thus,LSH-MF has higher prediction accuracy.(2)LSH-MF uses the probabilistic projection strategy simLSH to search the neighborhood,avoiding the construction of GSM,saving a lot of calculation overhead,and simLSH can meet the requirements of accurate projection of sparse large data.(3)LSH-MF can update the model online based on the incremental data,making it avoid the model retraining and more adaptable to the production environment.At the same time,in order to further shorten the calculation time of LSH-MF,this research proposes CULSH-MF based on CUDA parallelization,which implements LSH-MF fine-grained parallelization and online learning on the GPU and multi GPUs.At last,this paper conducts the experiments for QoS on two public datasets and CF on three public datasets including accuracy and efficiency tests.Experimental results show that the CULSH-MF can not only reduce computational time and memory overheads,but also obtain higher accuracy than using GSM.
Keywords/Search Tags:Graph Similarity Matrix(GSM), Locality Sensitive Hashing(LSH), Matrix Factorization(MF), Top-K Nearest Neighbors
PDF Full Text Request
Related items