
Research On Logistic Regression And Its Parallel Implementation On GPU

Posted on: 2017-12-03    Degree: Master    Type: Thesis
Country: China    Candidate: X H Dong    Full Text: PDF
GTID: 2348330503986896    Subject: Computer Science and Technology
Abstract/Summary:
Logistic regression is an important classification method in the field of machine learning. Because its model is simple and its training is fast, it is widely applied on the Internet and in finance, medicine, and other fields. The main step in training logistic regression is updating the parameters with an iterative method, and as the size of data grows in practice, the training process faces ever higher demands. In recent years, general-purpose computing on GPUs has become a research focus and can be exploited to accelerate logistic regression training. This thesis implements logistic regression and regularized logistic regression based on gradient descent, improves the algorithm by addressing problems encountered with gradient descent, and, by exploiting the GPU's hardware characteristics, implements a parallel logistic regression. The details are as follows.

To address the common problem that convergence becomes slower and slower in the later stages of training, this thesis presents an improved method based on the convergence rate of the objective function. The method first computes the convergence rate between two consecutive stages of the training process and then updates the step size at a given frequency and by a given amount. This algorithm markedly improves the speed of convergence and helps reduce training time.

To address the problem that the sign function used in L1-regularized logistic regression does not effectively produce sparsity, this thesis proposes an improved sign function. Since the regularization term is not differentiable, a common approach substitutes a sign function that considers only the signs of the parameters. The improved method instead tracks how a parameter's sign changes and uses that change to determine the parameter's final sign. The new sign function enables L1-regularized logistic regression to produce good sparsity, which can be used for feature selection.

Finally, this thesis implements a parallel logistic regression algorithm on the GPU by exploiting the GPU's hardware characteristics. Unlike stochastic gradient descent, steepest (batch) gradient descent uses all samples in every iteration and therefore offers good opportunities for acceleration. Experiments on large-scale and high-dimensional samples show that the algorithm achieves a good speedup ratio.
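The step-size adaptation described above can be sketched as follows. The thesis does not publish its exact update rule, so this is a minimal NumPy illustration of the idea: measure the relative loss decrease over a fixed window as a convergence-rate proxy, and enlarge the step size when that rate stalls. The check frequency, boost factor, stall threshold, and cap are all assumed illustrative values, not values from the thesis.

```python
import numpy as np

def sigmoid(z):
    # clip to avoid overflow in exp for large |z|
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def logistic_loss(w, X, y):
    p = sigmoid(X @ w)
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def train_adaptive(X, y, lr=0.1, iters=400, check_every=20,
                   boost=1.5, stall=0.01):
    """Batch gradient descent whose step size is enlarged when the
    measured convergence rate stalls (an assumed stand-in for the
    thesis's rule of updating the step at a given frequency and size)."""
    n, d = X.shape
    w = np.zeros(d)
    prev_loss = logistic_loss(w, X, y)
    for t in range(1, iters + 1):
        p = sigmoid(X @ w)
        w -= lr * (X.T @ (p - y) / n)
        if t % check_every == 0:
            loss = logistic_loss(w, X, y)
            # relative decrease over the window = convergence-rate proxy
            rate = (prev_loss - loss) / max(prev_loss, 1e-12)
            if 0.0 <= rate < stall:
                # convergence has slowed: take bigger steps (capped for stability)
                lr = min(lr * boost, 1.0)
            prev_loss = loss
    return w
```

The cap on the step size is a safety choice for this sketch; a real implementation would also need to handle the case where an enlarged step makes the loss increase.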
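The sparsity problem with the plain sign function can also be made concrete. A subgradient update with `lam * sign(w)` rarely drives weights exactly to zero. The sketch below uses two common devices in the same spirit as the thesis's improved sign function, which decides a parameter's final sign from how the sign changes: at zero, a weight moves only if the data gradient beats the penalty, and a weight whose update would flip its sign is clipped to exactly zero. This is an assumed stand-in, not the thesis's exact rule.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def train_l1_sparse(X, y, lam=0.1, lr=0.1, iters=300):
    """L1-regularized logistic regression with a sign-flip clip that
    produces exact zeros (sketch; parameter values are illustrative)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / n
        # away from zero, the L1 subgradient is lam * sign(w)
        g = grad + lam * np.sign(w)
        # at exactly zero, move only if the data gradient beats the penalty
        g = np.where(w == 0.0,
                     np.sign(grad) * np.maximum(np.abs(grad) - lam, 0.0),
                     g)
        w_new = w - lr * g
        # if an update would flip a weight's sign, clip it to zero instead
        w = np.where(w * w_new < 0.0, 0.0, w_new)
    return w
```

Weights of uninformative features then stay at exactly zero, so the nonzero pattern of `w` can be read off directly for feature selection.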
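The reason batch (steepest) gradient descent parallelizes well is that its gradient is a sum over all samples, so it decomposes into independent partial sums followed by a reduction. The NumPy sketch below shows that decomposition on the CPU; on a GPU, each chunk would correspond to the samples handled by one thread block, with the final sum done as a parallel reduction. The chunking scheme and function names here are illustrative, not the thesis's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def partial_grad(w, X_chunk, y_chunk):
    """Unnormalized gradient over one chunk of samples -- the unit of
    work a GPU thread block would own."""
    p = sigmoid(X_chunk @ w)
    return X_chunk.T @ (p - y_chunk)

def full_grad(w, X, y, n_chunks=4):
    """Batch gradient assembled by a reduction over partial gradients,
    the same map-reduce shape a GPU kernel would exploit."""
    n = X.shape[0]
    parts = [partial_grad(w, Xc, yc)
             for Xc, yc in zip(np.array_split(X, n_chunks),
                               np.array_split(y, n_chunks))]
    return sum(parts) / n
```

Because every chunk is independent, the per-iteration cost scales down with the number of parallel workers, which is why large-scale and high-dimensional samples benefit most.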
Keywords/Search Tags: logistic regression, gradient descent algorithm, regularization, GPU parallel computation