
Research On Parallel Nonlinear Least Squares For High Dimensional Data Classification

Posted on: 2015-09-01  Degree: Master  Type: Thesis
Country: China  Candidate: Y L Zhao  Full Text: PDF
GTID: 2298330431493641  Subject: Computer software and theory
Abstract/Summary:
With the arrival of the era of big data and the development of cloud computing technologies, high-dimensional data processing has penetrated many aspects of research and daily life, playing a crucial role in fields such as scientific research, biomedicine, and network communications. As one of the classic families of data-analysis methods, traditional classification methods perform poorly on high-dimensional data, and many algorithms that work well for low-dimensional classification must be improved before they can handle high-dimensional data. How to build an effective classification algorithm for high-dimensional data is therefore an extremely urgent problem.

This thesis systematically introduces and analyzes the advantages and disadvantages of several basic classification algorithms and their improved variants, summarizes the common characteristics of dimension-reduction methods for high-dimensional data classification, and discusses the limitations of these methods in applications. Combining the advantages of existing algorithms, a new algorithm named PNLS is proposed for high-dimensional data classification, together with an improved version of PNLS named RPNLS. Performance assessment of these algorithms shows that the new methods perform better in both functionality and accuracy. The main results of this study can be summarized as follows:

1. To address the inefficiency of least squares (LS) for high-dimensional data processing, a new parallel nonlinear least squares classifier (PNLS) is proposed, drawing on the merits of parallel methods. PNLS partitions the dimensions randomly, obtains local model parameters in parallel, and then combines the parameters into the final global solution. Finally, the parameters are improved by an iterative refinement process, yielding a clear improvement in efficiency.

2. Based on PNLS, an improved version named RPNLS is proposed. The efficiency of PNLS is further improved by replacing the equal division of dimensions at the start of the iterative process with a random division of the dimensional data.

3. Evaluation experiments on the performance of the new algorithms are completed. With the least squares method as a reference, a common set of high-dimensional data is chosen as the experimental sample to assess learning efficiency and prediction performance. The experimental results show that the proposed PNLS and RPNLS methods possess good convergence behavior and a clear time advantage over LS, and better prediction accuracy can be expected. RPNLS in particular shows an even greater time advantage. These two methods are significant candidates for high-dimensional data classification.
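The PNLS scheme described above (random dimension partition, parallel local fits, combination into a global solution, iterative refinement) can be sketched as follows. This is a hypothetical illustration, not the thesis's actual implementation: the function name `pnls_fit`, the damping factor `1/n_blocks`, and the use of a linear least-squares subproblem in place of the unspecified nonlinear model are all assumptions made for the sketch. Re-randomizing the partition on every iteration corresponds to the RPNLS variant.

```python
# Hypothetical sketch of the PNLS/RPNLS idea: randomly partition the
# feature dimensions into blocks, fit a least-squares correction on each
# block in parallel against the current residual, then combine the block
# solutions into the global parameter vector and refine iteratively.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def pnls_fit(X, y, n_blocks=4, n_iters=5, rng=None):
    rng = np.random.default_rng(rng)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    damping = 1.0 / n_blocks  # averaging the parallel block updates keeps the residual non-increasing
    for _ in range(n_iters):
        # RPNLS-style step: re-randomize the dimension split each iteration
        perm = rng.permutation(n_features)
        blocks = np.array_split(perm, n_blocks)
        residual = y - X @ w

        def solve_block(idx):
            # local least-squares fit restricted to one block of dimensions
            delta, *_ = np.linalg.lstsq(X[:, idx], residual, rcond=None)
            return idx, delta

        # obtain local model parameters in parallel, then combine
        with ThreadPoolExecutor() as pool:
            for idx, delta in pool.map(solve_block, blocks):
                w[idx] += damping * delta
    return w
```

Each block update projects the shared residual onto its own subspace; damping by `1/n_blocks` averages these parallel (Jacobi-style) corrections so they cannot overshoot when combined.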
Keywords/Search Tags: least squares, parallel, high-dimensional, classification