Robust Principal Component Analysis And Its Application On Outlier Detection

Posted on: 2022-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: H P Wang
GTID: 2518306542475814
Subject: Computer Science and Technology
Abstract/Summary:
To mine the inherent value of high-dimensional data and speed up data analysis, features must first be extracted from the data. Principal component analysis (PCA) algorithms extract low-rank features, which improves both the computational efficiency and the generalization ability of machine learning algorithms. Classical PCA, however, cannot extract low-rank features from noisy data. Robust principal component analysis (RPCA) algorithms address this limitation, but traditional RPCA algorithms suffer from low time efficiency.

To address this problem, this paper proposes the Newton-soft threshold iterative algorithm. It uses Newton's method to speed up the solution of the low-rank matrix and a soft threshold iterative algorithm to speed up the solution of the sparse matrix; together, the two methods greatly reduce the time complexity of RPCA. Experiments show that the proposed Newton-soft threshold iterative RPCA algorithm effectively solves the problem of low-rank feature extraction from noisy data and is significantly more time-efficient than existing algorithms: time efficiency improves by 92.4% relative to the low-rank matrix fitting algorithm and by 54.2% relative to the gradient-descent RPCA algorithm. In addition, because the proposed soft threshold estimation operator improves the accuracy of the soft threshold iteration, the algorithm achieves higher accuracy in the video foreground-background separation and image denoising experiments. In the image denoising experiment, the peak signal-to-noise ratio of the image recovered by the Newton-soft threshold iterative algorithm is 32, the highest among the compared algorithms, which further confirms the accuracy improvement.

Traditional outlier detection algorithms for linear data tend to misjudge normal data as abnormal. Because RPCA resists noise when extracting low-rank features from noisy data, this paper also proposes a generalized robust principal component analysis algorithm, which decomposes noisy data into two groups: one of pure noise and one of normal, noise-free data. Following the definition of the generalized algorithm, a classification threshold is computed by maximum likelihood estimation, and the data are divided into normal and abnormal data according to this threshold, which achieves anomaly detection. Experimental analysis shows that the proposed generalized RPCA anomaly detection algorithm effectively solves the anomaly detection problem for linear data and protects normal data from being misjudged as abnormal. Comparison experiments against traditional anomaly detection algorithms show that the proposed algorithm can increase the true positive rate by 99.8%, which effectively protects normal data, and that it achieves the highest accuracy among all compared algorithms, 91.1%.
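The abstract does not give the update rules, so the following Python sketch only illustrates the general idea of splitting a data matrix into a low-rank part and a sparse part with soft thresholding, then flagging rows by the size of their sparse part. It is not the Newton-soft threshold algorithm of the thesis: the low-rank step here is a plain truncated SVD standing in for the Newton step, and the outlier threshold is supplied by hand rather than estimated by maximum likelihood. All function names, the parameters `rank`, `tau`, and `threshold`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise soft thresholding: shrink every entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def rank_r_approx(x, rank):
    """Rank-`rank` approximation of x via truncated SVD (stand-in for the Newton step)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def decompose(m, rank, tau, n_iter=50):
    """Alternate a low-rank update and a soft-thresholded sparse update."""
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    for _ in range(n_iter):
        low = rank_r_approx(m - sparse, rank)
        sparse = soft_threshold(m - low, tau)
    return low, sparse


def psnr(reference, estimate, peak):
    """Peak signal-to-noise ratio in dB for a given peak signal value."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)


def flag_outliers(sparse, threshold):
    """Indices of rows whose sparse-part norm exceeds a hand-picked threshold."""
    return np.flatnonzero(np.linalg.norm(sparse, axis=1) > threshold)


# Synthetic example (assumed data): a rank-2 matrix with a few corrupted entries.
rng = np.random.default_rng(0)
true_low = rng.normal(size=(60, 2)) @ rng.normal(size=(2, 40))
noisy = true_low.copy()
for row in (3, 17):
    cols = rng.choice(40, size=5, replace=False)
    noisy[row, cols] += 8.0  # large, sparse corruption

low, sparse = decompose(noisy, rank=2, tau=1.0)
print("flagged rows:", flag_outliers(sparse, threshold=5.0))
print("PSNR of recovered low-rank part:", psnr(true_low, low, peak=np.abs(true_low).max()))
```

The values of `tau` and `threshold` control how much of the data is assigned to the sparse part and how aggressively rows are flagged; in this sketch they are chosen by hand and would need tuning on real data.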
Keywords/Search Tags: Robust Principal Component Analysis (RPCA), feature extraction, dimensionality reduction, outlier detection, machine learning