Robust Probabilistic PCA Based On Independent T Distributions

Posted on:2022-11-06

Degree:Master

Type:Thesis

Country:China

Candidate:X B Luo

Full Text:PDF

GTID:2480306749967069

Subject:Applied Statistics

Abstract/Summary:

Principal component analysis (PCA) is a traditional dimensionality reduction method. It projects the original data into a lower-dimensional space by maximizing the projection variance, reducing the dimension while retaining as much of the original information as possible, but it is not a probabilistic method. Probabilistic principal component analysis (PPCA) provides a theoretical basis for Bayesian treatments of PCA and, combined with the EM algorithm, greatly improves the computational efficiency of the model. PPCA is built on the assumption of a normal distribution, which makes it sensitive to outliers. To improve the stability of the model, probabilistic principal component analysis based on the t-distribution (t-PPCA) was proposed. Because the t-distribution has heavy tails, it is more robust than the normal distribution when outliers occur. However, in the t-PPCA model each sample has a single weight shared by all of its elements. When there is a local anomaly in the observed data, the model judges the whole sample as abnormal and thus wastes the useful information in the rest of the sample.

In probabilistic principal component analysis based on independent t-distributions (it-PPCA), a t-distribution is modeled independently for each dimension of the noise vector, and Bayesian methods are used to estimate the parameters. From the perspective of reconstruction, this solves the problem of missing data with local outliers, but that approach does not intuitively explain how the independent t-distribution handles local outliers. The essential difference between the multivariate t-distribution and the independent t-distribution in identifying abnormal elements has been ignored, and it is the focus of this thesis.

In this thesis, the model parameters are estimated by a variational EM algorithm, which improves robustness compared with the normal distribution. At the same time, the thesis explains intuitively, from the perspective of element weights, how the independent t-distribution reduces the influence of local outliers. Based on this model and using variational inference, the thesis derives the VBECM and PX-VBECM algorithms and compares them. Simulation experiments show that the PX-VBECM algorithm converges faster than the VBECM algorithm. Comparison with the PPCA and t-PPCA models shows that the it-PPCA model does increase the number of effective elements by assigning a weight to each element of the observed data, thereby improving the robustness of the model. Empirical research shows that the it-PPCA model can also be applied to face recognition. When there are outliers in a few dimensions, the classification error rate of this model is much lower than that of the t-PPCA model in the same case, and is even very close to that of the PPCA model without abnormal blocks. In summary, by controlling the weights of abnormal elements, the it-PPCA model is more stable than the PPCA and t-PPCA models in the same case.
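
The contrast between a shared sample weight and per-element weights can be illustrated with a minimal sketch. It is not the thesis's VBECM or PX-VBECM update. It uses the standard expected-weight formulas from EM for Student-t models, and it assumes isotropic noise variance `sigma2`, a fixed number of degrees of freedom `nu`, and a hand-made residual matrix. The function names are hypothetical.

```python
import numpy as np

def shared_t_weights(residuals, sigma2, nu):
    """One weight per sample (multivariate t, assumed isotropic scale sigma2)."""
    d = residuals.shape[1]
    maha = np.sum(residuals ** 2, axis=1) / sigma2  # squared Mahalanobis distance
    return (nu + d) / (nu + maha)

def independent_t_weights(residuals, sigma2, nu):
    """One weight per element (independent univariate t noise per dimension)."""
    return (nu + 1.0) / (nu + residuals ** 2 / sigma2)

# Assumed toy residuals: 5 samples, 8 dimensions, all small except one element.
residuals = np.full((5, 8), 0.5)
residuals[0, 2] = 25.0  # a single local outlier in sample 0

w_shared = shared_t_weights(residuals, sigma2=1.0, nu=3.0)
w_indep = independent_t_weights(residuals, sigma2=1.0, nu=3.0)

print("shared weight of sample 0:", w_shared[0])
print("element weights of sample 0:", w_indep[0])
```

In this toy case the shared weight shrinks the whole of sample 0 towards zero, while the per-element weights suppress only the corrupted entry and leave the other seven elements of that sample with their normal influence.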

Keywords/Search Tags:

Independent t distribution, PPCA, Variational inference, Effective elements, Robustness

Related items

1	Robust Factor Analysis Based On Independent T Distribution
2	A Research Of The Prior Distribution Effect On Bayesian Variational Inference Of Erlang Mixture Model
3	Research On Robustness Of Causal Inference For Partial Linear Models
4	Stochastic Variational Inference For Probabilistic Model And Its Application
5	Research On Multi-Source Data Network Structure Inference And Balance Robustness Of Network
6	Research On Adaptive Variational Contrastive Divergence
7	Experimental Research On Free Space Measurement-device-independent Quantum Key Distribution
8	Statistical Inference Research Based On The Bayes Estimation Of The Parameters Of Burr X Distribution
9	Statistical, algorithmic, and robustness aspects of population demographic inference from genomic variation data
10	Stochastic Variational Inference For Gaussian Mixture Models With Unknown Number Of Mixture Components