Research On Robust Estimation Of Factor Number In Data-Driven High-Dimensional Factor Model

Posted on: 2024-07-09
Degree: Master
Type: Thesis
Country: China
Candidate: J Yang
Full Text: PDF
GTID: 2530307091488024
Subject: Computer Science and Technology
Abstract/Summary:
With the development of information technology, the cost of data collection continues to fall and the scale of data keeps growing. As a result, the data to be processed are often high-dimensional, and contamination and missing values occur frequently, which makes processing harder still. Factor analysis is an effective dimension reduction tool and is widely used in high-dimensional data analysis. However, how well factor analysis works depends strongly on the chosen number of common factors. This thesis therefore studies how to construct a robust estimation model for the number of factors in factor analysis, together with the design, analysis, and verification of the corresponding algorithms, for both complete and incomplete high-dimensional data.

First, the thesis uses optimization theory to formulate the estimation of the number of factors as a minimum trace factor model based on the covariance matrix. Because covariance estimation from high-dimensional data is itself uncertain, the thesis introduces the Kullback-Leibler divergence to characterize this uncertainty and obtains a minimum trace robust factor model based on the Kullback-Leibler divergence. To solve this optimization model, the thesis adopts the alternating direction method of multipliers based on the symmetric Gauss-Seidel decomposition, and analyzes the effectiveness and convergence of the algorithm.

The robust estimation model for the number of factors involves estimating the precision matrix, a problem that has received little attention for data with missing values. The thesis therefore designs a robust precision matrix estimation model for data with missing values. This model uses bound constraints to ensure that the global optimal solution is unique, and it is robust to the risk introduced by the uncertainty of the covariance estimate. The thesis solves this model with the alternating direction method of multipliers and analyzes the convergence of the algorithm, proving that it converges globally at a linear rate. To verify the model, the thesis compares it with two other estimation models in numerical experiments, which demonstrate the robustness and effectiveness of the proposed precision matrix estimation model.

To verify the minimum trace robust factor model based on the Kullback-Leibler divergence as an estimator of the number of factors, the thesis considers two high-dimensional settings, complete data and data with missing values, and carries out extensive numerical comparisons with the classical minimum trace factor model. The experimental results highlight the effectiveness and robustness of the proposed factor number estimation model. Finally, empirical results on real datasets again show that the model can effectively estimate the number of factors in both complete data and data with missing values.
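
As a rough illustration of two quantities the abstract refers to, the sketch below computes the Kullback-Leibler divergence between two zero-mean Gaussian distributions from their covariance matrices, and reads off a factor count as the numerical rank of a low-rank covariance component. It is not the thesis's model or algorithm: the zero-mean Gaussian assumption, the direction of the divergence, the simulated data, and the names gaussian_kl and count_factors are all assumptions made here for illustration.

```python
import numpy as np


def _logdet_pd(matrix):
    """Log-determinant via Cholesky; raises LinAlgError if not positive definite."""
    chol = np.linalg.cholesky(matrix)
    return 2.0 * np.sum(np.log(np.diag(chol)))


def gaussian_kl(sigma_a, sigma_b):
    """KL divergence D(N(0, sigma_a) || N(0, sigma_b)) for positive definite inputs."""
    p = sigma_a.shape[0]
    # trace(sigma_b^{-1} sigma_a), computed without forming the inverse explicitly
    trace_term = np.trace(np.linalg.solve(sigma_b, sigma_a))
    return 0.5 * (trace_term - p + _logdet_pd(sigma_b) - _logdet_pd(sigma_a))


def count_factors(low_rank, tol=1e-8):
    """Count eigenvalues of a symmetric matrix that exceed a relative tolerance."""
    eigvals = np.linalg.eigvalsh(low_rank)
    return int(np.sum(eigvals > tol * max(eigvals.max(), 1.0)))


# Simulated factor structure (illustrative values): covariance = low rank + diagonal.
rng = np.random.default_rng(0)
p, r, n = 20, 3, 500
loadings = rng.normal(size=(p, r))
noise_var = rng.uniform(0.5, 1.5, size=p)
sigma_true = loadings @ loadings.T + np.diag(noise_var)

# The sample covariance differs from the true one because n is finite.
samples = rng.multivariate_normal(np.zeros(p), sigma_true, size=n)
sigma_hat = np.cov(samples, rowvar=False)

kl_same = gaussian_kl(sigma_true, sigma_true)  # identical matrices, so about zero
kl_est = gaussian_kl(sigma_true, sigma_hat)    # positive: measures estimation error
n_factors = count_factors(loadings @ loadings.T)

print(f"KL(true || true)   = {kl_same:.2e}")
print(f"KL(true || sample) = {kl_est:.4f}")
print(f"rank of low-rank part = {n_factors}")
```

In this toy setting the low-rank part is known exactly, so its rank equals the number of simulated factors. In practice it has to be recovered from the estimated covariance, which is where an optimization model such as the one described in the abstract comes in.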
Keywords/Search Tags: Factor Model, Factor Number, Kullback-Leibler Divergence, Symmetric Gauss-Seidel Decomposition, Global Convergence, Precision Matrix Estimation