Factor Models For High Dimensional Matrix-valued Data

Posted on: 2020-07-11 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Z G Gao | Full Text: PDF
GTID: 1480306197484524 | Subject: Statistics
Abstract/Summary:
The factor model is a statistical model which assumes that the observed variables are driven by a few latent common factors. It is a classical topic in multivariate statistical analysis and has been applied in many disciplines, such as psychometrics, econometrics, and genomics. However, the factor model has strong inherent limitations. It applies only to vector-valued data and requires that individuals be independent, uncorrelated, or only weakly correlated with one another. For matrix data, the factor model cannot be applied directly.

Matrix data are a common data type in the era of big data. The rows and the columns each represent a class of variables (the two classes may be the same or different), and each element of the matrix may itself represent a variable. Applying the traditional factor model to matrix data amounts to assuming that one dimension of the matrix is uncorrelated, so at least half of the information is lost. We therefore propose and generalize a two-way factor model (2wFM) that is suitable for matrix data. Our model can separately extract the latent variables (common factors) of the rows and of the columns of matrix data. Moreover, our method can extract the row and column common factor information even from a high-dimensional data matrix observed without replication.

This dissertation focuses on parameter estimation and inference for the two-way factor model. We study the maximum likelihood estimation (MLE) of the parameters and the large-sample properties of the resulting estimators. For estimation, we derive the maximum likelihood estimators under the assumptions of the two-way factor model. Let Σ_X denote the covariance matrix of vec(X), the vector obtained by stacking the columns of X. Because this covariance matrix has a special structure, it is difficult to write down the likelihood function in analytic form. Under the identification conditions on the parameters, we generalize an existing result for computing the inverse of a covariance matrix and obtain exact expressions for Σ_X^{-1} and |Σ_X|, which yield an analytic log-likelihood function. Based on this likelihood function, we propose a block optimization method that alternates between the factor loading parameters and the variance parameters to compute the maximum likelihood estimate of each parameter.

For the large-sample properties, we establish the consistency and the asymptotic distribution (a central limit theorem) of each parameter estimator under more general conditions. The proofs encountered substantial difficulties. The row factor loadings interact with the column factors, and the column factor loadings interact with the row factors, and this becomes harder to handle when there are no replicate samples. Together with other unknown terms related to the loading estimates, this makes it difficult to obtain the desired large-sample results directly. These difficulties prompted us to study the essential optimization problem underlying maximum likelihood estimation, working from the likelihood function and the maximum likelihood estimators, and this finally led to a satisfactory conclusion. The resulting asymptotic distribution is quite different from that of the classical factor model. The asymptotic variance of the row factor loadings depends not only on the row factor variance but also on the distance between the row factor variance and the column factor variance. The smaller this distance, the larger the asymptotic variance; the larger the distance, the smaller the asymptotic variance. The asymptotic variance of the column factor loadings shows the same phenomenon.

This dissertation makes contributions in parameter estimation methods, statistical theory, and applications. In parameter estimation, we give a consistent method for selecting initial values and a block-iterative optimization strategy, which together compute the maximum likelihood estimates quickly. In statistical theory, we propose a new heuristic approach to deriving the large-sample properties of maximum likelihood estimators in factor models, which differs from previous approaches. In application, our model allows the factor effects of the rows and the columns to be estimated at the same time. We analyze an air pollution data set and draw a number of interesting conclusions.
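
As a rough illustration of the kind of existing result that the abstract says is generalized, the sketch below computes the inverse and log-determinant of a covariance matrix in the classical (one-way) factor model, Sigma = L L^T + diag(psi), using the Woodbury identity and the matrix determinant lemma. This is not the two-way model's Σ_X, whose exact form is not given in the abstract. The function names, the symbols L and psi, and the NumPy implementation are all assumptions made for illustration only.

```python
import numpy as np


def factor_cov_inverse_logdet(L, psi):
    """Inverse and log-determinant of Sigma = L L^T + diag(psi).

    Classical one-way factor covariance (an assumption for this sketch,
    not the dissertation's two-way covariance).
    L   : (p, k) loading matrix
    psi : (p,) positive idiosyncratic variances
    """
    k = L.shape[1]
    psi_inv = 1.0 / psi
    PL = psi_inv[:, None] * L                 # Psi^{-1} L, shape (p, k)
    M = np.eye(k) + L.T @ PL                  # I_k + L^T Psi^{-1} L, shape (k, k)
    # Woodbury: Sigma^{-1} = Psi^{-1} - Psi^{-1} L M^{-1} L^T Psi^{-1}
    sigma_inv = np.diag(psi_inv) - PL @ np.linalg.solve(M, PL.T)
    # Determinant lemma: |Sigma| = |Psi| * |M|
    logdet = np.sum(np.log(psi)) + np.linalg.slogdet(M)[1]
    return sigma_inv, logdet


def gaussian_loglik(x, L, psi):
    """Zero-mean Gaussian log-likelihood of one vector x under Sigma."""
    sigma_inv, logdet = factor_cov_inverse_logdet(L, psi)
    p = x.shape[0]
    return -0.5 * (p * np.log(2.0 * np.pi) + logdet + x @ sigma_inv @ x)


def vec_by_column(X):
    """Stack the columns of a matrix X into one vector, i.e. vec(X)."""
    return X.reshape(-1, order="F")
```

Only a k x k system is solved here rather than a p x p one, which is what makes an analytic likelihood practical when p is large. A matrix observation would first be stacked with `vec_by_column(X)` before being passed to a likelihood of this kind.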
Keywords: High-dimensional matrix-valued data, factor model, MLE, asymptotic property