Sufficient Dimension Reduction for Longitudinal Compositional Data

Posted on: 2022-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Fang
Full Text: PDF
GTID: 2518306746967979
Subject: Probability theory and mathematical statistics
Abstract/Summary:
With the progress of science and technology, data have become easy to obtain. As a direct consequence, the data collected are large in volume and high in dimension, and the curse of dimensionality often arises. To deal with this problem and extract more information, the most effective approach is to reduce the dimension of the data. In many applications, especially in economics, the analysis involves various structural indicators whose measurements are collected from countries (or regions) every year (or month), which yields longitudinal compositional data. The high dimension makes statistical analysis of longitudinal compositional data difficult. Because sufficient dimension reduction loses no regression information and preserves the internal structure of the data, this thesis applies that idea to longitudinal compositional data.

Before turning to dimension reduction, the thesis introduces the basic definitions and operational properties of the Aitchison simplex and the isometric log-ratio (ilr) transformation, and derives the distance covariance function for longitudinal compositional data. Then, following the dimension reduction approach used for matrix-valued data, it proposes a sufficient dimension reduction method for longitudinal compositional data based on distance covariance. In theory, the population form of this criterion recovers the central compositional dimension folding subspace and reduces the dimension in both time T and variables p simultaneously. The estimator of this subspace based on the sample form of the criterion is shown to be consistent. On the algorithmic side, introducing the (p,T)-separability assumption turns the dimension reduction procedure into a low-dimensional constrained optimization problem that can be solved quickly by a nonlinear optimization algorithm. The thesis further proposes a consistent BIC-type information criterion to determine the structural dimension adaptively.

To evaluate the proposed method under different conditions, three numerical simulation examples are considered: (1) balanced longitudinal compositional data with a continuous response; (2) balanced longitudinal compositional data with a discrete response; (3) unbalanced longitudinal compositional data. Each example examines two factors: (1) whether the covariance matrix satisfies the (p,T)-separability condition; (2) different internal correlation structures of the compositional data. The simulation results show that the proposed sufficient dimension reduction method for longitudinal compositional data performs very well in finite samples and retains as much regression information as possible. The BIC criterion is also feasible and effective for determining the structural dimension.

Finally, the method is applied to a real data set on the disposable income of urban residents in China, where the dimension of the predictors is reduced to 1. The results show that the regional GDP of each province contributes significantly to the disposable income of urban residents. The ilr transformation is applied to the compositional predictors after dimension reduction and a generalized additive model is fitted. Comparing the predictions before and after dimension reduction shows that the model built on the reduced data has the smaller prediction error, which indicates that the proposed sufficient dimension reduction method for longitudinal compositional data is effective.
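The two building blocks named in the abstract, the ilr transformation and the sample distance covariance, can be illustrated with a minimal Python sketch. This is not the thesis's derivation or algorithm. It assumes a Helmert-type orthonormal basis for the ilr coordinates, the standard V-statistic form of squared distance covariance, and a simple stacking of the ilr coordinates over time points. The function names `ilr`, `longitudinal_ilr`, and `dcov2`, and the simulated data, are hypothetical.

```python
import numpy as np


def ilr(x):
    """ilr coordinates of compositions stored in the rows of x (all parts > 0).

    Assumption: Helmert-type basis, z_i = sqrt(i/(i+1)) * log(g(x_1..x_i) / x_{i+1}),
    where g is the geometric mean.
    """
    logx = np.log(np.atleast_2d(np.asarray(x, dtype=float)))
    d = logx.shape[1]
    z = np.empty((logx.shape[0], d - 1))
    for i in range(1, d):
        # log geometric mean of the first i parts minus log of part i + 1
        z[:, i - 1] = np.sqrt(i / (i + 1)) * (logx[:, :i].mean(axis=1) - logx[:, i])
    return z


def longitudinal_ilr(X):
    """Map an (n, T, p) array of compositions to an (n, T * (p - 1)) matrix.

    Assumption: each time point is transformed separately and the results are
    stacked into one vector per subject.
    """
    n, T, p = X.shape
    return ilr(X.reshape(n * T, p)).reshape(n, T * (p - 1))


def _centered_distances(z):
    """Double-centered matrix of pairwise Euclidean distances between rows of z."""
    d = np.sqrt(((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def dcov2(x, y):
    """Squared sample distance covariance (V-statistic) between x and y."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    return (_centered_distances(x) * _centered_distances(y)).mean()


# Simulated example (hypothetical data): n subjects, T time points, p parts.
rng = np.random.default_rng(0)
n, T, p = 50, 4, 3
X = rng.dirichlet(np.ones(p), size=(n, T))       # shape (n, T, p), rows sum to 1
Z = longitudinal_ilr(X)                          # shape (n, T * (p - 1))
y = Z[:, 0] + 0.1 * rng.standard_normal(n)       # response driven by one coordinate
print("dCov^2(Z, y) =", dcov2(Z, y))
```

A distance-covariance-based reduction would then search for a low-dimensional projection of the predictors that maximizes this dependence measure with the response, subject to constraints. That optimization step is specific to the thesis and is not reproduced here.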
Keywords/Search Tags:Longitudinal compositional data, Distance covariance, Sufficient dimension reduction, Structural dimension selection