Tests For Mean Vectors In High Dimensional Settings

Posted on: 2017-12-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J G Zha
Full Text: PDF
GTID: 1310330566455963
Subject: Applied Mathematics
Abstract/Summary:
The emergence of high-dimensional data, such as gene expression values from microarrays and single nucleotide polymorphism (SNP) data, poses challenges to many traditional statistical methods and theories. Take hypothesis testing as an example. When the number of variables exceeds the sample size, many conventional test statistics are no longer valid. Moreover, the limiting theory for traditional statistics typically assumes that the sample size n is large relative to the number of variables p, and it performs poorly when the dimension is high. High-dimensional data analysis is therefore a current research focus. In this dissertation, hypothesis testing problems on mean vectors are studied in high-dimensional settings.

To obtain a valid statistical procedure for high-dimensional data, one may assume that the variables are independent; see Bickel and Levina (2004) [92] for the classification problem in the high-dimensional setting. Based on this idea, Srivastava and Du (2008) [66] proposed a test for high-dimensional data that replaces the sample covariance matrix in the Hotelling T^2 test statistic with the diagonal of the sample covariance matrix. However, this method does not take adequate account of the correlations among variables. Hence, we propose a new test statistic that uses some of the correlation information among variables, and we obtain its asymptotic distribution. Simulation results show that making proper use of the correlations among variables can enhance the power of the test.

The likelihood ratio test is the most popular method for hypothesis testing problems in statistics. Since the likelihood ratio test is invalid in high-dimensional settings, an important question is how to use the likelihood function to construct test statistics for high-dimensional mean vectors. We introduce a generalized likelihood ratio test for high-dimensional mean vectors based on the union-intersection test method. Further, we show that the generalized likelihood ratio test can be viewed as a high-dimensional version of Hotelling's T^2 test. Under p-asymptotics, we obtain the limiting distribution of the statistic under the null hypothesis, and the power is also analysed under certain alternatives. We also compute the p-value of the new test by using randomization procedures. Simulation results show that the generalized likelihood ratio test performs very well in most situations.
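The diagonal-covariance idea and the randomization p-value mentioned in the abstract can be illustrated with a minimal sketch. This is not the statistic proposed in the dissertation, and it omits the centering and scaling that Srivastava and Du (2008) use to obtain an asymptotic normal limit. It only shows, for a one-sample test of a zero mean vector, how dividing by the p sample variances avoids inverting a singular p x p covariance matrix when p > n. The function names `diagonal_t2` and `sign_flip_pvalue` are hypothetical, and row-wise sign flipping is an assumed randomization scheme (valid when the null distribution is symmetric about zero); the abstract does not say which scheme the author uses.

```python
import numpy as np


def diagonal_t2(x):
    """Diagonal Hotelling-type statistic for H0: mean vector = 0.

    x is an (n, p) array. Only the p sample variances are inverted,
    so the statistic is defined even when p is larger than n.
    """
    n = x.shape[0]
    mean = x.mean(axis=0)
    var = x.var(axis=0, ddof=1)  # diagonal of the sample covariance matrix
    return n * np.sum(mean ** 2 / var)


def sign_flip_pvalue(x, n_draws=999, seed=0):
    """Randomization p-value based on random sign flips of whole rows.

    Assumption: under H0 the rows are symmetric about zero, so flipping
    the sign of an entire observation leaves its distribution unchanged.
    """
    rng = np.random.default_rng(seed)
    observed = diagonal_t2(x)
    n = x.shape[0]
    exceed = 0
    for _ in range(n_draws):
        signs = rng.choice([-1.0, 1.0], size=(n, 1))  # one sign per row
        if diagonal_t2(signs * x) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (exceed + 1) / (n_draws + 1)


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    # n = 20 observations, p = 100 variables, every mean shifted to 0.5
    sample = data_rng.normal(size=(20, 100)) + 0.5
    print(sign_flip_pvalue(sample))
```

Because each coordinate is standardized separately, this sketch ignores all correlation among variables, which is the limitation the abstract sets out to address.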
Keywords/Search Tags: High dimensional data, Hotelling T^2 test, Generalized likelihood ratio, p-asymptotics, Union-Intersection principle