
Estimating And Testing In Mixture Problem

Posted on: 2003-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: Z Pang
Full Text: PDF
GTID: 2120360062486181
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Many statistical models in practice involve mixture problems. For example, geneticists want to determine whether their data come from a single population or from a population composed of homogeneous subpopulations resulting from a mutation of a genetic trait (see, e.g., Schork, Allison and Thiel, 1996). More background information can be found in the book by Titterington, Smith and Makov (1985).

This paper considers the following problem. Assume X1, ..., Xn is a random sample from the mixture population αF1(x) + (1 − α)F2(x), where α ∈ [0, 1] and F1(x), F2(x) are two known continuous probability distribution functions that may come from different distribution families. We want to test the homogeneity hypothesis H0: α = 0.

Many authors have tried to use the customary likelihood ratio test (LRT) for this hypothesis testing problem. Under standard regularity conditions, the classic result of Wilks (1938) guarantees that the LRT statistic has an asymptotic χ²-distribution under the null hypothesis. However, the regularity conditions do not hold here, because the null hypothesis lies on the boundary of the parameter space (α = 0), whereas the regularity conditions require it to be in the interior. Furthermore, some authors assume that F1(x) and F2(x) belong to the same distribution family and differ only in the mean parameter θ. In this scenario the null hypothesis becomes H0: α = 0 or θ1 = θ2. The two statements α = 0 and θ1 = θ2 are not mutually exclusive, so the model lacks identifiability. Because the LRT is invariant under transformation, some authors have tried to eliminate the unidentifiability by reparameterization (Chernoff and Lander, 1995; Dacunha-Castelle and Gassiat, 1997; Lemdani and Pons, 1999). However, the Fisher information, which determines the large-sample behavior of the maximum likelihood estimators (MLE), degenerates. Thus, the MLEs of some model parameters are inconsistent.
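The boundary problem described above can be seen in a small simulation. The following is only a sketch under stated assumptions: the two known components are hypothetical (N(1, 1) and N(0, 1), chosen for illustration and not taken from the thesis), the names `fit_alpha` and `simulate_null` are made up here, and the mixing proportion is fitted by bisection on the score, which is valid because the log-likelihood is concave in α when both components are known.

```python
import math
import random
from statistics import NormalDist

# Hypothetical known components, chosen only for illustration.
F1 = NormalDist(mu=1.0, sigma=1.0)  # component with weight alpha
F2 = NormalDist(mu=0.0, sigma=1.0)  # component with weight 1 - alpha


def fit_alpha(xs, tol=1e-10):
    """Return (alpha_hat, lrt) for H0: alpha = 0 with F1 and F2 known."""
    a = [F1.pdf(x) for x in xs]
    b = [F2.pdf(x) for x in xs]

    def score(alpha):
        # Derivative of the log-likelihood; decreasing in alpha (concavity).
        return sum((ai - bi) / (alpha * ai + (1.0 - alpha) * bi)
                   for ai, bi in zip(a, b))

    def log_lik(alpha):
        return sum(math.log(alpha * ai + (1.0 - alpha) * bi)
                   for ai, bi in zip(a, b))

    if score(0.0) <= 0.0:
        alpha_hat = 0.0          # maximum sits on the boundary alpha = 0
    elif score(1.0) >= 0.0:
        alpha_hat = 1.0          # maximum sits on the boundary alpha = 1
    else:
        lo, hi = 0.0, 1.0        # interior root of the score, by bisection
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if score(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        alpha_hat = 0.5 * (lo + hi)

    lrt = 2.0 * (log_lik(alpha_hat) - log_lik(0.0))
    return alpha_hat, lrt


def simulate_null(n=100, reps=200, seed=0):
    """Fit alpha on `reps` samples of size n drawn from F2 alone (alpha = 0)."""
    rng = random.Random(seed)
    return [fit_alpha([rng.gauss(F2.mean, F2.stdev) for _ in range(n)])
            for _ in range(reps)]
```

Calling `simulate_null()` returns pairs `(alpha_hat, lrt)`. Under these assumptions one would expect roughly half of the fitted values to sit exactly at α = 0 with an LRT statistic of 0, which is the kind of behavior a single χ² limit cannot describe.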
Cramér's (1946) result on the asymptotic normality of the MLE also fails to apply to mixture problems.

After imposing a boundedness assumption on the mean parameter, Ghosh and Sen (1985) gave the first asymptotic distribution of the LRT. However, they need an additional separation condition, i.e., |θ1 − θ2| > ε for some given ε > 0. Chen and Chen (1998c) developed a sandwich method that removes the separation condition. They showed that in normal mixture problems, the asymptotic null distribution of the LRT statistic is the maximum of a χ²-variable and the supremum of the square of a truncated Gaussian process with mean 0 and variance 1. Their result also showed that the mixing distribution with a structure parameter has a convergence rate of n^(-1/8), while the rate for the mixing distribution without a structure parameter is n^(-1/4).

Because the truncated Gaussian process is complicated, it is appealing to develop simpler methods. McLachlan (1987) and Schork (1992) used a simulation-based test, which is also not very satisfying. Yang (1993) proposed the idea of artificial parameters. After introducing two artificial parameters, he analyzed a simple linear regression model instead of the P-P plot. In this way, a goodness-of-fit test is parameterized.

Based on Yang's idea, we discard the classic LRT method and propose three estimators α̂ of the mixing proportion α. Our large-sample study shows that the estimators are all consistent and asymptotically normally distributed, with convergence rate n^(-1/2). We then propose a chi-square test of homogeneity. Our results can also be applied to a variety of other interesting statistical problems, such as testing for a change point in change point problems.
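To make the regression view of the P-P plot concrete, here is a minimal sketch of one way it could be set up. It is an assumption for illustration, not necessarily any of the three estimators proposed in the thesis or Yang's exact construction. Since the true distribution function is αF1(x) + (1 − α)F2(x), the empirical distribution function Fn satisfies Fn(x) − F2(x) ≈ α(F1(x) − F2(x)), so α can be estimated by least squares through the origin. The components N(2, 1) and N(0, 1) and the names `sample_mixture` and `pp_regression_alpha` are hypothetical.

```python
import random
from statistics import NormalDist

# Hypothetical known components, chosen only for illustration.
F1 = NormalDist(mu=2.0, sigma=1.0)
F2 = NormalDist(mu=0.0, sigma=1.0)


def sample_mixture(n, alpha, seed=0):
    """Draw n points from alpha*F1 + (1 - alpha)*F2."""
    rng = random.Random(seed)
    return [rng.gauss(F1.mean, F1.stdev) if rng.random() < alpha
            else rng.gauss(F2.mean, F2.stdev)
            for _ in range(n)]


def pp_regression_alpha(xs):
    """Least-squares slope of Fn(x) - F2(x) on F1(x) - F2(x), no intercept."""
    n = len(xs)
    num = 0.0
    den = 0.0
    for i, x in enumerate(sorted(xs), start=1):
        fn = (i - 0.5) / n               # empirical CDF at the i-th order statistic
        d = F1.cdf(x) - F2.cdf(x)        # regressor
        num += d * (fn - F2.cdf(x))      # regressor times response
        den += d * d
    return num / den
```

For example, `pp_regression_alpha(sample_mixture(5000, 0.3))` should land near 0.3. The estimate is not constrained to [0, 1], so under α = 0 it can fall slightly below zero; that avoids the boundary issue at the cost of occasionally producing values outside the parameter space.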
Keywords/Search Tags:Asymptotic distribution, Artificial parameters, Mixture problem