Estimation And Testing Research On Finite Mixture Distribution Models And Linear Models

Posted on: 2009-12-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Luo
Full Text: PDF
GTID: 1100360245473508
Subject: Probability theory and mathematical statistics
Abstract/Summary:
The task of mathematical statistics is, in the final analysis, to make statistical decisions about questions of interest from random samples and prior knowledge, using appropriate statistical methods. In traditional parametric statistics, one usually first assumes that the data come from a population with some distribution and then carries out the statistical study on that basis. Over the past two decades, with the dramatic development of science and technology, the world has become increasingly complex, and so have the sample data that statisticians face. Because of this complexity, a single parametric distribution family often cannot describe the observed data accurately. This motivates modeling a wide range of random phenomena with finite mixture distribution models. Theory shows that any continuous distribution can be approximated by a finite mixture of Gaussian distributions with the same covariance matrix, which provides a theoretical basis for the validity of finite mixture distribution models. As long as the components are selected properly, these models can describe very complex distributions well. In particular, when there are local changes in the data and a single parametric distribution family cannot describe the observations accurately, finite mixture distribution models usually perform very well. Practice has also confirmed their applicability. As a result, finite mixture distribution models have developed rapidly and in depth over the past twenty years. They are applied in many fields, especially biology, genetic engineering, psychology, information science, finance and insurance, and they play an increasingly important role in data analysis.

Besides finite mixture models, this thesis also studies linear models. The linear model is an important branch of statistics with a long history, a rich theory and widespread applications. In recent decades it has been developed and strengthened not only in theory but also through wide use in economics, finance, health, educational psychology and other areas.

This thesis discusses statistical decision problems in finite mixture distribution models and linear models, as follows.

Firstly, for hypothesis testing, we mainly discuss the general homogeneity testing problem; that is, we test a single normal distribution against a mixture of two normal distributions. This testing problem has attracted increasing attention over the past twenty years. Building on Chen and Chen (2003), we study the problem without the equality condition on the two structural parameters imposed in Chen and Chen (2003). The main theoretical challenge is to obtain identifiability of the parameters under the null model. To overcome this difficulty, we first introduce a bivariate distribution function G, which generates a measure μ on two-dimensional Euclidean space. The general normal mixture distribution can then be expressed as a two-dimensional Lebesgue-Stieltjes integral with respect to G, and the new parameter is identifiable. Second, some large-sample properties of the MLEs are investigated. Finally, the asymptotic distribution of the likelihood ratio test (LRT) is considered. It is shown that the asymptotic null distribution of the LRT statistic is the maximum of a χ² variable with two degrees of freedom and the supremum of the square of a truncated Gaussian process with mean 0 and variance 1.

Secondly, for parameter estimation, we take the mixture transition distribution (MTD) models as the basic models. In MTD models, the contributions of the different lags to the present are considered separately and combined additively. A difficulty with high-order Markov chains is that the model has too many parameters to estimate, which makes statistical analysis difficult. For this reason, Raftery (1985) introduced the MTD models for time-homogeneous high-order Markov chains, approximating such chains with far fewer parameters than the fully parameterized model. Owing to their additive structure, MTD models are simple, analytically tractable, and easy to simulate and estimate. MTD models are also finite mixture distribution models, so they can capture a wide range of nonstandard behaviors, such as non-Gaussian and nonlinear features, in a single unified model class. The class of MTD models has therefore been greatly generalized and successfully applied in many areas since its introduction by Raftery (1985). For these reasons, we study parameter estimation in MTD models.

1. In the third chapter, we study parameter estimation in MTD models based on normal distributions and a generalized extreme-value distribution. The outbreak of a series of crises in the financial and IT industries has radically changed the view that extreme events have negligible probability. We therefore construct a mixture of normal distributions and a generalized extreme-value distribution to analyze the tail behavior of the marginal distributions. In addition, necessary and sufficient conditions for stationarity are derived. Moreover, in the second-order stationary case, we investigate the relation of the first-order autocorrelation function and first-order autocorrelation function. Finally, the MLEs of the parameters are calculated by the EM algorithm, and the estimating equations are given.

2. In the fourth chapter, we study parameter estimation in MTD models based on Weibull distributions. First, in view of the wide use of the Weibull distribution, we propose the Weibull MTD model, which improves on the Gaussian MTD model in some heavy-tailed cases. Second, some stationarity properties of the model are discussed. We then apply the standard EM algorithm for mixture models to obtain the estimating equations, and use the bootstrap method to calculate confidence regions for the parameters. Finally, simulations and an example are analyzed in detail. They show that for nonstandard data, such as financial and insurance data, the Weibull MTD model is more suitable for estimation than the Gaussian MTD model.

Thirdly, whereas the third and fourth chapters calculate the MLEs of the parameters by the EM algorithm, the fifth chapter investigates an accelerated EM algorithm. First, we propose an acceleration of the Monte Carlo EM algorithm, based on the Monte Carlo EM algorithm and the Newton-Raphson algorithm, to improve the convergence rate. Second, it is shown that the proposed accelerated EM algorithm has a quadratic convergence rate in a neighborhood of the posterior mode. Finally, its performance in terms of convergence rate is illustrated by a classical example.

Lastly, in the sixth chapter we discuss the robustness of the linear model. In the general linear model, common estimates include the Generalized Least Squares Estimate and the Gauss-Markov Estimate of estimable functions of the parameters, and the Minimum Norm Quadratic Unbiased Estimator of the variance. The thesis investigates their robustness with respect to the error distribution. For each estimator, we obtain the maximal class of error distributions within which it retains the optimal statistical properties it has under a given error distribution.

In summary, this thesis discusses statistical decision problems in finite mixture distribution models and linear models comprehensively and systematically, including the general homogeneity testing problem for the normal mixture model, the construction of mixture transition distribution (MTD) models, the calculation of MLEs and confidence regions, the improvement of the estimation algorithm, and the robustness of parameter estimation in the linear model. For the general homogeneity test, we solve the problem of testing homogeneity in the general normal mixture and extend the results of Chen and Chen (2003). In constructing finite mixture distribution models, we propose two MTD models motivated by practical needs: one based on normal distributions and a generalized extreme-value distribution, and the other based on Weibull distributions. For the estimation algorithm, we improve on the results of Louis (1982) by proposing a new accelerated EM algorithm, which carries out the E step by Monte Carlo simulation and has a quadratic convergence rate in a neighborhood of the posterior mode. For the robustness of parameter estimation in the linear model, we obtain the maximal classes of error distributions within which the Generalized Least Squares Estimate and Gauss-Markov Estimate of estimable functions of the parameters, and the Minimum Norm Quadratic Unbiased Estimator of the variance, retain their optimal statistical properties. These results are useful in theory and significant in practice.
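
As an illustration of the additive lag structure and the parameter saving attributed to MTD models above, the following is a minimal sketch for a discrete-state chain in the spirit of Raftery (1985). It is not the thesis's model, which uses continuous component distributions; the names `Q`, `lam`, `mtd_transition` and `parameter_counts`, and all numerical values, are hypothetical choices made for this example.

```python
from itertools import product


def mtd_transition(Q, lam, history):
    """Return P(X_t = j | past) for every state j under a discrete MTD model.

    history[g] is the state observed at lag g + 1. Each lag contributes the
    row of the shared transition matrix Q indexed by that state, and the
    contributions are combined additively with the lag weights lam.
    """
    m = len(Q)
    return [sum(w * Q[i][j] for w, i in zip(lam, history)) for j in range(m)]


def parameter_counts(m, order):
    """Free parameters of a full order-`order` chain versus an MTD model."""
    full = m ** order * (m - 1)       # one free row per possible history
    mtd = m * (m - 1) + (order - 1)   # one shared matrix plus lag weights
    return full, mtd


# Hypothetical 3-state transition matrix (each row sums to 1).
Q = [[0.7, 0.2, 0.1],
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
# Hypothetical lag weights for a third-order chain (non-negative, sum to 1).
lam = [0.6, 0.3, 0.1]

# Conditional distribution of X_t for every possible length-3 history.
rows = {h: mtd_transition(Q, lam, h) for h in product(range(3), repeat=3)}
full, mtd = parameter_counts(3, 3)
print(rows[(0, 1, 2)])
print(full, mtd)  # 54 8
```

With three states and three lags, the fully parameterized chain needs 54 free parameters while this MTD sketch needs 8, which is the kind of reduction the abstract refers to.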
Keywords/Search Tags:mixing distribution model, Markov chain, EM algorithm, homogeneous test, Gaussian process, robustness