Statistical Diagnosis Of Generalized Linear Reproductive Dispersion Model Based On Pena Distance

Posted on:2021-05-17

Degree:Master

Type:Thesis

Country:China

Candidate:H C Lu


GTID:2480306095991959

Subject:Probability theory and mathematical statistics

Abstract/Summary:

The exponential family is an important family of distributions in statistics. In practice, however, many data sets cannot be fitted adequately by exponential family models. To meet the need for analyzing more complex data, statisticians have proposed a broader class of distributions than the exponential family, called the reproductive dispersion model. Among the observations in a data set, some have little effect on statistical inference, and removing them does not change the diagnostic results. Others have a much greater effect on the inferred results than the rest, and some deviate markedly in their characteristics from the other points in the data set. Such observations are usually called outliers or influential points. Because such points exist, they need to be detected and corrected, so diagnosing them is very important. This thesis focuses on two topics. First, the Pena distance is used to study statistical diagnostics under the reproductive dispersion model: the expression of the Pena distance under this model is derived and its properties are discussed, which yields a method for identifying high-leverage outliers. In addition, a comparison of the Pena distance with the Cook distance shows that the Pena distance performs better than the Cook distance under certain conditions. The model and method are illustrated by simulation studies and a real-data example. Second, for heterogeneous population data, mixtures of regression models are an important tool in statistical data analysis. For mixture data from reproductive dispersion distributions, a mixture of generalized linear reproductive dispersion models is proposed. The EM algorithm is used to obtain maximum likelihood estimates of the model parameters. The Pena distance and the Cook distance are then used to study the statistical diagnostics problem and are compared with each other. Finally, the data of the mixture population and of the mixture subclusters are compared through simulation studies and a real-data example, which further shows that the theory and method are reasonable and effective.
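
For orientation, the two influence measures compared in the abstract can be sketched in the much simpler setting of ordinary simple linear regression. This is not the reproductive dispersion expression derived in the thesis; it assumes the usual least-squares forms, with hat-matrix entries h_ij, residuals e_j, p = 2 coefficients and residual variance s^2: Cook's distance D_i = e_i^2 h_ii / (p s^2 (1 - h_ii)^2) and Pena's distance S_i = sum_j h_ij^2 e_j^2 / ((1 - h_jj)^2 p s^2 h_ii). The function name `influence_measures` and the toy data are hypothetical.

```python
def influence_measures(x, y):
    """Cook's and Pena's distances for the least-squares fit y = a + b*x.

    Illustrative sketch for ordinary simple linear regression only.
    """
    n, p = len(x), 2  # p = number of regression coefficients
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    # Least-squares slope, intercept and residuals.
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    e = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(ei ** 2 for ei in e) / (n - p)  # residual variance estimate

    def h(i, j):
        # Closed-form hat-matrix entry for a model with intercept and one covariate.
        return 1.0 / n + (x[i] - x_bar) * (x[j] - x_bar) / sxx

    # Cook's distance: how much all fitted values move when point i is deleted.
    cook = [
        e[i] ** 2 * h(i, i) / (p * s2 * (1.0 - h(i, i)) ** 2)
        for i in range(n)
    ]

    # Pena's distance: how much the fitted value at point i moves when each
    # point j is deleted in turn, scaled by the estimated variance of that fit.
    pena = [
        sum(h(i, j) ** 2 * e[j] ** 2 / (1.0 - h(j, j)) ** 2 for j in range(n))
        / (p * s2 * h(i, i))
        for i in range(n)
    ]
    return cook, pena


# Hypothetical toy data: the last point is far from the others in x
# and does not follow their trend.
x = [1, 2, 3, 4, 5, 6, 7, 8, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 7.9, 5.0]
cook, pena = influence_measures(x, y)
for i, (d, s) in enumerate(zip(cook, pena)):
    print(i, round(d, 3), round(s, 3))
```

Cook's distance asks how deleting one point changes every fitted value, while Pena's distance asks how the fitted value at one point changes as each of the other points is deleted, which is why the two can rank observations differently.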

Keywords/Search Tags:

Reproductive Dispersion Model, Mixture of Generalized Linear Reproductive Dispersion Model, Pena distance, EM algorithm, Statistical Diagnostics

Related items

1	Statistical Inference For Reproductive Dispersion Models With Mixture Data
2	Statistical Analysis Of Semiparametric Nonlinear Reproductive Dispersion Model
3	Statistical Inference For Joint Mean And Dispersion Models With Mixture Data
4	Statistical Diagnostics For Skew-normal Data Models Based On The Pena Distance
5	Some Studies On The Semiparametric Generalized Linear Model
6	Statistical Inference Of High-dimensional Nonlinear Reproductive Dispersion Random Effect Model With Missing Data
7	Local Influence In Semi-parametric Nonlinear Reproductive Dispersion Models
8	Statistical Analysis Of Nonlinear Reproductive Dispersion Mixed Models
9	Randomly Censored Partially Linear Model Statistical Diagnostics
10	Statistical Analysis Of Dispersion Family Nonlinear Model