
Semiparametric Regression Analysis of a Class of Constrained Zero-Inflated Count Models

Posted on: 2020-06-30 | Degree: Master | Type: Thesis
Country: China | Candidate: L L Xia | Full Text: PDF
GTID: 2370330575952120 | Subject: Statistics Mathematical Statistics
Abstract/Summary:
Zero-inflated data are often encountered in scientific research. A common feature of such data is that the variance of the observations exceeds the mean, a problem usually called overdispersion. Research on such data is already extensive: many scholars have studied the zero-inflated Poisson regression model, the zero-inflated negative binomial regression model, and the nonparametric zero-inflated Poisson regression model. These studies, however, address only ordinary zero-inflated count data. With more complicated data, we sometimes do not know whether a given covariate affects the zero-inflation part, the mean of the distribution, or both. The current remedy is to combine practical knowledge with simulation and check the cases one at a time, which is cumbersome and inaccurate. Studying the effect of a covariate on the zero-inflation rate alone, or on the distribution mean alone, is not difficult. However, when a covariate affects both parts and the two effects are correlated, an ordinary zero-inflated count model is impractical and leads to large errors.
To address this problem, this thesis studies the case in which a covariate affects both parts, assuming a linear relationship between the non-zero-inflation rate and the mean of the count distribution. For example, the price of a product affects its average daily sales volume and also the probability of a purchase. When the two are known to be linearly related, the constrained zero-inflated count model can be used for the analysis, the constraint being the linear relationship between the non-zero-inflation rate and the distribution mean. Although the constrained model introduces new parameters, it is more concise than the unconstrained model and has fewer parameters overall. At present, there is no systematic research on this type of data in China, so this thesis mainly extends the analysis of zero-inflated count data with constraints.
For the link function part of the model, some scholars have carried out nonparametric regression analysis. Although nonparametric regression avoids the subjective choice of functional form that parametric regression requires, most existing estimation methods for it are local, so enough data points are needed to obtain accurate estimates. Meeting that requirement runs into the "curse of dimensionality", and the compromise is a semiparametric regression model. In this thesis, a partially linear additive model is used in the link function to carry out regression analysis of the constrained zero-inflated count model. The nonparametric part is estimated by smoothing splines. The basic idea of B-spline-based smoothing spline regression is to add a penalty built from the second-order differences of the spline basis coefficients to the objective function, mainly to prevent the curve from overfitting the data points.
In addition, estimating the parameters directly from the log-likelihood function proves quite difficult, because the calculus involved is complicated and an analytical solution is hard to find. The zeros in zero-inflated count data can, however, be treated as a missing-data problem: we do not know whether an observed zero comes from the count distribution or is a structural zero. We therefore introduce the EM algorithm and combine the nonlinear estimator with penalized likelihood estimation to obtain a penalized log-likelihood function based on the complete data. The parametric and nonparametric parts of the model are then estimated by the EM algorithm and the Newton-Raphson algorithm, and the relevant large-sample properties are proved.
So that the study better reflects the characteristics of the data, this thesis also analyzes the constrained zero-inflated count model under different discrete distributions (Poisson, generalized Poisson, and binomial), so that a model can be selected according to the characteristics of the data. The effectiveness of the method is verified by Monte Carlo simulation and a case study.
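The missing-data view of the zeros that motivates the EM step can be illustrated with a short Python sketch. It is only an illustration under strong simplifying assumptions: it fits a plain zero-inflated Poisson model with no covariates, no constraint between the two parts, and no spline penalty, so it is not the estimator studied in the thesis. The function names `sample_poisson`, `simulate_zip`, and `fit_zip_em`, and the symbols `pi` (structural-zero probability) and `lam` (Poisson mean), are hypothetical names chosen for this example.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_zip(n, pi, lam, seed=0):
    """Simulate n zero-inflated Poisson counts.

    With probability pi an observation is a structural zero;
    otherwise it is drawn from Poisson(lam) (and may still be zero).
    """
    rng = random.Random(seed)
    return [0 if rng.random() < pi else sample_poisson(lam, rng)
            for _ in range(n)]


def fit_zip_em(y, n_iter=500, tol=1e-10):
    """Estimate (pi, lam) of an intercept-only ZIP model by EM.

    Assumption: no covariates, no constraint, no penalty term.
    """
    n = len(y)
    total = sum(y)
    n_zero = sum(1 for v in y if v == 0)
    pi, lam = 0.5, max(total / n, 1e-8)  # crude starting values
    for _ in range(n_iter):
        # E-step: posterior probability that an observed zero is structural.
        w = pi / (pi + (1.0 - pi) * math.exp(-lam))
        z_sum = n_zero * w  # expected number of structural zeros
        # M-step: closed-form updates from the complete-data likelihood.
        new_pi = z_sum / n
        new_lam = total / (n - z_sum)
        converged = abs(new_pi - pi) + abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam


if __name__ == "__main__":
    y = simulate_zip(5000, pi=0.3, lam=2.0, seed=1)
    mean = sum(y) / len(y)
    var = sum((v - mean) ** 2 for v in y) / (len(y) - 1)
    print("sample mean:", mean, "sample variance:", var)  # variance exceeds mean
    print("EM estimates (pi, lam):", fit_zip_em(y))
```

The E-step assigns each observed zero a probability of being structural, and the M-step then has closed-form updates. In a model with covariates, a constraint, and a spline penalty, the M-step would generally have no closed form, which is where an iterative solver such as Newton-Raphson would come in.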
Keywords/Search Tags: Constrained zero-inflated count model, Partially linear additive model, Semiparametric regression, Large-sample properties, EM algorithm, Monte Carlo simulation