
Research On Sample Quantile Of Dependent Sequences And Related Issues In Regression Models

Posted on: 2022-01-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Peng
GTID: 1520306836485824
Subject: Statistics
Abstract/Summary:
The rapid development of information technology has revolutionized the way data is collected, which has led to the emergence of vast amounts of high-dimensional data in many fields such as finance, economics, medicine and the life sciences. Time-dependent data is prevalent, especially in finance and economics, and numerous empirical studies have shown that the distribution of financial time series tends to be heavy-tailed. The quantile, as an important numerical feature describing the location of a distribution, plays an important role in statistical inference, especially when the distribution is heavy-tailed or has infinite mean. In recent years, the growing use of quantile-based methods in financial time series research has created a need for a deeper understanding of the statistical properties of the sample quantile for time series. When the observations come from a time series, their dependence structure is of primary concern. Markov chain models can describe the dynamics of different system states and are therefore widely used in the study of time-dependent sequences. In addition, weakly dependent mixing sequences are often used in probability theory to capture the dependence between random variables, and such sequences cover a large class of time series models. In view of this, the first aim of this thesis is to study the convergence rate of the high-dimensional sample quantiles for Markov sequences as well as for φ-mixing sequences.

On the other hand, the Lasso for high-dimensional time series regression has attracted the attention of many researchers over the last two decades. However, in the regression of economic and financial data, the error terms usually do not satisfy the independent and identically distributed (i.i.d.) assumption. Therefore, the second aim of this thesis is to extend the analysis of the classical Lasso method with i.i.d. errors to high-dimensional linear regression models with φ-mixing errors, and to give the convergence rate of the Lasso estimator when the error sequence follows a Gaussian distribution and a sub-exponential distribution, respectively.

In addition to continuous-valued data, count data is widely available in fields such as finance, the life sciences, clinical trials, criminology and signal processing. With advances in science and experimental techniques, high-dimensional count data is also collected in large quantities in these fields. The Poisson regression model is widely used in the regression analysis of count data. However, in many practical applications of high-dimensional count data, the covariates may naturally have a group structure. In view of this, the third aim of this thesis is to investigate the parameter estimator of the high-dimensional Poisson regression model with group sparsity in the covariates, together with its theoretical properties.

In summary, the thesis is organized around three sub-topics: the convergence rate of sample quantiles for time-dependent observations, the Lasso method for linear regression models with φ-mixing errors, and sparse high-dimensional Poisson regression. The main results are as follows.

Firstly, this thesis considers the convergence rate of high-dimensional sample quantiles when the observation sequence is a homogeneous Markov sequence and a φ-mixing sequence, respectively. When the observation sequence is a homogeneous Markov sequence, the thesis first gives the accompanying distribution associated with its one-step transition probability, and then defines the quantile and the sample quantile of the homogeneous Markov chain based on this accompanying distribution. Under certain assumptions, using the concentration property of the sample quantile and the properties of Markov chains, the thesis gives concentration inequalities for the high-dimensional sample quantile of a Markov sequence. When the observation sequence is a φ-mixing sequence, the thesis gives the convergence rate of its high-dimensional sample quantile using the concentration property of the sample quantile and the properties of φ-mixing sequences. In addition to the case where the random variables follow a continuous distribution, the thesis also gives the convergence rates of the high-dimensional sample quantiles when these two kinds of time-dependent observations are discrete random variables. Finally, the thesis shows, both theoretically and numerically, that the convergence rates obtained are faster than those obtained by applying other Hoeffding-type inequalities. An empirical application of the sample quantile to stock market risk management is also given.

Secondly, this thesis establishes the consistency of the Lasso estimator for the sparse linear regression model with an exponential φ-mixing error sequence under fixed design. When the error sequence is an exponential φ-mixing sequence and the error distribution is Gaussian or sub-exponential, respectively, the thesis gives non-asymptotic concentration inequalities for the estimation and prediction errors of the Lasso estimator, which also prove its consistency. In addition, two separate numerical simulations are given. The first shows that the mean squared error (MSE) of prediction decreases as the sample size increases, and that the prediction MSE of the strong signal model is smaller than that of the weak signal model. The second shows that, as the sample size increases, the probability that the Lasso method correctly selects the set of non-zero components of the parameter vector increases. Finally, the empirical analysis shows that the index tracking portfolio constructed from Lasso variable selection can closely track the CSI 300 index.

Finally, this thesis considers the weighted Group Lasso method under random design for estimating parameter vectors with group sparsity in a high-dimensional Poisson regression model, where the number of groups Gn can increase with the sample size n, even as a function of n. Under certain assumptions, the thesis gives non-asymptotic oracle inequalities for the estimation and prediction errors of the weighted Group Lasso estimator, respectively. The upper bounds on the estimation and prediction errors are determined by the regularization coefficient and the sparsity level, with convergence rates of order O(η*(?)) and O(η*log(Gn)/n). In addition, based on concentration inequalities for the weighted sum of independent Poisson random variables, the thesis gives data-driven weights that allow the KKT conditions for the weighted Group Lasso optimization problem to hold with high probability at the true parameter β*. Finally, the thesis compares, in numerical simulations, weighted Group Lasso estimation with the proposed weights against estimation with other weights, and applies these methods to the analysis of an auto insurance claims dataset.
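To make the first sub-topic concrete, the following is a minimal Python sketch of how the error of a sample quantile behaves for a dependent sequence. It is an illustration only and not the thesis's construction: the Gaussian AR(1) process (one example of a homogeneous Markov sequence), the use of the stationary N(0, 1) quantile as the target, and the names `simulate_ar1`, `sample_quantile` and `mean_abs_error` are all assumptions introduced here. The thesis itself defines the quantile through an accompanying distribution of the one-step transition probability and treats the high-dimensional case, neither of which is reproduced.

```python
import math
import random
from statistics import NormalDist


def simulate_ar1(n, rho, rng):
    """Simulate n steps of a stationary Gaussian AR(1) chain.

    X_t = rho * X_{t-1} + sqrt(1 - rho^2) * eps_t, with eps_t ~ N(0, 1),
    so the stationary marginal distribution is N(0, 1).
    (Hypothetical example process, chosen for illustration.)
    """
    scale = math.sqrt(1.0 - rho * rho)
    x = rng.gauss(0.0, 1.0)  # start from the stationary distribution
    path = []
    for _ in range(n):
        path.append(x)
        x = rho * x + scale * rng.gauss(0.0, 1.0)
    return path


def sample_quantile(xs, p):
    """Empirical p-quantile: the k-th order statistic with k = ceil(n * p)."""
    ordered = sorted(xs)
    k = max(1, math.ceil(len(ordered) * p))
    return ordered[k - 1]


def mean_abs_error(n, p=0.95, rho=0.5, reps=50, seed=0):
    """Average |sample quantile - stationary quantile| over reps simulated paths."""
    rng = random.Random(seed)
    target = NormalDist().inv_cdf(p)  # p-quantile of the N(0, 1) marginal
    errors = [
        abs(sample_quantile(simulate_ar1(n, rho, rng), p) - target)
        for _ in range(reps)
    ]
    return sum(errors) / reps


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(n, round(mean_abs_error(n), 4))
```

Running the script prints the average absolute error for increasing sample sizes. Under these assumptions the error should shrink as n grows, which is the qualitative behaviour that a convergence-rate result quantifies.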
Keywords/Search Tags: sample quantile, convergence rate, Markov sequence, φ-mixing sequence, probability inequality, Lasso method