| With the continuous development of artificial intelligence, big data, and cloud computing, applications, smart sensors, and IoT devices are generating data all the time. As the scale of data has grown dramatically, data processing faces a huge challenge: traditional methods are no longer applicable to massive data, and the way data is processed has undergone a revolutionary change. Distributed statistical computing is an effective approach that is better suited to processing massive data. This thesis combines the distributed approach with composite quantile regression and studies distributed composite quantile regression estimation for the high-dimensional linear regression model. Composite quantile regression is a robust and efficient estimation method, but it is hard to solve because its loss function is nonsmooth. Computationally, the proposed method greatly improves efficiency: because a new connection between the composite quantile regression loss function and the squared loss function is established, each round only requires the master machine to solve a shifted L1-regularized least squares estimation problem. Furthermore, the thesis uses the communication-efficient surrogate likelihood procedure, which reduces the cost of communication by transmitting the information of the master machine. Theoretically, the proposed estimator achieves a near-oracle convergence rate without any restriction on the number of machines. The thesis uses Monte Carlo simulation and Beijing air quality data to evaluate the proposed method, comparing fitting accuracy, feature selection ability, and running time under different numbers of machines. The results show that the proposed method significantly improves fitting accuracy compared with other methods while keeping the running time low. When the data do not follow a normal distribution, the proposed method still guarantees high fitting accuracy and feature selection ability. |
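
To make the nonsmoothness mentioned in the abstract concrete, the following is a minimal sketch of a composite quantile regression objective. It assumes the standard check-loss formulation with equally spaced quantile levels tau_k = k / (K + 1), a shared slope vector, and one intercept per quantile level. The function names `check_loss` and `cqr_loss` are hypothetical and are not taken from the thesis; the sketch does not include the L1 penalty or the distributed procedure.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0}).

    The function is piecewise linear with a kink at u = 0, which is
    why the composite objective below is nonsmooth.
    """
    return u * (tau - (1.0 if u < 0 else 0.0))


def cqr_loss(X, y, beta, intercepts):
    """Average composite quantile regression loss (illustrative only).

    X          : list of rows, each a list of p covariate values
    y          : list of n responses
    beta       : list of p slope coefficients shared by all quantiles
    intercepts : list of K intercepts, one per quantile level
    Assumption: quantile levels are tau_k = k / (K + 1), k = 1..K.
    """
    K = len(intercepts)
    taus = [(k + 1) / (K + 1) for k in range(K)]
    total = 0.0
    for x_i, y_i in zip(X, y):
        # Linear predictor without intercept, shared across quantiles.
        fitted = sum(b * x for b, x in zip(beta, x_i))
        for tau, b_k in zip(taus, intercepts):
            total += check_loss(y_i - fitted - b_k, tau)
    return total / len(y)


if __name__ == "__main__":
    X = [[1.0], [2.0]]
    y = [1.0, 2.0]
    # A perfect fit has zero loss; a zero slope leaves positive residuals.
    print(cqr_loss(X, y, beta=[1.0], intercepts=[0.0]))
    print(cqr_loss(X, y, beta=[0.0], intercepts=[0.0]))
```

Because every quantile level shares the same slopes, the objective pools information across quantiles, but each term keeps the kink of the check loss at zero residual, so gradient-based solvers for smooth losses do not apply directly.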