
The Implementation And Improvement And Application Of Some Regression Algorithms

Posted on: 2006-10-27    Degree: Master    Type: Thesis
Country: China    Candidate: S X Ruan    Full Text: PDF
GTID: 2120360155953116    Subject: Computational Mathematics
Abstract/Summary:
Multiple linear regression is one of the most popular statistical methods for dealing with correlation. Let y be the dependent variable and x1, x2, ..., xm the independent variables; the multiple linear regression model is

    y = β0 + β1x1 + β2x2 + ... + βmxm + e,

where β0, β1, ..., βm are the coefficients and e is the model error. To estimate the coefficients we have n groups of data (yi, xi1, xi2, ..., xim), i = 1, 2, ..., n. Substituting these data into the model gives the equations

    yi = β0 + β1xi1 + ... + βmxim + ei,    i = 1, 2, ..., n.

The most common method of estimating the coefficients is least squares: find β0, β1, ..., βm so that the error

    S(β) = Σ_{i=1}^{n} (yi − β0 − β1xi1 − ... − βmxim)² = ||Y − Xβ||²    (3)

reaches its minimum. Let b be the estimate of β; then b is the solution of the normal equations

    X'Xb = X'Y.

If the xij are chosen so that X'X is a nonsingular matrix, then b = (X'X)⁻¹X'Y.

In many situations, however, the independent variables are collinear, X'X is nearly singular, and the estimate of β then carries a large error. Because the 2-norm is invariant under orthogonal transformations, problem (3) is equivalent to the problem

    min_b ||Q'Y − Q'Xb||²    (4)

for any orthogonal matrix Q. Consider the QR-decomposition of X: find an orthogonal matrix Q such that

    Q'X = ( R )
          ( 0 ),

where R is an (m+1)×(m+1) upper triangular matrix. Partitioning Q'Y conformally,

    Q'Y = ( c ),    c ∈ R^{m+1}, d ∈ R^{n−m−1},
          ( d )

problem (4) is equivalent to the problem

    min_b ||c − Rb||² + ||d||².

Obviously the minimum d'd is reached exactly when the estimate b of β is the solution of the equations Rb = c.

In a multiple linear regression model not all independent variables have a significant linear relation with the dependent variable. How to choose the significant variables is a further problem, and for this we need stepwise regression. At each step of adding or deleting a variable a system of equations must be solved; to reduce the work, the result of the previous step should be reused. We state the course of stepwise regression based on the QR-decomposition. At some step, suppose that the variables x1, x2, ..., xk have been added.
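The QR-based least-squares solution described above can be sketched in a few lines of NumPy. This is an illustrative example, not code from the thesis; the data and coefficient values are invented for demonstration:

```python
import numpy as np

# Illustrative data (not from the thesis): design matrix with an
# intercept column, true coefficients, and a small noise term e.
rng = np.random.default_rng(0)
n, m = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, m))])
beta_true = np.array([1.0, 2.0, -0.5, 0.3])
y = X @ beta_true + 0.01 * rng.normal(size=n)

# Reduced QR-decomposition: X = QR with R upper triangular, (m+1)x(m+1).
Q, R = np.linalg.qr(X)
c = Q.T @ y                 # c = Q'Y (the first m+1 components)
b = np.linalg.solve(R, c)   # solve Rb = c

# b agrees with the normal-equations solution (X'X)^{-1} X'Y
b_normal = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(b, b_normal))  # True
```

Solving Rb = c avoids forming X'X, whose condition number is the square of that of X; this is why the QR route behaves better when the independent variables are nearly collinear.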
We then have the QR-decomposition of the corresponding design matrix: an orthogonal matrix Q and an upper triangular factor. We also hold the transformed remaining variables xk+1, ..., xm and the transformed response y^(k).

First, the course of adding a variable. Consider xk+1 and let v be the vector of the last n−k components of the transformed xk+1. Let

    uk+1 = (v − ρk+1·e1) / ||v − ρk+1·e1||₂,    where ρk+1 = −sign(v1)·||v||₂,

and let Hk+1 = I − 2·uk+1·u'k+1 be the corresponding Householder matrix. Set y^(k+1) = Hk+1·y^(k) and denote by sk+1 the (k+1)-st element of y^(k+1). Similarly we obtain sj for j = k+2, ..., m. Compare the sj and suppose sk+1 is the largest; form the statistic

    Fk+1 = s²k+1 / ( (S_residual − s²k+1) / (n − k − 2) ).

If Fk+1 > F_add, where F_add is the critical value, then add the variable and update

    S_residual ← S_residual − s²k+1,

then go to the deleting step. If Fk+1 < F_add, no further variable can be added, so variable selection stops.

Next, the course of deleting variables. Consider xj (j = 1, 2, ..., k−1). Apply Givens rotations G_{i,i−1} (i = j+1, ..., k), each of the form

    ( 1          )
    (    c  s    )
    (   −s  c    )
    (          1 ),

to move xj to the last position among the selected variables. Denote by sj the k-th component of the resulting y^(j) (j = 1, ..., k−1). Compare the sj and suppose sl is the smallest; form

    F = s²l / ( S_residual / (n − k − 1) ).

If F < F_delete, where F_delete is the critical value, then delete xl, update

    S_residual ← S_residual + s²l,

and continue deleting until no more variables can be deleted. If F > F_delete, no variable can be deleted, so go to the adding step.

For the application of multiple linear regression we also introduce Partial Least Squares (PLS). It is likewise an algorithm that extracts components from data tables. Suppose there are two data tables X and Y; extract a component t from X and a component u from Y. The components t and u must satisfy two conditions:
1. t and u carry as much of the variance of their respective tables as possible;
2. the correlation between t and u reaches its maximum.
For more details about this algorithm, see [3].

We give a scheme for solving the CSI (Customer Satisfaction Index) problem by multiple linear regression and PLS (Partial Least Squares).
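The two conditions on t and u are what the classical NIPALS iteration for PLS enforces. A minimal sketch in NumPy, assuming centered data tables X and Y (the function name, the synthetic data, and the single-component restriction are all illustrative assumptions, not the thesis's implementation):

```python
import numpy as np

def pls_first_component(X, Y, n_iter=100, tol=1e-10):
    """Extract the first PLS component pair (t from X, u from Y)
    by the NIPALS iteration. X and Y are centered data tables."""
    u = Y[:, [0]]                      # start from one column of Y
    for _ in range(n_iter):
        w = X.T @ u                    # X-weights proportional to X'u
        w /= np.linalg.norm(w)
        t = X @ w                      # X-scores: t = Xw
        q = Y.T @ t                    # Y-weights proportional to Y't
        q /= np.linalg.norm(q)
        u_new = Y @ q                  # Y-scores: u = Yq
        if np.linalg.norm(u_new - u) < tol:
            u = u_new
            break
        u = u_new
    return t.ravel(), u.ravel()

# Illustrative data: two tables driven by one shared latent variable.
rng = np.random.default_rng(1)
latent = rng.normal(size=(100, 1))
X = latent @ rng.normal(size=(1, 4)) + 0.1 * rng.normal(size=(100, 4))
Y = latent @ rng.normal(size=(1, 2)) + 0.1 * rng.normal(size=(100, 2))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

t, u = pls_first_component(X, Y)
corr = np.corrcoef(t, u)[0, 1]
print(f"correlation between t and u: {corr:.3f}")
```

Because both tables here share one latent factor, the extracted scores t and u come out strongly correlated, which is exactly condition 2 above; a full PLS regression would deflate X and Y by this component and repeat.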
Now we give a brief introduction to CSI. Since the 1980s a new management theory has developed rapidly in the developed countries: it is oriented to the customers, tries to meet the demands and expectations of the customers, and pursues customer satisfaction and customer loyalty. In 1989, Professor Fornell of the Quality Research Center of the University of Michigan in the U.S. advanced the ACSI (American Customer Satisfaction Index) model. He formed a system of econometric equations with six factors: customer expectations, customers' perceived quality, customers' perceived value, customer satisfaction, customer loyalty, and customer voice; this is called the Fornell model or the ACSI model. Up to now, the Fornell model is the most widely used CSI model.
Keywords/Search Tags: Implementation