
The Numerical Methods For The Maximal Correlation Problem

Posted on: 2012-06-30  Degree: Master  Type: Thesis
Country: China  Candidate: X W Qin  Full Text: PDF
GTID: 2210330338464700  Subject: Computational Mathematics
Abstract/Summary:
Canonical correlation analysis (CCA) is a multivariate statistical method for studying the relationship between two sets of variables and revealing the linear relationship between them. Its purpose is to identify and quantify that relationship by reducing it to the relationship between two linear combinations, one from each set of variables. The correlation between these two composite variables is then used to reflect the overall correlation between the two sets. CCA has been widely used in many fields, such as biology, psychology, marketing, industrial production and information technology, so research on CCA is of considerable value. CCA was first proposed by Hotelling, whose basic idea can be described as follows. First, find a linear combination within each set of variables such that the correlation coefficient between the two linear combinations is as large as possible. Second, choose a second pair of linear combinations that is uncorrelated with the first pair and whose correlation coefficient is the largest among all such pairs, so that it is no larger than the first. Continue in this way until all the correlation between the two sets of variables has been extracted. The selected pairs of linear combinations are called canonical variables, and the correlation coefficient of each pair is called a canonical correlation coefficient.
In order to study the canonical correlation problem among several sets of variables, Van de Geer put forward the Maxbet method and applied it to detect the maximal correlation among the sets. The Maxbet method seeks the maximum of the function $f(u) = u^T X^T X u$ subject to $u_i^T u_i = 1$, $i = 1, 2, \ldots, m$, where $u_i$ denotes the $i$-th block of $u$; this is called the maximal correlation problem (MCP).
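Written out in block form, the MCP and its first-order conditions look as follows. The partition of $X$ into column blocks $X_1, \ldots, X_m$ (with $n_i$ columns each) and the names $A$, $A_{ij}$, $\lambda_i$ and $\Lambda$ are notation assumed here for illustration; the abstract itself does not fix them.

```latex
\begin{aligned}
&\text{MCP:} && \max_{u}\; f(u) = u^T A u = \sum_{i=1}^{m}\sum_{j=1}^{m} u_i^T A_{ij} u_j
  \quad \text{s.t. } u_i^T u_i = 1,\; i = 1,\ldots,m,
  \qquad A = X^T X,\; A_{ij} = X_i^T X_j, \\
&\text{Lagrangian:} && L(u,\lambda) = u^T A u - \sum_{i=1}^{m} \lambda_i \,(u_i^T u_i - 1), \\
&\text{Stationarity:} && \sum_{j=1}^{m} A_{ij} u_j = \lambda_i u_i, \quad i = 1,\ldots,m
  \quad\Longleftrightarrow\quad A u = \Lambda u,\;\;
  \Lambda = \operatorname{diag}(\lambda_1 I_{n_1},\ldots,\lambda_m I_{n_m}).
\end{aligned}
```

At any point satisfying the stationarity system, $f(u) = \sum_{i=1}^{m} \lambda_i$, so comparing two such points amounts to comparing the sums of their multipliers.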
Applying the theory of Lagrange multipliers to the MCP leads to the multivariate eigenvalue problem (MEP). In practical applications, the global maximum of the MCP, namely the maximal correlation solution, is what is needed. However, existing theory and numerical tests show that a computed solution of the MEP usually corresponds to a local maximum of the MCP rather than the desired global maximum, which inevitably restricts the practical use of the Maxbet method. To obtain the global maximum of the MCP more reliably, this thesis focuses on two algorithms and one starting point strategy.
On the algorithmic side, this thesis first completes the convergence theory of the P-SOR method. Since numerical tests show that the P-SOR algorithm is sensitive to the choice of the relaxation parameter ω, this thesis proposes the P-SSOR algorithm (Part 2, Algorithm 2-2), which is comparatively less sensitive to the choice of ω. A number of numerical tests have been carried out to verify the advantage of the P-SSOR algorithm. Although no algorithm is known that guarantees the global maximum of the MCP, a more effective method for obtaining it (Algorithm 2-3) is given, based on the properties of the global maximum and on existing results. In a large number of examples, applying Algorithm 2-3 to a computed solution of the MEP always yielded the global maximum of the MCP, which indicates that Algorithm 2-3 is indeed effective.
Existing results indicate that the choice of the starting point affects two things: whether the global maximum of the MCP can be obtained, and how fast the iteration converges.
If the starting point is chosen poorly, the iteration is likely to end at a local maximum and to converge very slowly as well. In the third part of this thesis, a new starting point strategy (Part 3, Strategy 3) based on the Maxvar solution is given. A large number of numerical tests indicate that, with a starting point chosen by Strategy 3, the global maximum is obtained with high probability and the number of iteration steps is reduced significantly.
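The dependence on the starting point described above can be illustrated with a small sketch. It does not implement the thesis's P-SOR, P-SSOR, Algorithm 2-3 or Strategy 3, whose details the abstract does not give. Instead it uses the classical Horst power-type iteration for the MEP, restarted from several random starting points. The function names, the block sizes and the random test matrix are all assumptions made for illustration.

```python
import numpy as np


def normalize_blocks(x, sizes):
    """Scale each block u_i of x to unit Euclidean norm."""
    out = np.array(x, dtype=float)  # copy, so the caller's array is untouched
    start = 0
    for n in sizes:
        out[start:start + n] /= np.linalg.norm(out[start:start + n])
        start += n
    return out


def objective(A, u):
    """MCP objective f(u) = u^T A u."""
    return float(u @ A @ u)


def horst_iteration(A, sizes, u0, tol=1e-10, max_iter=5000):
    """Horst (Jacobi-type power) iteration: u_i <- (A u)_i / ||(A u)_i||.

    This is a stand-in, not the thesis's P-SOR or P-SSOR method.
    Returns the final iterate and the objective value at every step.
    """
    u = normalize_blocks(u0, sizes)
    history = [objective(A, u)]
    for _ in range(max_iter):
        u_next = normalize_blocks(A @ u, sizes)
        history.append(objective(A, u_next))
        step = np.linalg.norm(u_next - u)
        u = u_next
        if step < tol:
            break
    return u, history


def multistart(A, sizes, n_starts=20, seed=0):
    """Run the iteration from random starts; keep the best final objective."""
    rng = np.random.default_rng(seed)
    values = []
    best_u = None
    for _ in range(n_starts):
        u, history = horst_iteration(A, sizes, rng.standard_normal(sum(sizes)))
        if best_u is None or history[-1] > max(values):
            best_u = u
        values.append(history[-1])
    return best_u, values


if __name__ == "__main__":
    sizes = [2, 3, 2]  # hypothetical block sizes n_1, ..., n_m
    X = np.random.default_rng(42).standard_normal((40, sum(sizes)))
    A = X.T @ X  # symmetric positive semidefinite by construction
    best_u, values = multistart(A, sizes)
    # A spread between these two numbers means some starts stalled
    # at a local maximum of the MCP.
    print(max(values), min(values))
```

Random restarts are only a baseline: they give no guarantee of reaching the global maximum, which is the gap that a targeted starting point strategy is meant to close.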
Keywords/Search Tags:Canonical correlation analysis, Maximal correlation problem, Multivariate eigenvalue problem, P-SOR method, The relaxation parameter, Starting point strategy