Font Size: a A A

Sparse Matrix Matrix-Vector Multiplication And Auto-Tuning

Posted on:2012-01-12Degree:MasterType:Thesis
Country:ChinaCandidate:L ZhuangFull Text:PDF
GTID:2248330395962357Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
In the field of numerical analysis, a sparse matrix is a matrix populated primarily with zeros. Sparse matrix multiplication is widely involved in scientific researches and large numbers of applications. Large scale sparse matrix plays an important role in the discrete solution of many physical processes, such as represents the interaction between elements in finite element analysis, represents graph in graph theory, solves the partial differential equation in fluid mechanics, and so on. At present, for the studies of sparse matrix has infiltrated into many areas, such as structural analysis, network theory, power distribution systems, chemical engineering, photogrammetry and other aspects of science learning and management studies. The study of sparse matrix vector multiplication helps to improve the research in there related fields and solve the problem of computing efficiency. Therefore, this study is meaningful.In this paper, series optimization methods of SpMV have been approached. And the paper presents a method SCS which is based on principal component analysis (PCA) and does an auto-tuning work with optimization techniques. Then the paper presents and develops a math library COSC which integration of existing optimization techniques and using SCS as its auto-tuning algorithm. On the basis of the above work, this paper puts forward innovative the quadtree storage which uses recursive and rearrangement technique. Thanks to the design of the quadtree, this storage ensures that it gets a higher cache hit rate in SpMV, so as to enhance operational overall performance. Further, this paper gives the analysis based on quadtree based sparse matrix vector multiplication performance and the principles of optimization.The main work of this paper can be summarized as follows:(1) With the study of the existing optimization techniques, this paper focuses on the aspect of computer architecture, summaries and presents four kinds of optimization strategy.(2) In order to get the corresponding combination strategy for different non-zeros distribution, this paper presents an auto-tuning method called Strategy for Combination Strategy (SCS) which bases on Principal Component Analysis (PCA) and goals to find out the approximate optimal solution of combination strategy with the linear time complexity. Experiments show the SCS method is valid. It can be settled in the SpMV library such as COSC to get the approximate optimal combination strategy. And the mathematic kernel library COSC can also be settled in SpMV and improve the efficiency of hot spot.(3) This paper proposes a quadtree storage format, derived from recursive divisions of matrix. With this data structure, the original sparse matrix multiplication can be transferred to computations over independent relatively small regions, which can well fit the cache capacity. Therefore, a significant improvement of performance could be achieved. Moreover, this region-based computation framework promises substantial parallelism in distributed computing environment. We elaborate our solution of quadtree based parallel matrix multiplication, including tuning a better region size, instruction level optimization techniques and computation distribution strategy. Extra cost introduced by this data model is discussed as well. The quadtree storage format achieves high performance after employing instruction level optimization techniques, and the experiments in Lenovo Deepcomp7000demonstrate that the sparse matrix-vector multiplication has an average of1.63speedup comparing with the Intel Cluster Math Kernel Library implementation.(4) With the summary, this paper shows the optimization direction of this paper and presents the future work.
Keywords/Search Tags:SpMV, Quadtree, Parallellism, PCA, Auto-tuning
PDF Full Text Request
Related items