Font Size: a A A

Automatic software performance optimization on modern architectures

Posted on:2008-07-15Degree:Ph.DType:Dissertation
University:University of Illinois at Urbana-ChampaignCandidate:Jiang, ChanghaoFull Text:PDF
GTID:1458390005481012Subject:Computer Science
Abstract/Summary:
This dissertation extends automatic library generation methodology to an emerging untraditional computer architectures and to a complex application domain. Specifically, it consists of two parts of work: First, it implements an automatic matrix multiply library generator for graphics hardware---a specialized architecture with enormous computing power for graphics applications. Second, it uses machine learning techniques to automatically select the best algorithm for frequent pattern mining problems according to input characteristics.; The automatic matrix multiplication tuning system uses a parameterized code generator to generate multiple versions of matrix multiplication, whose performances are empirically evaluated by actual execution on the target platform. An ad-hoc search engine is employed to search over the implementation space for the version that yields the best performance. In contrast to similar systems on CPUs, which utilize cache blocking, register tiling, instruction scheduling tuning strategies, it identifies and exploits several tuning strategies that are unique for graphics hardware. These tuning strategies include optimizing for multiple-render-targets, SIMD instructions with data packing, overcoming limitations on instruction count and dynamic branch instruction. The generated implementations have comparable performance with expert manually tuned version in spite of the significant overhead incurred due to the use of the high-level BrookGPU language.; Frequent pattern mining is a fundamental problem in data mining and a large number of distinct algorithms have been proposed to solve it efficiently. However, no single algorithm outperforms all the others since their relative performance highly depends on the characteristics of the input data. In the dissertation, we present a machine learning based approach to select the best frequent pattern mining algorithm based on the input characteristics. Three of the fastest publicly available algorithms, FP_Growth, LCM and Eclat, were extensively evaluated using synthetic data sets. The results of these evaluations were used to train a support-vector machine (SVM) prediction system, which is then used at runtime to predict the best mining algorithm for real-world data sets. Our experiments show that the runtime prediction overhead is negligible and that the trained SVM prediction system usually identifies the best algorithm. In case of misprediction, the selected algorithm is still competitive in performance.
Keywords/Search Tags:Performance, Automatic, Algorithm, Frequent pattern mining
Related items