
Measuring empirical computational complexity

Posted on: 2010-08-17
Degree: Ph.D
Type: Dissertation
University: University of California, Berkeley
Candidate: Goldsmith, Simon Fredrick
Full Text: PDF
GTID: 1448390002475726
Subject: Computer Science
Abstract/Summary:
Scalability is a fundamental problem in computer science. Computer scientists often describe the scalability of algorithms in the language of theoretical computational complexity, bounding the number of operations an algorithm performs as a function of the size of its input. The main contribution of this dissertation is to provide an analogous description of the scalability of actual software implementations run on realistic workloads.

We propose a method for describing the asymptotic behavior of programs in practice by measuring their empirical computational complexity. Our method involves running a program on workloads spanning several orders of magnitude in size, measuring its performance, and fitting these observations to a model that predicts performance as a function of workload size. Comparing these models to the programmer's expectations or to theoretical asymptotic bounds can reveal performance bugs or confirm that a program's performance scales as expected.

We develop our methodology for constructing these models of empirical complexity as we describe and evaluate two techniques. Our first technique, BB-TRENDPROF, constructs models that predict how many times each basic block runs as a linear (y = a + bx) or a power-law (y = ax^b) function of user-specified features of the program's workloads. To present output succinctly and focus attention on scalability-critical code, BB-TRENDPROF groups and ranks program locations based on these models. We demonstrate the power of BB-TRENDPROF compared to existing tools by running it on several large programs and reporting cases where its models show (1) an implementation of a complex algorithm scaling as expected, (2) two complex algorithms beating their worst-case theoretical complexity bounds when run on realistic inputs, and (3) a performance bug.

Our second technique, CF-TRENDPROF, models the performance of loops and functions both per function invocation and per workload. It improves upon the precision of BB-TRENDPROF's models by using control flow to generate candidates from a richer family of models and a novel model selection criterion to choose among these candidates. We show that CF-TRENDPROF's improvements to model generation and selection allow it to correctly characterize or closely approximate the empirical scalability of several well-known algorithms and data structures, and to diagnose several synthetic but realistic scalability problems without observing an egregiously expensive workload. We also show that CF-TRENDPROF handles multiple workload features better than BB-TRENDPROF. We qualitatively compare the output of BB-TRENDPROF and CF-TRENDPROF and discuss their relative strengths and weaknesses.
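To make the model-fitting step concrete, the sketch below (not the dissertation's actual implementation) fits the two model families the abstract names, linear y = a + bx and power law y = ax^b, to observed (workload size, execution count) pairs and reports which fits better. The function name fit_models and the use of NumPy least squares are illustrative assumptions.

    import numpy as np

    def fit_models(sizes, counts):
        """Fit linear (y = a + b*x) and power-law (y = a*x**b) models
        to observations and report which has smaller residual error."""
        x = np.asarray(sizes, dtype=float)
        y = np.asarray(counts, dtype=float)

        # Linear fit: ordinary least squares on (x, y).
        b_lin, a_lin = np.polyfit(x, y, 1)
        resid_lin = np.sum((y - (a_lin + b_lin * x)) ** 2)

        # Power-law fit: least squares in log-log space, since
        # log y = log a + b * log x is linear in the parameters.
        b_pow, log_a = np.polyfit(np.log(x), np.log(y), 1)
        a_pow = np.exp(log_a)
        resid_pow = np.sum((y - a_pow * x ** b_pow) ** 2)

        best = "linear" if resid_lin <= resid_pow else "power law"
        return (a_lin, b_lin), (a_pow, b_pow), best

    # Example: counts growing roughly as n log n are summarized
    # by a power-law exponent slightly above 1.
    sizes = [100, 1_000, 10_000, 100_000]
    counts = [s * np.log2(s) for s in sizes]
    lin, pl, best = fit_models(sizes, counts)
    print(f"linear:    y = {lin[0]:.1f} + {lin[1]:.3f}x")
    print(f"power law: y = {pl[0]:.3f} * x^{pl[1]:.3f}")
    print(f"better fit on these observations: {best}")

Fitting the power law in log-log space keeps both fits as simple linear least-squares problems; spanning several orders of magnitude in workload size, as the abstract prescribes, is what lets the exponent b be estimated reliably.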
Keywords/Search Tags:BB-TRENDPROF, Complexity, Empirical, Measuring, Computational, Models, Scalability