
Measuring empirical computational complexity

Posted on: 2010-08-17
Degree: Ph.D
Type: Dissertation
University: University of California, Berkeley
Candidate: Goldsmith, Simon Fredrick
Full Text: PDF
GTID: 1448390002475726
Subject: Computer Science
Abstract/Summary:
Scalability is a fundamental problem in computer science. Computer scientists often describe the scalability of algorithms in the language of theoretical computational complexity, bounding the number of operations an algorithm performs as a function of the size of its input. The main contribution of this dissertation is to provide an analogous description of the scalability of actual software implementations run on realistic workloads.

We propose a method for describing the asymptotic behavior of programs in practice by measuring their empirical computational complexity. Our method involves running a program on workloads spanning several orders of magnitude in size, measuring its performance, and fitting these observations to a model that predicts performance as a function of workload size. Comparing these models to the programmer's expectations or to theoretical asymptotic bounds can reveal performance bugs or confirm that a program's performance scales as expected.

We develop our methodology for constructing these models of empirical complexity as we describe and evaluate two techniques. Our first technique, BB-TRENDPROF, constructs models that predict how many times each basic block runs as a linear (y = a + bx) or a power-law (y = ax^b) function of user-specified features of the program's workloads. To present output succinctly and focus attention on scalability-critical code, BB-TRENDPROF groups and ranks program locations based on these models. We demonstrate the power of BB-TRENDPROF compared to existing tools by running it on several large programs and reporting cases where its models show (1) an implementation of a complex algorithm scaling as expected, (2) two complex algorithms beating their worst-case theoretical complexity bounds when run on realistic inputs, and (3) a performance bug.

Our second technique, CF-TRENDPROF, models the performance of loops and functions both per function invocation and per workload. It improves upon the precision of BB-TRENDPROF's models by using control flow to generate candidates from a richer family of models and a novel model selection criterion to choose among these candidates. We show that CF-TRENDPROF's improvements to model generation and selection allow it to correctly characterize or closely approximate the empirical scalability of several well-known algorithms and data structures, and to diagnose several synthetic but realistic scalability problems without observing an egregiously expensive workload. We also show that CF-TRENDPROF handles multiple workload features better than BB-TRENDPROF. We qualitatively compare the output of BB-TRENDPROF and CF-TRENDPROF and discuss their relative strengths and weaknesses.
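To make the model-fitting step concrete, the sketch below (not the dissertation's actual implementation) fits the two model families the abstract names, linear y = a + bx and power law y = ax^b, to observed (workload size, execution count) pairs and reports which fits better. The function name fit_models and the use of NumPy least squares are illustrative assumptions.

    import numpy as np

    def fit_models(sizes, counts):
        """Fit linear (y = a + b*x) and power-law (y = a*x**b) models
        to observations and report which has smaller residual error."""
        x = np.asarray(sizes, dtype=float)
        y = np.asarray(counts, dtype=float)

        # Linear fit: ordinary least squares on (x, y).
        b_lin, a_lin = np.polyfit(x, y, 1)
        resid_lin = np.sum((y - (a_lin + b_lin * x)) ** 2)

        # Power-law fit: least squares in log-log space, since
        # log y = log a + b * log x is linear in the parameters.
        b_pow, log_a = np.polyfit(np.log(x), np.log(y), 1)
        a_pow = np.exp(log_a)
        resid_pow = np.sum((y - a_pow * x ** b_pow) ** 2)

        best = "linear" if resid_lin <= resid_pow else "power law"
        return (a_lin, b_lin), (a_pow, b_pow), best

    # Example: counts growing roughly as n log n are summarized
    # by a power-law exponent slightly above 1.
    sizes = [100, 1_000, 10_000, 100_000]
    counts = [s * np.log2(s) for s in sizes]
    lin, pl, best = fit_models(sizes, counts)
    print(f"linear:    y = {lin[0]:.1f} + {lin[1]:.3f}x")
    print(f"power law: y = {pl[0]:.3f} * x^{pl[1]:.3f}")
    print(f"better fit on these observations: {best}")

Fitting the power law in log-log space keeps both fits as simple linear least-squares problems; spanning several orders of magnitude in workload size, as the abstract prescribes, is what lets the exponent b be estimated reliably.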
Keywords/Search Tags:BB-TRENDPROF, Complexity, Empirical, Measuring, Computational, Models, Scalability