Font Size: a A A

Practical Formal Techniques and Tools for Developing LLVM's Peephole Optimization

Posted on:2019-04-08Degree:Ph.DType:Dissertation
University:Rutgers The State University of New Jersey - New BrunswickCandidate:Menendez, DavidFull Text:PDF
GTID:1478390017486963Subject:Computer Science
Abstract/Summary:
Modern compilers perform extensive transformation of code in order to optimize running time and binary code size. These occur in multiple passes, including translations between representations at different levels of abstraction and transformations which restructure code within a particular representation. Of particular interest are optimizations that operate on a compiler's intermediate representation (IR), as these can be shared across programming languages and hardware architectures. One such optimization pass is LLVM's peephole optimizer, which is a suite of hundreds of small algebraic transformations which simplify code and perform canonicalization. Performing these transformations not only results in faster software, but simplifies other optimization passes by reducing the number of equivalent forms they must consider.;It is essential that these optimizations preserve the semantics of input programs. Even a small transformation which changes the value computed by a code fragment or introduces undefined behavior can result in executable programs with incorrect or unpredictable behavior. Optimizations, and analysis of optimizations, must be particularly careful when treating undefined behavior, as modern compilers increasingly use the knowledge that certain operations are undefined in order to streamline or eliminate code---occasionally in ways that are surprising to compiler users. Unfortunately, compiler developers can also overlook undefined behavior or fail to consider rare edge cases, resulting in incorrect transformations. In particular, LLVM's peephole optimizer has historically been one of the buggier parts of LLVM.;To aid the development of correct peephole transformations in LLVM, we introduce Alive, a domain-specific language for specifying such transformations. Selecting a small yet expressive subset of LLVM IR allows for automated verification of Alive transformations, and the Alive toolkit can generate an implementation of a correct transformation suitable for inclusion in LLVM. The correctness checks for Alive consider the various forms of undefined behavior defined by LLVM and ensure that transformations do not change the meaning of a program. Alive specifications can include a mixture of integer and floating-point operations, and transformations can be generalized over different types.;Some transformations require a precondition in order to be correct. These preconditions may be simple, but occasionally it is challenging to find a precondition that is sufficiently strong while remaining widely applicable. To assist in this process, the Alive toolkit includes Alive-Infer, a data-driven method for synthesizing preconditions. Depending on the complexity of the transformation, the weakest precondition sufficient to make a transformation correct may not be desirable, so Alive-Infer can provide a choice of concise but stronger preconditions. The Alive-Infer method automatically finds positive and negative examples to guide inference and finds useful predicates through enumeration.;Finally, specifying transformations in Alive enables analyses of multiple transformations and their interaction. It is possible to have transformations or sequences of transformations which can be applied indefinitely to a finite input. This dissertation presents a method for testing whether such a sequence can be applied indefinitely to some input.;Alive demonstrates that a properly chosen abstraction can provide the benefits of formal code verification without the need for manually written proofs, and can enable new techniques and analyses to assist development.
Keywords/Search Tags:LLVM, Code, Llvm's peephole, Transformations, Undefined behavior, Alive
Related items