Description: Code Optimization Techniques for Lazy Functional Languages
This thesis describes a complete compiler back-end for lazy functional languages, which uses various interprocedural optimisations to produce highly optimised code. The most important contributions of this work are the following. A novel intermediate language, called GRIN (Graph Reduction Intermediate Notation), around which the first part of the back-end is built. The GRIN language has a very ``functional flavour'', making it well suited for analysis and program transformation, but at the same time provides the ``low level'' machinery needed to express many concrete implementation details. We apply a program-wide control flow analysis, also called a heap points-to analysis, to the GRIN program. The result of the analysis is used to eliminate unknown control flow in the program, \ie function calls where the call target is unknown at compile time (due to forcing of closures). We present a large number of GRIN source-to-source program transformations that are applied to the program. Most transformations are very simple but when taken together, and applied repeatedly, produce greatly simplified and optimised code. Several non-standard transformations are made possible by the low level control offered by the GRIN language (when compared to a more high level intermediate language, eg the STG language). Eventually, the GRIN code is translated into RISC machine code. We develop an interprocedural register allocation algorithm, with a main purpose of decreasing the function call and return overhead. The register allocation algorithm includes a new technique to optimise the placement of register save and restore instructions, related to Chow's shrink-wrapping, and extends traditional Chaitin-style graph colouring with interprocedural coalescing and a restricted form of live range splitting. The algorithm produces a tailor-made calling convention for each function (the registers used to pass function arguments and results). A combined compile time and runtime approach is used to support garbage collection in the presence of aggressive optimisations (most notably our register allocation), without imposing any mutator overhead. The method includes a runtime call stack traversal and interpretation of registers and stack frames using pre-computed descriptor tables. Experiments sofar have been very promising. For a set of small to medium-sized Haskell programs taken from the nofib benchmark suite, the code produced by our back-end executes several times faster than the code produced by some other compilers (ghc and hbc).