October 23rd, 2024

A DSL for peephole transformation rules of integer operations in the PyPy JIT

A new domain-specific language for optimizing integer operations in the PyPy JIT compiler has been developed. It expresses peephole transformation rules declaratively, verifies them for correctness, and collects statistics to monitor their effectiveness.


The blog post discusses the development of a domain-specific language (DSL) for peephole transformation rules aimed at optimizing integer operations within the PyPy Just-In-Time (JIT) compiler. The motivation for this initiative stems from the need for more efficient integer operations, particularly highlighted by the Pydrofoil project. The new DSL allows for a more declarative and less verbose way to express optimization rules, utilizing pattern matching to simplify sequences of integer operations. Each rule consists of a name, a pattern, and a target transformation, with the ability to include checks and compute intermediate results.

The rules are automatically verified for correctness using the Z3 theorem prover during the build process, ensuring that incorrect optimizations are minimized. The post also outlines the structure of the DSL, including examples of simple and complex transformation rules, and discusses the importance of rule coverage and statistics to monitor the effectiveness of the optimizations. Additionally, it highlights the challenges of ensuring termination and confluence in the rewriting process, as well as the necessity of rigorous proof checks to validate the correctness of the rules.
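To make the name/pattern/target structure concrete, here is a minimal Python sketch of how such a rule could be matched and applied. This is not the PyPy DSL itself; the tuple encoding of operations and the `add_zero` rule are illustrative assumptions.

```python
# Sketch only, not the actual PyPy DSL. Operations are encoded as tuples
# like ("int_add", lhs, rhs); strings in a pattern are pattern variables,
# integers match themselves.
def match(pattern, expr, env):
    if isinstance(pattern, str):          # pattern variable: bind or compare
        if pattern in env:
            return env[pattern] == expr
        env[pattern] = expr
        return True
    if isinstance(pattern, int):          # constant: must match exactly
        return pattern == expr
    return (isinstance(expr, tuple) and expr[0] == pattern[0]
            and len(expr) == len(pattern)
            and all(match(p, e, env)
                    for p, e in zip(pattern[1:], expr[1:])))

# Rule "add_zero": the pattern int_add(x, 0) rewrites to the target x.
def add_zero(expr):
    env = {}
    if match(("int_add", "x", 0), expr, env):
        return env["x"]
    return expr

assert add_zero(("int_add", ("int_mul", "a", "b"), 0)) == ("int_mul", "a", "b")
```

A real rule can additionally carry a `check` clause (a side condition on the bound variables) and compute intermediate results before emitting the target; the sketch omits both.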

- A new DSL for integer operation optimizations in the PyPy JIT has been developed.

- The DSL allows for more declarative and efficient expression of peephole transformation rules.

- Rules are verified for correctness using the Z3 theorem prover to prevent incorrect optimizations.

- The post emphasizes the importance of monitoring rule coverage and effectiveness through statistics.

- Challenges related to termination and confluence in the rewriting process are acknowledged.
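The verification step described above asks Z3 to prove that pattern and target compute the same bitvector for every input. The same idea can be illustrated without Z3 by brute force over a small bit width; the helper below is a stand-in sketch, not the blog post's actual checker (which uses Z3 over full-width bitvectors).

```python
# Illustration of the verification idea: exhaustively check that a rule's
# pattern and target agree for all 8-bit inputs. Z3 does the same job
# symbolically, for 64-bit integers, without enumeration.
BITS = 8
MASK = (1 << BITS) - 1

def check_equivalent(lhs, rhs):
    return all((lhs(x) & MASK) == (rhs(x) & MASK)
               for x in range(1 << BITS))

# A correct rule: int_and(x, x) => x
assert check_equivalent(lambda x: x & x, lambda x: x)
# An incorrect rule is caught: int_sub(x, 1) is not equivalent to x
assert not check_equivalent(lambda x: x - 1, lambda x: x)
```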

7 comments
By @fweimer - 7 months
GCC does this in match.pd. Here's the add_zero example, I think:

    /* Simplifications of operations with one constant operand and
       simplifications to constants or single values.  */
    
    (for op (plus pointer_plus minus bit_ior bit_xor)
      (simplify
        (op @0 integer_zerop)
        (non_lvalue @0)))
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/match.pd

The floating point case can be more complicated because of signed zeros and signaling NaNs, see fold_real_zero_addition_p.

By @westurner - 7 months
math.fma (fused multiply-add) is in Python 3.13. Are there already rules to transform expressions to math.fma?

And does Z3 verification indicate differences in output due to minimizing float-rounding error?

https://docs.python.org/3/library/math.html#math.fma :

> math.fma(x, y, z)

> Fused multiply-add operation. Return (x * y) + z, computed as though with infinite precision and range followed by a single round to the float format. This operation often provides better accuracy than the direct expression (x * y) + z.

> This function follows the specification of the fusedMultiplyAdd operation described in the IEEE 754 standard.
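The accuracy difference the docs describe can be reproduced without Python 3.13 by modeling fma with exact rational arithmetic; `fma_ref` below is a hypothetical reference implementation for illustration, not `math.fma` itself.

```python
from fractions import Fraction

def fma_ref(x, y, z):
    # Compute x*y + z exactly as rationals, then round once to float,
    # mimicking fma's "infinite precision, single final rounding".
    return float(Fraction(x) * Fraction(y) + Fraction(z))

# The double nearest to 0.1 is slightly above 0.1, so 0.1 * 10 carries a
# tiny positive error. The direct expression rounds it away; fma keeps it.
assert (0.1 * 10.0) - 1.0 == 0.0     # intermediate rounding hides the error
assert fma_ref(0.1, 10.0, -1.0) > 0  # single final rounding exposes it
```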

By @hoten - 7 months
The before/after delta is very satisfying and clearly easier to work with. Very nice.

How does the DSL handle side effects? I'm assuming any term here could be a compound expression, so what part of the system prevents "0*callfn()" from reducing to "0"?

By @JonChesterfield - 7 months
A cost model would give a proof of termination - require the output of each transform to be cheaper than the input. Confluence is less obvious.
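The termination argument sketched in this comment is that a strictly decreasing integer cost admits no infinite rewrite chain. A minimal sketch, with an assumed cost table:

```python
# Sketch: if every rule strictly lowers a nonnegative integer cost, rewriting
# terminates, since costs form a strictly decreasing chain of naturals.
COST = {"int_add": 1, "int_mul": 1}   # assumed per-operation costs

def cost(expr):
    if not isinstance(expr, tuple):   # variables and constants cost nothing
        return 0
    op, *args = expr
    return COST.get(op, 1) + sum(cost(a) for a in args)

pattern = ("int_add", "x", 0)         # int_add(x, 0)
target = "x"
assert cost(target) < cost(pattern)   # this rule strictly lowers cost
```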
By @mjcohen - 7 months
Interesting to see which optimizations are never used.