July 11th, 2024

Boosting Compiler Testing by Injecting Real-World Code

The research introduces a method to enhance compiler testing by using real-world code snippets to create diverse test programs. The approach, implemented in the Creal tool, identified and reported 132 bugs in GCC and LLVM, contributing to compiler testing practices.

The paper "Boosting Compiler Testing by Injecting Real-World Code" proposes testing compilers with code taken from real-world applications. By combining snippets from different projects, the approach aims to produce well-formed programs that exercise diverse language features. It extracts real-world code at the function level, injects calls to those functions into seed programs, and uses dynamic execution data to preserve semantics and build complex data dependencies. Implemented in Creal, a tool for testing C compilers, the approach found and reported 132 bugs in GCC and LLVM over a nine-month period. Most were confirmed as previously unknown and critical, and 101 have already been fixed. The study shows that real-world code is effective for stress-testing compilers and suggests the technique could benefit other compiler testing efforts.
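The injection step can be pictured with a small sketch. This is an illustration only, not Creal's implementation: the function `clamp_add`, the helpers `profile` and `build_injected_stmt`, and the observed seed value are all hypothetical, and the generated C statement ignores integer overflow and side effects. The idea shown is that one recorded execution of an extracted function (its arguments and result) and one observed value of a seed variable are enough to emit a call that depends on seed data yet leaves the seed's behavior unchanged.

```python
def clamp_add(a, b):
    """Stand-in for a function extracted from a real project (hypothetical)."""
    total = a + b
    return max(0, min(total, 100))


def profile(func, inputs):
    """Run the function once and record an (arguments, result) pair."""
    return tuple(inputs), func(*inputs)


def build_injected_stmt(func_name, args, ret, seed_var, seed_value):
    """Emit a C statement that calls func_name but leaves seed_var unchanged.

    Each argument is written as an offset from seed_var, so the call depends
    on seed data while still receiving the profiled inputs at run time.
    Subtracting the profiled result cancels the call's contribution.
    """
    arg_exprs = ", ".join(f"{seed_var} + ({a - seed_value})" for a in args)
    return f"{seed_var} = {seed_var} + {func_name}({arg_exprs}) - ({ret});"


# Assumption: the seed variable x was observed to hold 10 at the injection point.
args, ret = profile(clamp_add, (3, 4))
stmt = build_injected_stmt("clamp_add", args, ret, "x", 10)
print(stmt)
```

A compiler cannot easily see that the injected statement is a no-op, so it has to compile and optimize the real-world function body together with the seed program, while the program's expected output stays the same as before the injection.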

Related

How GCC and Clang handle statically known undefined behaviour

A discussion of how compilers handle statically known undefined behavior (UB) in C code, with insights into the optimizations involved. Compilers such as GCC and Clang optimize based on undefined language semantics, which can make programs crash or cause the problematic code to be ignored. Avoiding UB is crucial for program predictability and security. GCC and Clang differ in how they handle UB, including in crash behavior and warnings. LLVM's 'poison' values allow optimizations despite UB, one example of the differing approaches. How a compiler responds to UB is a judgment call, shaped by its developers and by user requirements.

Mix-testing: revealing a new class of compiler bugs

A new "mix testing" approach uncovers compiler bugs by compiling test fragments with different compilers. Examples show issues in x86 and Arm architectures, emphasizing the importance of maintaining instruction ordering. Luke Geeson developed a tool to explore compiler combinations, identifying bugs and highlighting the need for clearer guidelines.

Weekend projects: getting silly with C

The C programming language's simplicity and expressiveness, despite quirks, influence other languages. Unconventional code structures showcase creativity and flexibility, promoting unique coding practices.

Code Reviews Do Find Bugs

Code reviews are effective in finding bugs, despite past doubts. Research shows reviewers identify defects, with small code chunks being most efficient. Reviews aid learning, improve maintainability, and promote knowledge sharing.

Refined Input, Degraded Output: The Counterintuitive World of Compiler Behavior

The study delves into compiler behavior when given extra information for program optimization. Surprisingly, more data can sometimes lead to poorer optimization due to intricate compiler interactions. Testing identified 59 cases in popular compilers, emphasizing the need for better understanding.

2 comments
By @vlovich123 - 4 months
This is a critically important line of research. Every time a new orthogonal testing approach is explored, it always finds a bunch of latent problems in existing codebases. Great contribution by the authors.
By @pfdietz - 4 months
I did a very simpleminded form of this for testing the compiler in Steel Bank Common Lisp. The COMPILE function in SBCL should never signal an error condition, even on invalid code, so the test just involved splicing together fragments of code from existing Common Lisp code bases (packages in Quicklisp as well as other publicly available CL programs I could find) and trying to cause the compiler to error. It found a significant number of bugs.

This is a lot simpler than what these people were doing, as they wanted to produce code that was actually meaningful and could be run.