How does it feel to test a compiler?
Alexander Zakharenko discusses the unique challenges of compiler testing, emphasizing the importance of automated and exploratory tests, collaboration with developers, and the intricacies of Kotlin/Native's compilation process.
Alexander Zakharenko, a QA engineer on the Kotlin/Native team, shares insights into the unique experience of testing a compiler. With a background in software engineering and extensive experience in backend automation testing, Zakharenko transitioned to compiler testing after joining JetBrains. He explains that a compiler translates programming languages into machine code and consists of a frontend for analysis and a backend for code generation. Kotlin/Native allows Kotlin code to compile into native binaries, suitable for platforms without a virtual machine. Unlike typical software testing, compiler testing lacks a graphical or network interface, focusing instead on various language constructs, linking libraries, and compilation parameters. Zakharenko emphasizes the importance of automated tests, including unit, integration, and performance tests, alongside exploratory testing for complex features. He describes his workflow, which involves reviewing tasks, conducting exploratory tests, and collaborating with developers. Zakharenko provides examples of tasks he has tackled, such as implementing annotations to hide symbols in Objective-C and testing compiler features. He also highlights the challenges of ensuring compatibility with different operating systems and build systems. Overall, Zakharenko's article illustrates the intricate and rewarding nature of compiler testing, showcasing the blend of technical skills and problem-solving required in this niche field.
- Compiler testing is distinct from typical software testing due to the absence of graphical and network interfaces.
- Kotlin/Native compiles Kotlin code into native binaries, making it suitable for platforms without virtual machines.
- Automated tests play a crucial role in compiler testing, supplemented by exploratory testing for complex features.
- The testing process involves collaboration with developers and a thorough understanding of the compilation process.
- Zakharenko's experience highlights the technical challenges and rewards of working in compiler testing.
Related
Mix-testing: revealing a new class of compiler bugs
A new "mix testing" approach uncovers compiler bugs by compiling test fragments with different compilers. Examples show issues in x86 and Arm architectures, emphasizing the importance of maintaining instruction ordering. Luke Geeson developed a tool to explore compiler combinations, identifying bugs and highlighting the need for clearer guidelines.
Boosting Compiler Testing by Injecting Real-World Code
The research introduces a method to enhance compiler testing by using real-world code snippets to create diverse test programs. The approach, implemented in the Creal tool, identified and reported 132 bugs in GCC and LLVM, contributing to compiler testing practices.
Driving Compilers
The article outlines the author's journey learning C and C++, focusing on the compilation process often overlooked in programming literature. It introduces a series to clarify executable creation in a Linux environment.
How to Compile Your Language – Guide to implement a modern compiler for language
This guide introduces programming language design and modern compiler implementation, emphasizing language purpose, syntax familiarity, and compiler components, while focusing on frontend development using LLVM, with source code available on GitHub.
Clang vs. Clang
The blog post critiques compiler optimizations in Clang, arguing they often introduce bugs and security vulnerabilities, diminish performance gains, and create timing channels, urging a reevaluation of current practices.
- Many commenters agree that compiler testing can be complex due to the Oracle Problem, where verifying the correctness of output is non-trivial.
- Several users highlight the importance of automated testing methods, such as differential testing and fuzzing, to uncover bugs effectively.
- There is a consensus on the value of collaboration between developers and testers, with some praising JetBrains for treating their testing team as equals.
- Some commenters share personal experiences with compiler projects, noting the challenges they faced and the lessons learned.
- Criticism is directed at certain design choices in languages like Kotlin, particularly regarding error handling and testing practices.
E.g. if I want to test that `*` has higher precedence than `+`, I would write something like this:
assert_ast_equals(parse_expr("(a*b)+c"), parse_expr("a*b+c"))
assert_ast_equals(parse_expr("a+(b*c)"), parse_expr("a+b*c"))
You can rewrite the whole compiler if you want, but as long as you have some notion of a "parser", an "AST" and "two AST nodes being the same" this test will keep working.
This is much more powerful than going into the parser internals and comparing `get_operator_precedence('+')` with `get_operator_precedence('*')`, which is the default thing you would do if you're told to test every function after writing it.
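To make the idea above concrete, here is a minimal sketch of what such a test could sit on top of. The names `parse_expr` and `assert_ast_equals` come from the comment; the tiny recursive-descent parser and the `Var`/`BinOp` node classes are assumptions made up for illustration, not anyone's real compiler.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


def parse_expr(src):
    """Hypothetical parser: identifiers, '+', '*', and parentheses only."""
    tokens = re.findall(r"[A-Za-z_]\w*|[()+*]", src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        tok = take()
        if tok == "(":
            node = additive()
            assert take() == ")", "expected ')'"
            return node  # parentheses leave no trace in the AST
        return Var(tok)

    def multiplicative():
        node = atom()
        while peek() == "*":
            take()
            node = BinOp("*", node, atom())
        return node

    def additive():
        node = multiplicative()
        while peek() == "+":
            take()
            node = BinOp("+", node, multiplicative())
        return node

    tree = additive()
    assert peek() is None, "trailing tokens"
    return tree


def assert_ast_equals(a, b):
    # Dataclass equality compares the trees structurally.
    assert a == b, f"{a!r} != {b!r}"


# The two checks from the comment: '*' binds tighter than '+'.
assert_ast_equals(parse_expr("(a*b)+c"), parse_expr("a*b+c"))
assert_ast_equals(parse_expr("a+(b*c)"), parse_expr("a+b*c"))
```

The test only touches the public surface (source text in, comparable tree out), so swapping the parser for a precedence-climbing or generated one should not require changing the assertions.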
Lots of places have a two-tier system where the real developers write the code and those who don't make the cut test the code, with a pay delta and an aspiration of being promoted out of testing.
Other places have a mandatory stint in testing for new developers as a way to get some headcount on the task.
JetBrains don't do that. Or at least they didn't sometime before covid, when I met a bunch of their test devs at a conference. The developers mostly doing testing were equals to those mostly doing product work. Possibly with a more extreme bias towards case analysis.
I don't think it's a coincidence that JetBrains treat their test team as peers to the others and that their software seems to mostly not fall over in the field.
There are a couple effective techniques in the literature that might be useful here:
- Differential testing[1]: generate a bunch of random, correct, deterministic programs; run them under different compilers or under different compilation flags and check if the output of the program is identical
- Equivalence Modulo Inputs[2]: a class of techniques that can be used to transform a program into a distinct program that is supposed to be equivalent to the original for a specific input. (shameless plug)
[1]: https://users.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf
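A rough sketch of the differential-testing idea from the list above, scaled down to a toy. Instead of real compilers or compilation flags, it compares two made-up implementations of the same tiny expression language: a tree-walking interpreter and a compiler to a stack machine. Every name here is hypothetical; the point is only the shape of the loop (generate, run both, compare).

```python
import random


def random_expr(rng, depth=0):
    """Generate a random, always-defined arithmetic expression as a nested tuple."""
    if depth >= 4 or rng.random() < 0.3:
        return ("num", rng.randint(-9, 9))
    op = rng.choice(["+", "-", "*"])
    return (op, random_expr(rng, depth + 1), random_expr(rng, depth + 1))


def apply_op(op, a, b):
    return a + b if op == "+" else a - b if op == "-" else a * b


def interpret(expr):
    """Implementation A: evaluate the tree directly."""
    if expr[0] == "num":
        return expr[1]
    op, left, right = expr
    return apply_op(op, interpret(left), interpret(right))


def compile_to_stack(expr, code=None):
    """Implementation B, step 1: compile to postfix stack-machine instructions."""
    code = [] if code is None else code
    if expr[0] == "num":
        code.append(("push", expr[1]))
    else:
        op, left, right = expr
        compile_to_stack(left, code)
        compile_to_stack(right, code)
        code.append((op,))
    return code


def run_stack(code):
    """Implementation B, step 2: execute the compiled instructions."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(apply_op(instr[0], left, right))
    return stack.pop()


def differential_test(seed, rounds=1000):
    """Return every generated program on which the two implementations disagree."""
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    mismatches = []
    for _ in range(rounds):
        expr = random_expr(rng)
        if interpret(expr) != run_stack(compile_to_stack(expr)):
            mismatches.append(expr)
    return mismatches


print(differential_test(seed=0))
```

An empty list means no disagreement was found, which says nothing about bugs both implementations share. Restricting the generator to `+`, `-`, and `*` on integers is what keeps the programs "correct and deterministic" here; adding division would mean handling division by zero first.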
$ valgrind --leak-check=full --track-origins=yes ./lisp < test/test.lisp
$ cat test/expected.txt
What can I say, works for me.
But as with any code, compilers have bugs too and sometimes they can be quite surprising.
[1] https://www.coinfabrik.com/blog/why-the-fuzz-about-fuzzing-c...
It also disturbs me that the author mentioned that the order in which sources are compiled matters for the final result. It should never matter.
When we build software we should always make it in such a way that it’s trivial to write tests for it. If writing a test is easy, it indicates using the tool you wrote is also easy.
1. Functional programmers often write slow code. It turned out that my compiler was spending most of its time in my professor's code, which, while I'm sure it was very mathematically pure, was a large consumer of immutable, short-lived objects. Meaning mallocs under the hood. I should've valgrinded it but I'm certain it would've overflowed the counters (jokes)
2. If a comment spanned multiple lines in the resulting assembly, I could escape the comment and operate outside the bounds the professor had set up, letting me use more assembly directives to solve the problems way easier. Ultimately we worked to fix that, because usually it just means the student will try to compile part of a comment as assembly, and that can be very confusing for the less assembler-error inclined. I used it for putting a constant before a variable for type tagging. A 1 line solution. I believe the class's preferred way was putting the tag in a register and yadda yadda something that took a lot more finagling and effort. I did that maybe once before using my knowledge of the comment escape to do the arbitrary code injection.
[1] The interpreter was written in the language we were interpreting, so as long as there were no typos or logic errors, the functionality matched running the code in the programming language itself. The compiler would return a series of objects that wrapped assembly. For example, Add(R2, R2, R3). Usually pretty transparent. The framework we were given would then write out the .s file, I believe it would call some gcc or other thing, and we'd run the binary to make sure it worked.
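The comment escape in point 2 can be illustrated with a small sketch. This is a guess at the general shape of the bug, not the professor's actual framework: it assumes an emitter that prefixes only the first line of a comment, `#` as the comment marker (the marker varies by assembler and target), and `.quad` as the injected directive. All function names are hypothetical.

```python
def emit_comment_naive(text):
    """Hypothetical buggy emitter: only the first line gets the comment marker."""
    return "# " + text


def emit_comment_safe(text):
    """Prefix every line, so no part of the text can be read as an instruction."""
    lines = text.splitlines() or [""]
    return "\n".join("# " + line for line in lines)


def uncommented_lines(asm):
    """Lines an assembler would try to assemble: non-blank and not starting with '#'."""
    return [
        line
        for line in asm.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


# A "comment" whose second line is really a data directive.
payload = "type tag follows\n.quad 1"

print(uncommented_lines(emit_comment_naive(payload)))
print(uncommented_lines(emit_comment_safe(payload)))
```

With the naive emitter the second line of the payload lands in the output as live assembly; with per-line prefixing nothing escapes. Rejecting newlines in comment text outright would be another reasonable fix.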
“tailrec” is the modifier you put on a function definition to indicate that the function is tail recursive, but it only actually worked when the function called itself in the return, and not any other tailrec function. The idiotic part was that this would only manifest as a StackOverflowError at run time. (I found this because my language evaluation involved implementing a state machine idiomatically.) If you are going to make tailrec work only as a while loop, then have the compiler alert the programmer at build time, but this got all the way through their design and QA. Not exactly a well-thought-out process.
They have probably fixed this now, but instead I went off to golang, where the features are few but when they exist they are done properly.