August 7th, 2024

Maximal Min() and Max()

Recent Linux kernel developments reveal that complex min() and max() macros increase compilation times, prompting proposals for new macros to simplify code while balancing type safety and efficiency.



Recent developments in the Linux kernel have highlighted issues with the preprocessor macros min() and max(), which are extensively used in the kernel's C code. These macros, originally designed for simplicity, have evolved into complex constructs that can lead to significant increases in compilation time. A specific case was identified where a single line of code expanded to an excessive 47MB of preprocessor output due to nested calls to min() and min3(). This complexity has raised concerns among developers about the efficiency of kernel builds. In response, kernel developers have proposed changes to mitigate the problem, including the introduction of new macros that simplify the code while sacrificing some type safety. Linus Torvalds expressed concerns about the overly clever nature of the existing macros and suggested reverting to simpler forms, although many parts of the kernel now depend on the newer functionality. The ongoing discussion reflects the balance between maintaining type safety and ensuring efficient compilation times in kernel development.

- The complexity of min() and max() macros has led to increased kernel compilation times.

- A specific line of code expanded to 47MB of preprocessor output due to nested macro calls.

- New macros have been proposed to simplify code and improve compilation efficiency, at some cost in type safety.

- Developers are concerned about the trade-off between type safety and compilation speed.

- Ongoing discussions among kernel developers aim to address these issues while maintaining functionality.
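To make the trade-off concrete, here is a minimal sketch contrasting a naive macro with a single-evaluation variant. It is not the kernel's actual definition: `NAIVE_MIN`, `SAFE_MIN`, `next_value`, and `check_evaluation_counts` are hypothetical names, and `SAFE_MIN` assumes GCC or Clang because it relies on the GNU C statement-expression and `__typeof__` extensions.

```c
#include <assert.h>

/* Naive form: short, but each argument appears twice in the expansion,
 * so an argument with side effects may be evaluated twice. */
#define NAIVE_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Single-evaluation form (GNU C extension, GCC/Clang only):
 * each argument is copied into a local exactly once. */
#define SAFE_MIN(a, b) ({      \
    __typeof__(a) _a = (a);    \
    __typeof__(b) _b = (b);    \
    _a < _b ? _a : _b; })

/* Hypothetical helper with a side effect, used to count evaluations. */
static int calls;
static int next_value(void) { return ++calls; }

void check_evaluation_counts(void)
{
    calls = 0;
    (void)NAIVE_MIN(next_value(), 100);
    assert(calls == 2);   /* argument evaluated twice */

    calls = 0;
    (void)SAFE_MIN(next_value(), 100);
    assert(calls == 1);   /* argument evaluated once */
}
```

Because every argument is repeated in the expansion, nesting such macros multiplies the preprocessor output at each level. That is the general mechanism behind the growth described above, although the kernel's real macros carry far more machinery than this sketch.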

Background of Linux's "file-max" and "nr_open" limits on file descriptors (2021)

The Unix background of Linux's 'file-max' and 'nr_open' kernel limits on file descriptors dates back to early Unix implementations like V7. These limits, set during kernel compilation, evolved to control resource allocation efficiently.

What are the ways compilers recognize complex patterns?

Compilers optimize by recognizing patterns like popcount, simplifying code for efficiency. LLVM and GCC use hardcoded patterns to match common idioms, balancing compile-time speed with runtime gains in critical code sections.

Tail Recursion for Macros in C

The blog discusses tail recursion for macros in C, introducing __VA_TAIL__() to enable real recursion, overcoming inefficiencies of traditional recursive macro calls. It showcases how tail recursion improves processing efficiency in macros.

Maximal Min() and Max()

Recent changes to the min() and max() macros in the Linux kernel have increased compilation times, prompting discussions on reverting to simpler macros to improve efficiency and reduce code expansion.

Clang vs. Clang

The blog post critiques compiler optimizations in Clang, arguing they often introduce bugs and security vulnerabilities, diminish performance gains, and create timing channels, urging a reevaluation of current practices.

AI: What people are saying

The discussion around the Linux kernel's min() and max() macros reveals several key concerns and opinions among commenters.

- Many commenters emphasize the complexity of the current macros, which can lead to increased compilation times and bloated code.

- There is a debate on whether to continue using macros or to switch to function calls, with some arguing that inlining could mitigate performance issues.

- Some suggest that the language's limitations should be accepted, advocating for type-specific implementations instead of generic macros.

- Concerns about type safety and correctness are prevalent, with calls for thorough testing of the macros due to their complexity.

- Several commenters express frustration with the use of macros in C, suggesting that they complicate code unnecessarily.

16 comments

By @seanhunter - 9 months

For everyone with a “why don’t they just…?”-type suggestion, it’s worth carefully rereading TFA and considering what the macros do now and the problem they are trying to solve.

The macros now do type-safe comparisons that work correctly with combinations of different argument types where this is possible, work in a constant context (eg defining array bounds), work correctly in the face of implicit type coercion and include a 3-way min and max that have these same desirable properties (same as the two way min and max).

The problem(s) are it’s a bit slow to compile and the preprocessor expansion (although not necessarily the final generated assembly - didn’t see anyone saying that) is a bit bloated.

So when you make a “why don’t they just…. ?”-suggestion, make sure your suggestion is at least as good as what they have now in terms of the desirable functionality and correctness and then tackles the actual problems in some way. I’m not sure all of the suggestions I have seen here and in the lwn comments succeed at meeting those two criteria.
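The "constant context" requirement mentioned in this comment can be illustrated with a small sketch. The names `CONST_MIN`, `CONST_MIN3`, `small_buf`, and `tiny_buf` are made up for the example. A plain ternary macro remains an integer constant expression when its arguments are constants, so it can size an array at file scope, which a statement-expression macro cannot do.

```c
#include <assert.h>

/* Hypothetical names. With constant arguments these expand to
 * integer constant expressions. */
#define CONST_MIN(a, b)     ((a) < (b) ? (a) : (b))
#define CONST_MIN3(a, b, c) CONST_MIN(CONST_MIN(a, b), c)

/* Array bounds at file scope must be constant expressions. */
char small_buf[CONST_MIN(16, 64)];
char tiny_buf[CONST_MIN3(32, 8, 64)];

void check_buffer_sizes(void)
{
    assert(sizeof(small_buf) == 16);
    assert(sizeof(tiny_buf) == 8);
}
```

The price is that this simple form does no type checking and evaluates its arguments more than once, which is why a single macro that satisfies every property in the list above is hard to write.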

By @jonathrg - 9 months

That series of macros is a nice demonstration of the incredible effort it takes to attempt the most basic generic programming in C. Perhaps it would be more productive to just accept the limitations of the language and define a version of `min` and `max` for each type.
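A per-type approach along these lines might look like the following sketch. The function names are hypothetical, not taken from the kernel or the C standard library.

```c
#include <assert.h>
#include <stddef.h>

/* One small inline function per type; names are illustrative only. */
static inline int min_int(int a, int b) { return a < b ? a : b; }
static inline int max_int(int a, int b) { return a > b ? a : b; }

static inline size_t min_size(size_t a, size_t b) { return a < b ? a : b; }
static inline size_t max_size(size_t a, size_t b) { return a > b ? a : b; }

void check_typed_min_max(void)
{
    assert(min_int(-1, 4) == -1);
    assert(max_int(-1, 4) == 4);

    /* Pitfall: -1 is silently converted to SIZE_MAX at the call,
     * so the "minimum" of -1 and 4 comes out as 4. */
    assert(min_size(-1, 4) == 4);
}
```

Each argument is evaluated once and the expansion stays small, but the functions cannot be used where a constant expression is required, and implicit conversions at the call site can still produce surprising results, as the last check shows.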

By @layer8 - 9 months

This reminds me of how ~25 years ago I wrote a little C macro library to perform safe (non-overflowing and/or saturating) integer arithmetic and comparisons. However, the small application I wanted to use it in had the compiler crash after 2-3 hours due to insufficient RAM+swap when compiling a single source file using those macros. It turned out the macro expansions made the translation unit grow to GB size, which must have been 4-5 orders of magnitude over its original size.

By @mastax - 9 months

My question is, why isn't there a `min()` and `max()` in the standard library? Even accepting C's philosophy of a minimalist stdlib, these feel like uncontroversial functionality to me. TFA shows there's enough complexity involved in doing it correctly that it makes sense to provide an implementation rather than have everyone write the same often-subtly-incorrect macros. They could use a `__builtin_cmp(type, a, b)` which can use the correct type-casts and prevent double-evaluation without needing any macro weirdness.

By @Joker_vD - 9 months

    (void) (&_x == &_y);

Is this... a check for type-compatibility? I don't think it actually produces a compilation error if the types are incompatible.
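For context, here is a sketch of how that line can sit inside a macro. `CHECKED_MIN` and `check_checked_min` are hypothetical names, and the macro assumes GCC or Clang because it uses GNU C statement expressions and `__typeof__`. In ISO C, comparing pointers to incompatible types with `==` is a constraint violation, so a conforming compiler must issue a diagnostic. In practice that diagnostic is often only a warning, so whether the build fails depends on flags such as `-Werror`.

```c
#include <assert.h>

/* Hypothetical macro built around the pointer-comparison trick. */
#define CHECKED_MIN(x, y) ({        \
    __typeof__(x) _x = (x);         \
    __typeof__(y) _y = (y);         \
    /* Diagnosed if the two types differ; result is discarded. */ \
    (void) (&_x == &_y);            \
    _x < _y ? _x : _y; })

void check_checked_min(void)
{
    int a = 3, b = 7;
    assert(CHECKED_MIN(a, b) == 3);   /* same types: no diagnostic */

    /* With mismatched types, e.g. an int and an unsigned long,
     * the &_x == &_y comparison would draw a compiler diagnostic. */
}
```

The comparison itself has no runtime purpose. Its value is thrown away, and it exists only to make the compiler look at the two pointer types.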

By @account42 - 9 months

What C programmers will do to avoid using even a little bit of C++.

By @wakamoleguy - 9 months

Why are these defined as macros at all? A function call would come with overhead, of course, but wouldn't compilers be able to inline that anyways?

By @usefulcat - 9 months

> some of the changes to the macros made some developers (including Bergmann) nervous

These macros are now so complex that they're reluctant to touch them. Seems like there is a clear need for some thorough tests here? This is exactly the sort of thing that is eminently testable.

By @wood_spirit - 9 months

Could not the main compilers get involved and add builtins so an ifdef makes the common path on the main compilers like gcc use builtins and the slowdown only hurts those using the less mainstream compilers? If it takes off the other compilers would quickly add the feature.

By @baggy_trough - 9 months

47MB of code from a min/max macro is simply hilarious!

By @up2isomorphism - 9 months

Every time such a thing happens, it becomes a language suggestion opportunity. But for those who suggest another language, are you going to replace a 50M line code base because you want to have a fancy min / max? I don’t think this is a responsible suggestion.

By @metrognome - 9 months

There are plenty of projects smaller than the Linux kernel that have developed and employed DSLs (to varying degrees of success, I'll grant). I wonder, are there any languages out there designed specifically for kernel programming?

Given the number of preprocessor hacks used in the kernel, and the amount of GCC-specific behavior that the codebase depends on, it seems like they are already halfway there.

By @Borg3 - 9 months

This also shows how programmers sometimes use far too fancy stuff for what they do. If you are doing some intense computation, split it, or do it and later slap sanity check inside. It will be even more readable. If you just play with bunch of vars, sure.. min/max macros can be easier to read.

By @kibwen - 9 months

The most pleasant C codebases that I have read essentially banned macros outside of includes, conditional compilation, and named constants. Please just stop trying to use macros to metaprogram in C.

By @38 - 9 months

what a pathetic language. modern languages have a fully generic max function, that's also type safe

https://godocs.io/builtin#max
