August 27th, 2024

Bootstrappable Builds

Bootstrappable builds improve trust and security in computing by minimizing reliance on opaque binaries. Developers are encouraged to adopt best practices and collaborate through community discussions to address bootstrapping challenges.


The text discusses the concept of bootstrappable builds, particularly in the context of compilers like GCC. It highlights the challenge of creating compilers that can compile themselves, which often leads to reliance on pre-built binaries that lack transparency. This reliance poses security risks, as opaque binaries cannot be audited, threatening user security and freedom. The document emphasizes the importance of minimizing bootstrap binaries to enhance trust in computing platforms. It outlines the benefits of bootstrappable implementations, best practices for developers facing bootstrapping issues, and the necessity of collaboration on projects aimed at resolving these challenges. Additionally, it encourages participation in community discussions through mailing lists and IRC channels for ongoing updates and collaboration.

- Bootstrappable builds enhance trust and security in computing platforms.

- Reliance on opaque binaries poses risks to user security and freedom.

- Best practices exist for developers to address bootstrapping challenges.

- Collaboration is essential for solving issues in compilers and build systems.

- Community engagement is encouraged through mailing lists and IRC channels.

5 comments
By @er4hn - 5 months
The big issue with bootstrappable builds is how to get started and have good examples. This is an ambitious goal, like landing on the moon, and takes a lot to get there. My understanding of this has been you need to:

(a) Be able to have a compiler that can be compiled from understandable code, which itself may require a set of increasingly complex compilers. I've heard this referred to before as a "compiler pilgrimage" but I can't find where I heard that term.

(b) Then you need to be able to build the code with that compiler / dependencies. This is a pretty well solved problem these days assuming you can pin all your dependencies and transitive dependencies.

(c) Then this all needs to be reproducible so that you can actually trust the output, and that is a pretty hard problem today.
By @mikewarot - 5 months
The story referenced as part of the motivation for the project[1] is pretty chilling. The laws of physics can at least give you a lower bound to trust, if you have an old-school analog oscilloscope handy to watch for unexpected network packets.

If you have old-school TTL, EPROMs, RAM, and time, you could build a CPU you can test all the parts of, and trust. You could even work your way up to floppy disks and an analog CRT display.

Once you want to ramp up the speed and complexity, things get dicey. I have ideas that would help, but nothing provably secure.

[1] https://www.teamten.com/lawrence/writings/coding-machines/

By @transpute - 5 months
> Current versions of GCC are written in C++, which means that a C++ compiler is needed to build it from source. GCC 4.7 was the last version of the collection that could be built with a plain C compiler, a much simpler task.

Which C++ compiler was used to build GCC 4.8?

By @andy_xor_andrew - 5 months
regarding the "security" aspect, I'm interested in what an attack vector would look like against a build system

like, say you are building code, and all the below functions are compilers, and * denotes an evil compiler. Every link in the chain is a compiler building another compiler, until the last node which builds the code.

A() -> B() -> Evil*() -> D() -> E(code) -> binary

how in the world would the evil compiler in this situation inject something malicious into the final binary?
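This is essentially Ken Thompson's "trusting trust" attack: the evil compiler recognizes when it is compiling another compiler and grafts its own injection logic into the output, so every downstream compiler binary stays evil even though its source is clean. A minimal sketch (a toy model, not real compiler internals; compilers are modeled as functions from source text to a "binary" string):

```python
# Hypothetical payload the evil compiler wants in every final binary.
PAYLOAD = "<backdoor>"

def clean_compile(source):
    # An honest compiler: the binary faithfully reflects the source.
    return "binary(" + source + ")"

def evil_compile(source):
    # Trusting-trust behavior: when building another compiler, emit a
    # compiler that is itself evil; when building ordinary code, splice
    # the payload into the binary.
    if "compiler" in source:
        return evil_compile  # the produced "binary" is another evil compiler
    return "binary(" + source + PAYLOAD + ")"

# Evil*() builds D() from clean source, D() builds E() from clean source,
# and E() builds the final program -- yet the payload survives both stages.
d_compiler = evil_compile("compiler D source")   # D's binary comes out evil
e_compiler = d_compiler("compiler E source")     # so E's binary is evil too
final_binary = e_compiler("innocent program source")
print(final_binary)  # contains PAYLOAD despite clean D and E sources
```

The point of the toy: auditing the *sources* of D, E, and the program finds nothing, because the malice lives only in binaries. Bootstrapping from a minimal, auditable seed is the standard answer to breaking this chain.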

By @mcosta - 5 months
Then the trust is in your silicon. Not only the CPU. The network card, hard drive, memory controller, PCI bus...