A tiny self-remaking C program
The article introduces a self-rebuilding C program using a minimal shell script, emphasizing the build process as computation, the importance of caching, and the need for improved security scrutiny in build systems.
The article discusses a novel approach to creating a self-rebuilding C program using a minimal shell script. The author presents a one-file C program that can rebuild itself under specific environmental conditions, particularly with the GNU Coreutils 8.30 or FreeBSD. The script is described as a hack and not intended for serious use. The author reflects on the conceptual framework of build systems, suggesting that the build process should be viewed as a computation similar to execution. This perspective emphasizes the importance of caching intermediate results to enhance execution speed. The author raises questions about integrating testing into the build process, especially in light of recent security concerns, such as the xz backdoor incident. The discussion highlights the need for improved scrutiny in build systems to prevent vulnerabilities while acknowledging the complexity involved in ensuring security.
- The article presents a self-rebuilding C program using a minimal shell script.
- It emphasizes viewing the build process as a computation akin to execution.
- Caching intermediate results is suggested to improve execution speed.
- The author questions the integration of testing into build systems due to security concerns.
- The need for better scrutiny in build processes is highlighted to prevent vulnerabilities.
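One way to read "caching intermediate results" for a self-rebuilding program is a make-style staleness check: recompile only when the source is newer than the cached binary. The sketch below is an illustration under that assumption, not the article's script. It relies on POSIX `stat`, and `needs_rebuild` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper: returns true when `binary` is missing or older
 * than `source`. Assumes POSIX stat(); timestamps are compared at
 * one-second granularity, so very fast edit/build cycles can be missed. */
bool needs_rebuild(const char *source, const char *binary)
{
    struct stat src, bin;

    if (stat(source, &src) != 0)
        return false;   /* no source: nothing we could rebuild from */
    if (stat(binary, &bin) != 0)
        return true;    /* no cached binary yet */
    return src.st_mtime > bin.st_mtime;
}
```

A program could call `needs_rebuild(__FILE__, argv[0])` at startup and invoke the compiler only when it returns true. Timestamps are a cheap but imperfect proxy for "the inputs changed"; hashing the inputs is the stricter alternative.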
Related
Towards Idempotent Rebuilds?
The blog post explores idempotent rebuilds in Debian and Ubuntu packages. It introduces debdistrebuild, aiming to enhance reproducibility by analyzing rebuild differences. Challenges like build paths and dependencies are highlighted, emphasizing trust in binary distributions.
Driving Compilers
The article outlines the author's journey learning C and C++, focusing on the compilation process often overlooked in programming literature. It introduces a series to clarify executable creation in a Linux environment.
Metaprogramming in Bash
Adam Young's blog post on metaprogramming in Bash emphasizes efficient scripting for managing multiple machines, advocating for reusable functions, dynamic variable assignment, and potential Ansible integration for improved automation.
Bootstrappable Builds
Bootstrappable builds improve trust and security in computing by minimizing reliance on opaque binaries. Developers are encouraged to adopt best practices and collaborate through community discussions to address bootstrapping challenges.
Shell Has a Forth-Like Quality (2017)
The blog post compares the Unix shell's Forth-like qualities and higher-order programming with systemd's role in Linux boot processes, advocating for an improved shell while favoring daemontools' modular design.
#if 0
TMP=$(mktemp -d);
c++ -std=c++11 -o "${TMP}/a.out" "${0}" && "${TMP}/a.out" "${@}"; RV=${?};
rm -rf "${TMP}";
exit ${RV};
#endif
#include <iostream>
int main()
{
    std::cout << "Hello, world!\n";
}
(The trailing semicolons in the script part are there to make my editor indent the C++ code properly.)
If you aren't familiar, quines are programs which produce their own source as their only output. They're quite interesting and worth a dive if you haven't explored them before.
My personal favorites are the radiation-hardened variety, which still produce the original pre-modified source even when any single character is removed before the program is run.
I have long been thinking the same. And also: "Running tests is the first stage of deploying in production" etc.
In other words: There is often a whole dependency graph between various stages (install toolchain, install 3rd-party dependencies, do code generation, compile, link/bundle, test, run, …) and each of those stages should ideally be a pure function mapping inputs to outputs. In my experience, we often don't do a good job making this graph explicit, nor do we usually implement build steps as pure functions or think of build systems as another piece of software which needs to be subject to the same quality criteria that we apply to any other line of code we write. As a result, developer experience ends up suffering.
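To make the "explicit dependency graph" idea concrete, here is a minimal sketch in C: each stage lists the stages it depends on, and a depth-first walk produces an order in which every stage runs after its inputs. The `struct stage` layout, `MAX_STAGES`, and `build_order` are hypothetical names chosen for this illustration, not taken from any real build tool.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_STAGES 16

/* One build stage and the indices of the stages it depends on.
 * Every dependency index is assumed to be below the stage count. */
struct stage {
    const char *name;
    size_t deps[MAX_STAGES];
    size_t ndeps;
};

/* Depth-first visit: emit every dependency before the stage itself.
 * state: 0 = unvisited, 1 = in progress, 2 = done. */
static bool visit(const struct stage *g, size_t i, int *state,
                  size_t *order, size_t *count)
{
    if (state[i] == 2)
        return true;
    if (state[i] == 1)
        return false;               /* back edge: the graph has a cycle */
    state[i] = 1;
    for (size_t d = 0; d < g[i].ndeps; d++)
        if (!visit(g, g[i].deps[d], state, order, count))
            return false;
    state[i] = 2;
    order[(*count)++] = i;
    return true;
}

/* Fills `order` with a valid execution order for n stages.
 * Returns false if n exceeds MAX_STAGES or the graph has a cycle. */
bool build_order(const struct stage *g, size_t n, size_t *order)
{
    int state[MAX_STAGES] = {0};
    size_t count = 0;

    if (n > MAX_STAGES)
        return false;
    for (size_t i = 0; i < n; i++)
        if (!visit(g, i, state, order, &count))
            return false;
    return true;
}
```

With stages `compile` (depends on `codegen`), `codegen`, and `test` (depends on `compile`), the walk yields codegen, compile, test. Once the graph is explicit like this, "run the commands in just the right order" stops being tribal knowledge.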
Large JavaScript projects are particularly bad in this regard. Dependencies, auto-generated code and build output live right alongside source code, essentially making them global state from the point of view of the build system. The package.json contains dozens of "run" commands which more often than not are an arcane mix of bash scripts invoking JS code. Even worse, those commands typically need to be run in juuust the right order because they all operate on the same global state. There is no notion of one command depending on another. No effort put into isolating build tasks from each other and making them pure. No caching of intermediate results. Pipelines take ages even though they wouldn't have to. Developers get confused because a coworker introduced a new npm command yesterday which now needs to be run before anything else. Ugghhh.
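If a build step really is a pure function of its inputs, its result can be cached under a key derived from those inputs. A minimal sketch in C, assuming the key is a 64-bit FNV-1a hash of the command line plus the input bytes; `step_key` is a hypothetical name, and a real system would likely use a cryptographic hash and also cover the toolchain version and environment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FNV_OFFSET_BASIS 0xcbf29ce484222325ULL
#define FNV_PRIME        0x100000001b3ULL

/* 64-bit FNV-1a over a byte buffer, continuing from hash state `h`. */
uint64_t fnv1a64(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= FNV_PRIME;
    }
    return h;
}

/* Hypothetical cache key for one build step: hash of the command line,
 * a zero-byte separator, and the input contents. */
uint64_t step_key(const char *command, const void *input, size_t input_len)
{
    const unsigned char sep = 0;
    uint64_t h = fnv1a64(FNV_OFFSET_BASIS, command, strlen(command));

    h = fnv1a64(h, &sep, 1);
    return fnv1a64(h, input, input_len);
}
```

A step whose key already has a stored output can be skipped. Changing either the command or the input changes the key, so a stale result is not reused.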
It's an interesting exercise working out what is both a comment in the target language but is also an executable shell script. For C it's reasonably straightforward but for OCaml it's quite subtle:
https://libguestfs.org/nbdkit-cc-plugin.3.html#C-plugin-as-a... https://libguestfs.org/nbdkit-cc-plugin.3.html#Using-this-pl...
I'm guessing that this (IBM) example is setting the delimiter to '@@' to avoid problems with the comment - JCL also understands the '/*' sequence. I've not seen it used with other languages (COBOL etc.).
//jobname JOB acctno,name...
//COMPILE EXEC PGM=CCNDRVR,
// PARM='/SEARCH(''CEE.SCEEH.+'') NOOPT SO OBJ'
//STEPLIB DD DSNAME=CEE.SCEERUN,DISP=SHR
// DD DSNAME=CEE.SCEERUN2,DISP=SHR
// DD DSNAME=CBC.SCCNCMP,DISP=SHR
//SYSLIN DD DSNAME=MYID.MYPROG.OBJ(MEMBER),DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSIN DD DATA,DLM=@@
#include <stdio.h>
⋮
int main(void)
{
/* comment */
⋮
}
@@
//SYSUT1 DD DSN=...
⋮
//*
https://en.wikipedia.org/wiki/Job_Control_Language#In-stream...
*Debian is pretty far off from this vision (if we also want performant execution), but I wonder how Gentoo, Arch Linux and Nix fare in this regard. Is this something that could be viably built with their current packaging formats?
/*usr/bin/env echo 'Hello World!' #*/
Even if, for some reason, there are multiple `usr` folders, the use of `env` means it will eventually call the executable.
As for getting rid of the shebang - swapping the ! with a / means that the line and character counts don't change so you get meaningful error messages.
https://en.wikipedia.org/wiki/C_shell
It seems like you are on your way to making the C++ shell.
"$0".bin: -c: line 0: unexpected EOF while looking for matching `''
"$0".bin: -c: line 1: syntax error: unexpected end of file
That's the one I've been using. Feel free to adopt and change.