August 12th, 2024

Book – Writing a C Compiler: Build a Real Programming Language from Scratch 2024

Writing a C Compiler by Nora Sandler, releasing in July 2024, is a 792-page guide for building a C compiler, covering lexing, parsing, and code generation with a practical approach.


Writing a C Compiler by Nora Sandler is a comprehensive guide aimed at demystifying the process of compiler construction for programmers. Set to be released in July 2024, the book spans 792 pages and is designed for readers with no prior experience in compiler design or assembly language. It offers a hands-on approach, allowing readers to build a compiler for a significant subset of the C programming language. The book covers essential topics such as lexing, parsing, program analysis, code generation, and optimization techniques. Each chapter introduces new features, gradually enhancing the compiler's capabilities. The algorithms are presented in pseudocode, so they can be implemented in a variety of programming languages. Sandler, a software engineer with a background in computer science, aims to make the complex subject of compilers accessible and engaging. The book has received positive reviews for its practical approach and thoroughness, making it suitable for both beginners and experienced developers looking to deepen their understanding of compilers.

- The book provides a step-by-step guide to building a C compiler from scratch.

- It covers key concepts like lexing, parsing, and code generation.

- Algorithms are presented in pseudocode for flexibility in implementation.

- The author emphasizes a practical approach to compiler design.

- Positive reviews highlight the book's accessibility and thoroughness.
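
To make the three stages named above concrete, here is a minimal sketch of lexing, parsing, and code generation for one tiny program shape, `int <name>(void) { return <constant>; }`. It is not taken from the book, whose algorithms are given in pseudocode. The function names (`lex`, `parse`, `codegen`, `compile_source`), the token categories, and the choice of x86-64 AT&T-syntax output are all assumptions made for illustration, and Python 3 is used only because it is compact.

```python
import re

# Token patterns, tried in order. This tiny set is an assumption for the sketch.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:int|void|return)\b"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("CONST", r"[0-9]+\b"),
    ("PUNCT", r"[(){};]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def lex(source):
    """Lexing: turn source text into a list of (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r}")
        if m.lastgroup != "SKIP":  # whitespace is matched but not kept
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse(tokens):
    """Parsing: check the token sequence and build a very small AST (a dict)."""
    expected = [
        ("KEYWORD", "int"), ("IDENT", None), ("PUNCT", "("), ("KEYWORD", "void"),
        ("PUNCT", ")"), ("PUNCT", "{"), ("KEYWORD", "return"), ("CONST", None),
        ("PUNCT", ";"), ("PUNCT", "}"),
    ]
    if len(tokens) != len(expected):
        raise SyntaxError("unexpected number of tokens")
    for (kind, text), (want_kind, want_text) in zip(tokens, expected):
        if kind != want_kind or (want_text is not None and text != want_text):
            raise SyntaxError(f"unexpected token {text!r}")
    return {"name": tokens[1][1], "return_value": int(tokens[7][1])}


def codegen(func):
    """Code generation: emit x86-64 assembly that returns the constant in %eax."""
    return "\n".join([
        f"    .globl {func['name']}",
        f"{func['name']}:",
        f"    movl ${func['return_value']}, %eax",
        "    ret",
    ]) + "\n"


def compile_source(source):
    """Run the three stages in order."""
    return codegen(parse(lex(source)))


if __name__ == "__main__":
    print(compile_source("int main(void) { return 2; }"), end="")
```

The emitted text uses the plain symbol name, which suits Linux toolchains; macOS expects a leading underscore (`_main`). A real compiler would replace the fixed token list in `parse` with a recursive-descent or generated parser so that it can grow with the language.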

AI: What people are saying
The comments on "Writing a C Compiler" by Nora Sandler reveal a mix of experiences and opinions from readers.
  • Readers appreciate the practical approach of the book, noting it encourages experimentation and hands-on learning.
  • There is a discussion about the implementation language (OCaml) and its accessibility for C programmers, with some suggesting that a simpler language might be more suitable for beginners.
  • Several comments highlight the book's focus on code generation, which is often overlooked in other compiler resources.
  • Some readers express a desire for additional resources or books that cover related topics, such as building a debugger or a C++ compiler.
  • The book's structure, including test suites for each chapter, is praised for enhancing the learning experience.
22 comments
By @signaru - 8 months
Have read the first few chapters and it expects that you either read the accompanying source code or implement your own and pass the tests. The pseudocode presented in the book often looks like function calls, with the inner details left out of the book. Furthermore, as already pointed out in another comment, the available implementation is in OCaml, which is probably not something many C programmers have experience with.

Nevertheless, I think I'm learning more from this book than most other books I've tried before that are far more theoretical or abstract. I'm still eager to reach the chapter on implementing C types. I think it's a good book, but it requires more effort than something like Crafting Interpreters or Writing a Compiler/Interpreter in Go, while also covering topics not in those books.

By @synack - 8 months
I’ve been working through this book implementing the compiler in Ada. So far, I’m really enjoying it. The book doesn’t make too many assumptions about implementation details, leaving you free to experiment and fill in the blanks yourself.

It feels like a more advanced version of Crafting Interpreters.

I haven’t looked at the OCaml implementation at all. The text and unit tests are all you need.

Discussion on the Ada Forum: https://forum.ada-lang.io/t/writing-a-c-compiler/1024

By @francogt - 8 months
I see many comments saying that the book implements the C compiler in OCaml. In the introduction the author states that the book actually uses pseudocode, so you are free to implement it in any language. The only recommendation is that you use a language with pattern matching, because the pseudocode makes heavy use of it. The reference implementation is in OCaml.
By @WalterBright - 8 months
I learned how to write a compiler by studying BYTE magazine in the 70's which published the source to a complete Pascal compiler as an article!

https://archive.org/details/byte-magazine-1978-09 (part 1)

All 3 parts of Tiny Pascal:

https://albillo.hpcalc.org/publications/Easter%20Egg%20-%20T...

By @hasbot - 8 months
So what's different about writing a compiler in 2024 than say 10, 20, or 30 years ago? When I started writing compilers in the 80's and 90's lex/flex and yacc/bison were popular. ANTLR came out but I never had a chance to use it. Everything after lexing and parsing was always hand rolled.
By @jerjerjer - 8 months
I uh misread the title and thought someone built a C compiler in Scratch.

On topic, though: wouldn't a simpler language (maybe even a pseudo language) be a better target for a first learning compiler? I understand they don't build a full C compiler, but still. It looks to me like there's a lot of complexity added by choosing such a lofty target.

By @fuhsnn - 8 months
chibicc[0] complements this book nicely. In addition to a basic compiler, it guides you through writing the preprocessor and driver, which, although not addressed much in the literature, are the missing link between the compiler built from the book and real C projects.

[0] https://github.com/rui314/chibicc

By @carom - 8 months
I took a compilers course in university and the course culminated in having a compiler for C Minus (a subset of C). The professor noted how each year the line count of the compilers was dropping as students found libraries or languages that made it easier. I think the evolution was Java -> ANTLR -> Python. I used OCaml and emitted LLVM and blew that metric out of the water.
By @the_panopticon - 8 months
In OCaml, interesting. I was similarly surprised when I learned that the first Rust compiler was written in OCaml, too: https://users.rust-lang.org/t/understanding-how-the-rust-com...
By @Coolbeanstoo - 8 months
This looks cool, been interested in learning more about compilers since I did the basics in college. Lots of things seem to focus on making interpreters and never make it to the code generation part, so it's nice to see that this features information about that.
By @shoggouth - 8 months
It also will be available via Amazon after August 20, 2024.

https://www.amazon.com/Writing-Compiler-Programming-Language...

By @sergius - 8 months
By @tzs - 8 months
I don't really need to know how to build a compiler, and I've got enough other "don't need but am doing out of curiosity" things going on that I don't need any more of those, but if it wasn't $70 I'd probably get it anyway. It would be interesting to compare it to the last compiler-building book I read and see how things have changed. Based on the comments here a lot has changed.

That last book was Allen Holub's "Compiler Design in C", which is from 1990. Here's how the blurb on the back describes it:

> Allen I. Holub's Compiler Design in C offers a comprehensive, new approach to compilers that proves to be more accessible to computer science students than the other strictly mathematical books.

> With this method in mind, the book features three major aspects:

> (1) The author develops fully functional versions of lex and yacc (tools available in the UNIX® operating system to write compilers), (2) he uses lex and yacc to develop a complete C compiler that includes parts of C that are normally left out of compiler design books (eg., the complete C "type" system, and structures), and (3) the version of yacc developed here improves on the UNIX version of yacc in two ways (error recovery and the parser, which automatically produces a window-oriented debugging environment in which the parse and value stacks are visible).

It's out of print, but the author has made a searchable PDF available on his website [1]. I found it quite useful.

Holub seems to like the "learn by doing" approach. He's got another book, "Holub on Patterns" that teaches all the design patterns from the gang of four book organically by developing two programs that together use all of those patterns. The two programs are an embedded SQL interpreter and a GUI application for Conway's Game of Life.

PS: Ooh. It occurred to me that No Starch Press books are often available on O'Reilly Learning. I checked and this one is there. So I guess it is going on my "don't need but am doing out of curiosity" pile after all.

[1] https://holub.com/compiler/

By @whartung - 8 months
What approach does this book take to error recovery?

Several "compiler light" style articles and books kind of walk over that part, and it can be non-trivial to do properly, especially with modern expectations.

I remember way back in the day, one of the early C compilers for the PDP, and, honestly, it could almost be argued that ed(1) had better error messages than what that thing produced.

A lot of simple compilers hit an error and just give up.

So, just curious what the approach was in this book.

By @badsectoracula - 8 months
Weird that this is about building a C compiler[0] in OCaml. I expected the implementation language to also be C, both for consistency and because I'm willing to bet that there are more people who can read C than OCaml.

[0] actually from the readme in the github repo[1] it seems to be a C subset, not all of C

[1] https://github.com/nlsandler/nqcc2

By @quibono - 8 months
I swear I've seen this cover before... is this a new release or an updated edition of an older book?
By @sunday_serif - 8 months
I’m working through this book now and really enjoying it!

Each chapter of the book includes a test suite to run against the code you’ve written.

In some ways, the tests in this book feel very similar to the labs in the book Computer Systems: A Programmer's Perspective, which is high praise!

By @alok-g - 8 months
I would love to see a book that talks about going all the way to generate machine code, i.e., not stopping at generation of assembly.

Alternatively, I would like to learn about not just how to make a compiler, but also simultaneously a debugger, hot-reloading, etc.

By @sim7c00 - 8 months
cool, remember some tutorials online i think from the same author (not 100% sure) doing stuff around c compilation in python. shame it's not in a language i want to learn. the other book on compilers i got is almost too heavy to lift! :D
By @i_don_t_know - 8 months
Somewhat unrelated: Is there a book that walks you through building a database system from storage to queries, optimizer, execution, indexing, transactions, etc?
By @sylware - 8 months
I wonder why there is not the same book for c++... mmmmh... I really wonder... (irony).
By @viraj_shah - 8 months
Dropping this one here! (no affiliation)

https://www.linuxfromscratch.org/

"Linux From Scratch (LFS) is a project that provides you with step-by-step instructions for building your own custom Linux system, entirely from source code."