July 1st, 2024

Integrated assembler improvements in LLVM 19

LLVM 19 brings significant enhancements to the integrated assembler, focusing on the MC library for assembly, disassembly, and object file formats. The changes include optimized fragment sizes, streamlined symbol handling, and simplified expression evaluation, and they aim to improve performance, reduce memory usage, and lay the groundwork for future enhancements.

LLVM 19 introduces significant improvements to the integrated assembler, focusing on the MC library responsible for assembly, disassembly, and object file formats. The latest release cycle reworked MC's internal representation, improving performance and reducing compile times. Changes include merging MCAsmLayout into MCAssembler, reducing fragment object sizes, switching fragment management to a singly-linked list, and streamlining symbol handling. Section handling has been refined by eliminating the section stack in favor of a "current fragment" concept, and expression evaluation has been simplified by moving from lazy to eager fragment relaxation. Together these optimizations improve performance, reduce memory usage, and simplify the codebase. They pave the way for future enhancements and have already shown notable compile-time reductions and efficiency gains, particularly in conjunction with tools like BOLT. Further work is planned to address symbol redefinition issues and to strengthen the Mach-O assembler implementation.
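
As a rough illustration of the fragment-related changes, here is a minimal standalone sketch (simplified types, not LLVM's actual MCFragment/MCSection classes) of an intrusive singly-linked fragment list with a single eager layout pass that assigns offsets:

    // Minimal sketch, not LLVM code: an intrusive singly-linked fragment list
    // plus one eager forward pass that assigns offsets within a section.
    #include <cstdint>
    #include <cstdio>

    struct Fragment {
      Fragment *Next = nullptr;      // intrusive singly-linked list link
      std::uint64_t Offset = 0;      // offset within the section, set during layout
      std::uint32_t Size = 0;        // fragment size after relaxation
    };

    struct Section {
      Fragment *Head = nullptr;
      Fragment *Tail = nullptr;      // the "current fragment" new data is appended to

      void append(Fragment *F) {
        if (!Head)
          Head = F;
        else
          Tail->Next = F;
        Tail = F;
      }

      // Eager layout: walk the list once and assign offsets.
      std::uint64_t layout() {
        std::uint64_t Off = 0;
        for (Fragment *F = Head; F; F = F->Next) {
          F->Offset = Off;
          Off += F->Size;
        }
        return Off;                  // total section size
      }
    };

    int main() {
      Section S;
      Fragment A, B;
      A.Size = 16;
      B.Size = 8;
      S.append(&A);
      S.append(&B);
      std::printf("section size = %llu\n", (unsigned long long)S.layout());
      return 0;
    }

Keeping only head and tail pointers per section and a single forward link per fragment keeps fragments small, makes appending constant-time, and fits the "current fragment" idea: new data always goes into the tail fragment.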

Related

Own Constant Folder in C/C++

Neil Henning discusses precision issues in clang when using the sqrtps intrinsic with -ffast-math and suggests inline assembly to control instruction selection. He introduces a workaround using __builtin_constant_p so that calls with compile-time-constant arguments can still be constant-folded.
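
A minimal sketch of that pattern (hypothetical code, not the article's exact example): inline assembly pins the instruction choice for runtime values, while __builtin_constant_p routes compile-time-constant arguments to a builtin the compiler can fold:

    // Hypothetical sketch of the __builtin_constant_p pattern (GCC/Clang, x86 SSE only).
    #include <cstdio>

    static inline float precise_sqrt(float x) {
      if (__builtin_constant_p(x))
        return __builtin_sqrtf(x);   // constant arguments can fold at compile time
      float r;
      __asm__("sqrtss %1, %0" : "=x"(r) : "x"(x));   // force the exact instruction for runtime values
      return r;
    }

    int main() {
      std::printf("%f %f\n", precise_sqrt(2.0f), precise_sqrt(3.0f));
      return 0;
    }

At -O2, calls with literal arguments can be folded away entirely, while other calls still emit the hand-picked instruction.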

How to run an LLM on your PC, not in the cloud, in less than 10 minutes

You can set up and run large language models (LLMs) on your PC using tools like Ollama, LM Suite, and Llama.cpp. Ollama installs straightforwardly across different systems, runs on AVX2-compatible CPUs, now supports select AMD Radeon GPUs, and offers commands for managing models.

Why are module implementation and signatures separated in OCaml? (2018)

Separating module implementations from their signatures in OCaml enables scalable builds via compiled interface (.cmi) files and streamlines interface changes. Emphasizing the separation of abstraction from implementation supports modular programming and reasoning about systems.

Claude 3.5 Sonnet

Anthropic introduces Claude 3.5 Sonnet, a fast and cost-effective large language model with new features like Artifacts. Human evaluations show significant improvements, and privacy and safety evaluations are discussed. The model's impact on engineering and coding capabilities is explored, along with recursive self-improvement in AI development.

Meta Large Language Model Compiler

Large Language Models (LLMs) are utilized in software engineering but underused in code optimization. Meta introduces the Meta Large Language Model Compiler (LLM Compiler) for code optimization tasks. Trained on LLVM-IR and assembly code tokens, it aims to enhance compiler understanding and optimize code effectively.

5 comments
By @aengelke - 4 months
Nice summary! Additional changes I have planned:

- Removing per-instruction timers, which add a measurable overhead even when disabled (https://github.com/llvm/llvm-project/pull/97046)

- Splitting AsmPrinterHandler (used for unwind info) and DebugHandler (used also for per-instruction location information) to avoid two virtual function calls per instruction (https://github.com/llvm/llvm-project/pull/96785)

- Remove several maps from ELFObjectWriter, including some std::map (changed locally, need to make PR)

- Faster section allocation, remove ELF "mergeable section info" hash maps (although this is called just ~40 times per object file, it is very measurable in JIT use cases when compiling many small objects) (planned)

- X86 encoding in general; this consumes quite some time and looks very inefficient -- having written my own x86 encoder, I'm confident that there's a lot of improvement potential. (not started)

Some takeaways on a higher level -- most of these aren't really surprising, but they are nonetheless very frequent problems (and patterns) in the LLVM code base:

- Maps/hash maps/sets are quite expensive when used frequently, and can sometimes be easily avoided, e.g., with a vector or, for pointer keys, a plain pointer dereference (see the sketch after this list)

- Virtual function (abstraction) calls come at a cost, especially when made frequently

- raw_svector_ostream is slow, because writes are virtual function calls and don't get inlined (I previously replaced raw_svector_ostream with a SmallVector&: https://reviews.llvm.org/D145792)

- Frequent heap allocations are costly, especially with glibc's malloc

- Many small inefficiencies add up (=> many small improvements do, too)
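
A standalone sketch of the first takeaway (illustrative only, with made-up names like Symbol::Index, not LLVM code): data keyed by a pointer can often live as a field on the pointed-to object, turning a hash lookup into a plain dereference:

    // Illustrative sketch: per-symbol data as a member field vs. a map keyed
    // by Symbol*. Both variants return the same value; the field variant
    // avoids hashing and probing on every query.
    #include <string>
    #include <unordered_map>
    #include <vector>

    struct Symbol {
      std::string Name;
      unsigned Index = 0;            // data stored directly on the object
    };

    // Map-based lookup: one hash + probe per call.
    unsigned indexViaMap(const std::unordered_map<const Symbol *, unsigned> &Idx,
                         const Symbol *S) {
      return Idx.at(S);
    }

    // Field-based lookup: a single pointer dereference.
    unsigned indexViaField(const Symbol *S) { return S->Index; }

    int main() {
      std::vector<Symbol> Syms(3);
      std::unordered_map<const Symbol *, unsigned> Idx;
      for (unsigned I = 0; I < Syms.size(); ++I) {
        Syms[I].Index = I;           // field variant
        Idx[&Syms[I]] = I;           // map variant
      }
      return indexViaField(&Syms[2]) == indexViaMap(Idx, &Syms[2]) ? 0 : 1;
    }

The trade-off is that the field occupies space in every object even when unused, which is usually acceptable for data queried on hot paths.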

By @Keyframe - 4 months
Side note, but I was looking for pre-built binaries in the LLVM project's releases. Specifically, I was looking for clang+llvm releases for x86_64 Linux (Ubuntu preferably) in order to save some time (I've always had trouble compiling it) and to put them into my own `prefix` directory. It's kind of wild to see aarch64, armv7, powerpc64, x86_64_windows.. but not something like this. I am aware of https://apt.llvm.org/ and its llvm.sh - but as I said, I'd prefer it to live in its own `prefix`. Does anyone know where else there might be pre-builts? There used to be something just like that for v17, like https://github.com/llvm/llvm-project/releases/download/llvmo...
By @mncharity - 4 months
In the first sentence, "[Intro to the LLVM MC Project]" was likely intended to be a link[1].

[1] https://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html

By @matrix_overload - 4 months
TLDR: building projects with Clang is now about 4% faster due to optimizations in the way it internally handles assembly.