August 21st, 2024

Mimalloc Cigarette: Losing one week of my life catching a memory leak (Rust)

The article details a memory leak issue in a pricing engine using mimalloc, revealing that its internal bookkeeping caused memory retention. Restructuring to a single-threaded approach improved memory management.

The article discusses the challenges of debugging a memory leak in a RAM-bound pricing engine that uses the mimalloc memory allocator. The author describes the technical complexities of managing hotel pricing data and the unexpected out-of-memory (OOM) errors that arose despite the dataset fitting comfortably in memory. The investigation revealed that the choice of memory allocator significantly shaped the program's memory behavior: while mimalloc is designed for performance, it led to excessive memory retention during data refresh operations. The author spent considerable time analyzing the code, suspecting issues with the Rust implementation rather than the allocator itself. Ultimately, the problem was traced to mimalloc's internal bookkeeping, which deferred releasing memory owned by threads that had gone to sleep. The solution was to restructure the program so that all refresh work ran on a single thread, ensuring allocations and frees happened on the same heap. The experience was both frustrating and enlightening, highlighting the importance of understanding allocator behavior in performance-sensitive applications.
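
As a rough sketch (not the article's actual code), this is roughly the shape being described, with the real `mimalloc` crate installed as the global allocator; `refresh_dataset` and the dataset type are invented for illustration:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

use mimalloc::MiMalloc;

#[global_allocator]
static GLOBAL: MiMalloc = MiMalloc;

// Invented stand-in for "rebuild the dataset and swap it in"; the old
// snapshot is dropped (freed) by whichever thread performs the swap.
fn refresh_dataset(data: &RwLock<Vec<u8>>) {
    let fresh = vec![0u8; 256 * 1024 * 1024];
    *data.write().unwrap() = fresh;
}

fn main() {
    let data = Arc::new(RwLock::new(Vec::new()));

    // Problematic shape: every refresh runs on a new thread, so the old
    // snapshot is always freed by a different thread than the one that
    // allocated it. mimalloc only queues such frees for delayed reclaim by
    // the owning thread, so resident memory can keep growing.
    for _ in 0..3 {
        let d = Arc::clone(&data);
        thread::spawn(move || refresh_dataset(&d)).join().unwrap();
    }

    // Shape of the fix described in the article: keep all refresh work on one
    // long-lived thread, so each allocation is freed by the thread that made it.
    let d = Arc::clone(&data);
    thread::spawn(move || {
        for _ in 0..3 {
            refresh_dataset(&d);
        }
    })
    .join()
    .unwrap();
}
```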

- The author faced a memory leak issue in a pricing engine application using mimalloc.

- The choice of memory allocator significantly affected the application's memory usage.

- Debugging revealed that mimalloc's bookkeeping could lead to memory not being released when threads were inactive.

- The solution involved restructuring the code to manage memory refresh operations on a single thread.

- The experience underscored the complexities of memory management in performance-critical applications.

16 comments
By @hinkley - 6 months
We had learned helplessness on a drag and drop bug in jQuery UI. I had like three hours every second or third Friday and would just step through the code trying to find the bug. That code was so sketchy the jQuery team was trying to rewrite it from scratch one component at a time, and wouldn’t entertain any bug discussions on the old code even though they were a year behind already.

After almost six months, I finally found a spot where I could monkey-patch a function to wrap it with a short circuit if the coordinates were out of bounds. Not only did it fix the bug, it made drag and drop several times faster. Couldn’t share this with the world because they weren’t accepting PRs against the old widgets.

I’ve worked harder on bug fixes, but I think that’s the longest I’ve worked on one.

By @kibwen - 6 months
Level 1 systems programmer: "wow, it feels so nice having control over my memory and getting out from under the thumb of a garbage collector"

Level 2 systems programmer: "oh no, my memory allocator is a garbage collector"

By @Arnavion - 6 months
jemalloc also has its own funny problem with threads - if you have a multi-threaded application that uses jemalloc on all threads except the main thread, then the cleanup that jemalloc runs on main thread exit will segfault. In $dayjob we use jemalloc as a sub-allocator in specific arenas. (*) The application itself is fine in production because it allocates from the main thread too, but the unit test framework only runs tests in spawned threads and the main thread of the test binary just orchestrates them. So the test binary triggers this segfault reliably.

(https://github.com/jemalloc/jemalloc/issues/1317; unlike what the title says, it's not Windows-specific.)

(*): The application uses libc malloc normally, but at some places it allocates pages using `mmap(non_anonymous_tempfile)` and then uses jemalloc to partition them. jemalloc has a feature called "extent hooks" where you can customize how jemalloc gets underlying pages for its allocations, which we use to give it pages via such mmaps. Then the higher layers of the code that just want to allocate don't have to care whether those allocations came from libc malloc or an mmap-backed disk file.
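
A minimal sketch of the mmap half of that arrangement in Rust (the path, sizes, and function name are invented, it uses the libc crate, and the actual extent-hook registration that hands these pages to jemalloc is deliberately omitted):

```rust
use std::fs::OpenOptions;
use std::os::unix::io::AsRawFd;
use std::ptr;

// Map a non-anonymous temp file with MAP_SHARED so the pages are disk-backed,
// then let a sub-allocator carve them up. This only shows where the pages
// would come from, not how jemalloc is told to use them.
fn map_tempfile_pages(path: &str, len: usize) -> *mut u8 {
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path)
        .expect("open temp file");
    file.set_len(len as u64).expect("size temp file");

    let addr = unsafe {
        libc::mmap(
            ptr::null_mut(),
            len,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_SHARED,
            file.as_raw_fd(),
            0,
        )
    };
    assert_ne!(addr, libc::MAP_FAILED);
    addr as *mut u8
}
```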

By @CraigJPerry - 6 months
Tangent: what’s the ideal data structure for this problem?

If there were 20 million rooms in the world with a price for each day of the year, we’d be looking at around 7 billion prices per year. That’d be, say, 4TB of storage without indexes.

The problem space seems to have a bunch of options to partition - by locality, by date etc.

I’m curious if there’s a commonly understood match for this problem?

FWIW, with that dataset size, my first experiments would be with SQL server, because that data will fit in RAM. I don’t know if that’s where I’d end up, but I’m pretty sure it’s where I’d start my performance testing when grappling with this problem.
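
Not from the thread, but as one hypothetical point of reference for the arithmetic: a dense flat array with one integer price per room per day (fixed-point minor currency units, in the spirit of the no-floats-for-money advice elsewhere in the thread) would look roughly like this; the type and sizes are made up:

```rust
// Hypothetical compact layout: one u32 price in minor currency units per
// room per day. 20,000,000 rooms * 366 days * 4 bytes ≈ 29 GB of raw prices,
// so anything much larger would come from keys, indexes, and per-row overhead.
struct PriceGrid {
    days: usize,      // e.g. 366 slots, one per calendar day
    prices: Vec<u32>, // row-major: prices[room_index * days + day_index]
}

impl PriceGrid {
    fn price(&self, room_index: usize, day_index: usize) -> u32 {
        self.prices[room_index * self.days + day_index]
    }
}
```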

By @loeg - 6 months
Sort of tl;dr: mimalloc doesn't actually free memory in a way that it can be reused on threads other than the one that allocated it; the free call marks regions for eventual delayed reclaim by the original thread. If the original thread calls malloc again, those regions are collected (roughly one in every N malloc calls). Or (C) you can explicitly invoke mi_collect[1] in the allocating thread (the Rust crate does not seem to expose this API).

[1]: https://github.com/microsoft/mimalloc/blob/dev/src/heap.c#L1...
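
If you did want to call it from Rust, a hand-written extern declaration is one way, assuming the mimalloc library is already linked into the binary (for example via libmimalloc-sys building it for the global allocator); the declaration and the surrounding function here are written for illustration, not taken from any crate:

```rust
// mi_collect(force) is part of mimalloc's C API (see the linked heap.c);
// this extern block is hand-written for the sketch.
extern "C" {
    fn mi_collect(force: bool);
}

fn refresh_on_this_thread() {
    // ... allocate and free the refreshed data on this thread ...

    // Reclaim this thread's deferred frees before it goes idle. Per the
    // comment above, this only helps when called on the allocating thread.
    unsafe { mi_collect(true) };
}
```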

By @rurban - 6 months
The Annotated C++ Reference Manual:

“C programmers think memory management is too important to be left to the computer. LISP programmers think memory management is too important to be left to the user.”

By @IceTDrinker - 6 months
PSA: do not use floating point for monetary amounts
By @zokier - 6 months
I wonder if there is something that could be done at the language-design level to have better "sympathy" for memory allocation, i.e. built upon having mmap/munmap as primitives instead of malloc/free, where language patterns are built around allocating pages instead of arbitrarily sized objects. Probably not practical for general high-level languages, but for e.g. embedded or high-performance stuff it might make sense?
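
A toy sketch of that idea (not a real allocator design; Unix-only, uses the libc crate, and ignores alignment for brevity): whole pages come from mmap, objects are bump-allocated out of them, and the only release path is munmap of the whole arena.

```rust
use std::ptr;

struct PageArena {
    base: *mut u8,
    len: usize,
    used: usize,
}

impl PageArena {
    fn new(pages: usize) -> PageArena {
        let len = pages * 4096; // assume 4 KiB pages for the sketch
        let base = unsafe {
            libc::mmap(
                ptr::null_mut(),
                len,
                libc::PROT_READ | libc::PROT_WRITE,
                libc::MAP_PRIVATE | libc::MAP_ANONYMOUS,
                -1,
                0,
            )
        };
        assert_ne!(base, libc::MAP_FAILED);
        PageArena { base: base as *mut u8, len, used: 0 }
    }

    // Bump-allocate a slice out of the mapped pages (no per-object free).
    fn alloc(&mut self, size: usize) -> *mut u8 {
        assert!(self.used + size <= self.len, "arena exhausted");
        let p = unsafe { self.base.add(self.used) };
        self.used += size;
        p
    }
}

impl Drop for PageArena {
    fn drop(&mut self) {
        // The whole arena goes back to the OS at once: munmap, not free().
        unsafe { libc::munmap(self.base as *mut libc::c_void, self.len) };
    }
}
```
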
By @PaulDavisThe1st - 6 months
A perfect demonstration of how many of the harder problems we face writing (especially non-browser-based) software are in fact not addressed by language changes.

The concept of memory that is allocated by a thread and can only be deallocated by that thread is useful and valid, but as TFA demonstrates, can also cause problems if you're not careful with your overall architecture. If the language you're using even allows you to use this concept, it almost certainly will not protect you from having to get the architecture correct.

By @znpy - 6 months
> Allocators have different characteristics for a reason - they do some things differently between each other. What do you think mimalloc does that could account for this behavior?

Interestingly, it would seem that Java programmers play with garbage collectors while Rust programmers play with memory allocators.

By @malkia - 6 months
By @Exuma - 6 months
I really love the design of this blog
By @bsder - 6 months
Welcome to systems programming. Allocators are invisible--until they aren't.
By @om8 - 6 months
TLDR: use shitty allocators, win shitty memory leaks