July 7th, 2024

Malloc broke Serenity's JPGLoader, or: how to win the lottery

An investigation in SerenityOS traced color distortion in decoded JPG images to a memory allocation change that altered the order in which the decoder iterated over an image's components. The fix made that iteration deterministic.

An investigation into a JPG decoding bug in SerenityOS found that images were being rendered with distorted colors. The bug was traced to a change related to the memory allocation function malloc_good_size(), which inadvertently changed the order in which the JPGLoader code iterated over an image's components. The components were stored in a hash table, whose iteration order is not guaranteed, so the decoder could process them in the wrong order and produce the wrong colors.

The underlying issue had been masked by fortuitous circumstances until the allocation change exposed it. After extensive debugging, a bisection, and multiple rebuilds, a commit fixed the bug by making iteration over the components deterministic. The takeaway is that code must never depend on the iteration order of an unordered container.
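
The failure mode can be sketched in a few lines of C++. This is an illustration only, not SerenityOS's actual JPGLoader code: the `Component` struct and both function names are hypothetical, and `std::unordered_map` stands in for SerenityOS's own hash table type.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a JPEG colour component (not the real SerenityOS type).
struct Component {
    std::uint8_t id;   // component identifier as declared in the image header
    const char* name;  // e.g. "Y", "Cb", "Cr"
};

// Fragile: the iteration order of std::unordered_map is unspecified, so the
// returned order can change with bucket count, hash function, or library version.
std::vector<std::uint8_t> ids_in_hash_order(const std::vector<Component>& header)
{
    std::unordered_map<std::uint8_t, Component> by_id;
    for (const auto& component : header)
        by_id.emplace(component.id, component);
    assert(by_id.size() == header.size()); // this sketch assumes unique ids

    std::vector<std::uint8_t> out;
    for (const auto& entry : by_id)
        out.push_back(entry.first);
    return out;
}

// Deterministic: walk the components in the order the header declared them.
std::vector<std::uint8_t> ids_in_declared_order(const std::vector<Component>& header)
{
    std::vector<std::uint8_t> out;
    for (const auto& component : header)
        out.push_back(component.id);
    return out;
}
```

Calling `ids_in_declared_order` on a header listing Y, Cb, Cr always yields the ids in that order, whereas `ids_in_hash_order` returns the same three ids in whatever order the table happens to store them. A hash table is still fine for lookup by id, as long as the ordered sequence is kept alongside it.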

Related

Chasing a Bug in a SAT Solver

Adolfo Ochagavía and Prefix.dev swiftly resolved a bug in the SAT-based dependency solver, resolvo, with community input. The incident emphasizes open-source collaboration and potential debugging tool enhancements for software quality.

I found an 8 years old bug in Xorg

An 8-year-old Xorg bug related to epoll misuse was found by a picom developer. The bug caused windows to disappear during server lock, traced to CloseDownClient events. Despite limited impact, the developer seeks alternative window tree updates, emphasizing testing and debugging tools.

Mix-testing: revealing a new class of compiler bugs

A new "mix testing" approach uncovers compiler bugs by compiling test fragments with different compilers. Examples show issues in x86 and Arm architectures, emphasizing the importance of maintaining instruction ordering. Luke Geeson developed a tool to explore compiler combinations, identifying bugs and highlighting the need for clearer guidelines.

The weirdest QNX bug I've ever encountered

The author encountered a CPU usage bug in a QNX system's 'ps' utility due to a 15-year-old bug. Debugging revealed a race condition, leading to code modifications and a shift towards open-source solutions.

Four lines of code it was four lines of code

The programmer resolved a CPU utilization issue by removing unnecessary Unix domain socket code from a TCP and TLS service handler. This debugging process emphasized meticulous code review and system interaction understanding.

18 comments
By @dale_glass - 7 months
This is one of the reasons why many hashtable implementations introduce a random component into the algorithm. The order of elements changes on every run, so if you accidentally rely on the order, it's going to go wrong sooner rather than later.

It also very nicely prevents security issues, since if the hashing algorithm is fixed, it can be exploited for denial of service by coming up with keys that all fall into the same bucket.
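
A minimal C++ sketch of the idea, under stated assumptions: `SeededHash`, `random_seed`, and `make_seeded_set` are hypothetical names, and the hash is plain FNV-1a with a seed mixed into its starting state. That is enough to change bucket placement from run to run, but it is not a cryptographic defense; implementations that aim to resist deliberate collisions typically use a keyed hash such as SipHash.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <unordered_set>

// Hypothetical seeded hasher: FNV-1a whose starting state is perturbed by a seed.
struct SeededHash {
    std::uint64_t seed;

    std::size_t operator()(const std::string& key) const
    {
        std::uint64_t h = 14695981039346656037ull ^ seed; // FNV offset basis XOR seed
        for (unsigned char byte : key) {
            h ^= byte;
            h *= 1099511628211ull; // FNV prime
        }
        return static_cast<std::size_t>(h);
    }
};

// Draw a fresh seed, intended to be called once per process or per table.
std::uint64_t random_seed()
{
    std::random_device rd;
    std::uint64_t high = rd();
    std::uint64_t low = rd();
    return (high << 32) | low;
}

using SeededSet = std::unordered_set<std::string, SeededHash>;

// Build a set whose bucket placement depends on the given seed.
SeededSet make_seeded_set(std::uint64_t seed)
{
    return SeededSet(8, SeededHash{seed});
}
```

With `make_seeded_set(random_seed())`, code that accidentally depends on iteration order is likely to misbehave on an early run instead of staying hidden until an unrelated change disturbs the table's layout.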

By @tedunangst - 7 months
This seems like a case where a little more debugging would have saved time over brute force bisection. The logging to print component orders had to be done eventually anyway.
By @elteto - 7 months
Kudos on the debugging but also on that commit message. It managed to condense the cause and the fix into a couple of paragraphs.
By @jeffbee - 7 months
For whatever it's worth, if we wait long enough C++ will include the equivalent of `malloc_good_size`. https://github.com/cplusplus/papers/issues/18
By @Ygg2 - 7 months
Needs [2021] in title
By @russfink - 7 months
This isn’t Gunnar’s fault. The problem was whoever stored ordered data in a hash file.

I have been in this business for decades and I have run into the situation where changing the shape of memory uncovers bugs. Every time it causes many hours and days of debugging.

If programming weren’t hard, they wouldn’t need us to do it. (I’m not sure how much longer that phrase will hold up under large language models.)

By @ddtaylor - 7 months
> As a result, during the 1000 commits I ended up bisecting for, I had to build SerenityOS from scratch about 4-5 times on a 2011 laptop with Sandy Bridge Mobile. While this isn’t the fault of the project, I’m still mad.

I think SerenityOS has some folks that help each other out with resources and PCs for testing purposes.

By @jcelerier - 7 months
> I had to build SerenityOS from scratch about 4-5 times on a 2011 laptop with Sandy Bridge Mobile.

I mean, this is like trying to do Windows Vista development with a computer released in the timeframe between Windows 3.1 and Windows 95

By @userbinator - 7 months
I got Deja Vu upon seeing "Alien Lenna" and sure enough... I've seen and commented on this before: https://news.ycombinator.com/item?id=27374942 (2021)
By @amelius - 7 months
TL;DR:

> Someone used a HashTable to store objects that should be ordered, then iterated over it using the basic HashTable iterator