The weirdest QNX bug I've ever encountered
The author encountered high CPU usage on a QNX system, caused by a 15-year-old bug in the 'ps' utility. Debugging revealed a race condition, leading to code modifications and a shift towards open-source solutions.
The blog post describes the author's encounter with a peculiar bug in firmware updates. The bug caused high CPU usage due to an infinite loop in the 'ps' utility on a QNX system. Through debugging, the author traced the issue to a 15-year-old bug in the closed-source 'ps' binary, which was fixed by modifying the source code of the utility. The bug surfaced due to changes in boot timing, leading to a race condition. The author decided to eliminate the use of 'ps' in non-interactive code to prevent the bug from recurring. Lessons learned include the persistence of old bugs, the impact of subtle changes on bug manifestation, and the importance of loop termination criteria. The closed-source nature of the QNX ecosystem posed challenges in debugging, highlighting the value of open-source solutions. The author's fix prevented the recurrence of the specific update problem. The post concludes with reflections on the challenges of closed-source systems and the need for robust coding practices to avoid similar issues in the future.
Related
Spending 3 months investigating a 7-year old bug and fixing it in 1 line of code
A developer fixed a seven-year-old bug in an iPad accessory causing missed MIDI messages by optimizing a modulo operation. The bug's resolution improved the audio processor's efficiency significantly.
Vulnerability in Popular PC and Server Firmware
Eclypsium found a critical vulnerability (CVE-2024-0762) in Intel Core processors' Phoenix SecureCore UEFI firmware, potentially enabling privilege escalation and persistent attacks. Lenovo issued BIOS updates, emphasizing the significance of supply chain security.
I found an 8 years old bug in Xorg
An 8-year-old Xorg bug related to epoll misuse was found by a picom developer. The bug caused windows to disappear during server lock, traced to CloseDownClient events. Despite limited impact, the developer seeks alternative window tree updates, emphasizing testing and debugging tools.
The Dirty Pipe Vulnerability
The Dirty Pipe Vulnerability (CVE-2022-0847) in Linux kernel versions since 5.8 allowed unauthorized data overwriting in read-only files, fixed in versions 5.16.11, 5.15.25, and 5.10.102. Discovered through CRC errors in log files, it revealed systematic corruption linked to ZIP file headers due to a kernel bug in Linux 5.10. The bug's origin was pinpointed by replicating data transfer issues between processes using C programs, exposing the faulty commit. Changes in the pipe buffer code impacted data transfer efficiency, emphasizing the intricate nature of kernel development and software component interactions.
CVE-2021-4440: A Linux CNA Case Study
The Linux CNA mishandled CVE-2021-4440 in the 5.10 LTS kernel, causing information leakage and KASLR defeats. The issue affected Debian Bullseye and SUSE's 5.3.18 kernel, resolved in version 5.10.218.
"Then QNX was bought, source code access was revoked and the community largely withered away. Questions were increasingly asked via private support tickets directly to QNX, locked away from the public. QNX know-how becomes harder and harder to acquire, open source software for modern QNX releases is essentially non-existent and the driver situation is a catastrophe. The QNX kernel is the most beautiful and interesting kernel I have ever had the pleasure of working with, but it lies in the shackles of corporate ownership."
It's sad.
QNX was originally an independent company. During that period, anyone could get a free copy of QNX for personal use. It wasn't open source, but it was available. It's POSIX-compatible, so it was a supported target for GNU, Firefox, and Eclipse. We used QNX for our DARPA Grand Challenge vehicle in 2003-2005, and all that code was developed on desktop QNX.
Then QNX was acquired by Harman, the successor to Harman Kardon, which once made home audio components and pivoted to car audio. They were thinking car infotainment. Harman didn't really know what to do with an operating system, especially since the big market was systems for industrial control and point of sale. So eventually they opened the source.
Then QNX was acquired by BlackBerry, the early smartphone company. They closed the source, very suddenly. They even killed off the free version for personal and educational use. So all third-party open source development stopped. BlackBerry eventually shipped a phone that ran QNX, but they were not powerful enough as a company to keep a third phone standard going. So BlackBerry went to Android.
BlackBerry killed off the self-hosted desktop environment, and users now had to cross-compile from Windows.
And QNX became more of a niche product than ever.
Their moat is supposedly their ASIL certification, but I see that value shrinking more and more over time for the following reasons:
1. If your product has a software-related failure, customers won't care about all of your certifications. Only the end product.
2. I'm not convinced that the QNX kernel is less buggy than the Linux kernel. Also, most failures don't tend to be kernel related.
Other than that, the blog post was very interesting, I learned a bit of history of QNX, and concluded that I should avoid it.