Real-time audio programming 101: time waits for nothing
Real-time audio programming demands adherence to principles like avoiding blocking operations and prioritizing worst-case execution time to prevent glitches. Buffer sizes of 1-5ms are typical for low-latency software.
Real-time audio programming requires adherence to specific principles to avoid glitches in the audio output. Writing real-time audio software for a general-purpose operating system means ensuring that tasks like disk access or thread synchronization never block the audio processing thread: the code must be suitable for real-time execution and must not call functions that can block for an unbounded time. Digital audio works by playing a constant stream of samples at a fixed rate, and any delay in providing those samples causes a glitch. Buffer sizes of 1-5ms are normal for low-latency audio software, so if the code's execution time ever exceeds the buffer period, the output glitches unpredictably. Avoiding blocking operations, inefficient algorithms, and locking inside the audio callback is crucial to maintaining real-time performance, and reasoning about the code's worst-case execution time, rather than its average, is essential to stable audio output.
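To make the "no blocking in the callback" rule concrete, here is a minimal sketch (my illustration, not code from the article) of handing a parameter to the audio thread through an atomic rather than a mutex, so the callback can never be blocked by a control thread:

    // Sketch: wait-free parameter sharing between a control thread and
    // the real-time audio callback. The f32 is stored as raw bits so it
    // fits in a lock-free AtomicU32 (std only, no external crates).
    use std::sync::atomic::{AtomicU32, Ordering};
    use std::sync::Arc;

    struct SharedParams {
        gain_bits: AtomicU32,
    }

    impl SharedParams {
        fn set_gain(&self, gain: f32) {
            self.gain_bits.store(gain.to_bits(), Ordering::Relaxed);
        }
        fn gain(&self) -> f32 {
            f32::from_bits(self.gain_bits.load(Ordering::Relaxed))
        }
    }

    // Runs on the audio thread: no locks, no allocation, no I/O.
    fn audio_callback(params: &SharedParams, buffer: &mut [f32]) {
        let gain = params.gain();
        for sample in buffer.iter_mut() {
            *sample *= gain;
        }
    }

    fn main() {
        let params = Arc::new(SharedParams {
            gain_bits: AtomicU32::new(1.0f32.to_bits()),
        });
        params.set_gain(0.5); // a UI thread would do this
        let mut buffer = vec![0.25f32; 256];
        audio_callback(&params, &mut buffer);
        assert!((buffer[0] - 0.125).abs() < 1e-6);
    }

A single atomic works for one scalar parameter; larger state is usually handed over with a lock-free SPSC queue instead.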
Related
Spending 3 months investigating a 7-year old bug and fixing it in 1 line of code
A developer fixed a seven-year-old bug in an iPad accessory causing missed MIDI messages by optimizing a modulo operation. The bug's resolution improved the audio processor's efficiency significantly.
Four lines of code: it was four lines of code
The programmer resolved a CPU utilization issue by removing unnecessary Unix domain socket code from a TCP and TLS service handler. This debugging process emphasized meticulous code review and system interaction understanding.
C++ patterns for low-latency applications including high-frequency trading
Research paper explores C++ Design Patterns for low-latency applications, focusing on high-frequency trading. Introduces Low-Latency Programming Repository, optimizes trading strategy, and implements Disruptor pattern for performance gains. Aimed at enhancing latency-sensitive applications.
One practical reality it doesn't share is that your audio processing (or generation) code is often going to be running on a bus shared with a ton of other modules, so you don't have the luxury of using "5.6ms" as your deadline for a 5.6ms buffer. Your responsibility, often, is just to be as performant as reasonably possible so that everything on the bus can be processed in those 5.6ms. The pressure is usually much higher than the buffer length suggests.
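As a rough back-of-the-envelope illustration (the module count and headroom figures here are hypothetical, not from the article or the comment), the per-module slice of a shared buffer period is tiny:

    // Hypothetical budget math: N modules sharing one buffer period
    // each get only a fraction of the deadline.
    fn per_module_budget_us(buffer_ms: f64, modules: usize, headroom: f64) -> f64 {
        buffer_ms * 1000.0 * headroom / modules as f64
    }

    fn main() {
        // 256 frames at 48 kHz is roughly 5.3 ms; assume 20 modules on
        // the bus and keep 25% of the period as safety margin.
        println!("{:.0} us per module", per_module_budget_us(5.33, 20, 0.75));
        // prints "200 us per module", far tighter than the raw 5.3 ms
    }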
The cpal library in Rust is excellent for developing cross-platform desktop applications. I'm currently maintaining this library:
https://github.com/chaosprint/asak
It's a cross-platform audio recording/playback CLI tool with a TUI. The source code is very simple to read. PRs are welcome, and I really hope Linux users can help test and review new PRs :)
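For a flavor of the API, here's a minimal sine-wave output sketch with cpal (assumptions: cpal ~0.15, whose build_output_stream takes an optional timeout argument, and an f32 default output format; this is illustrative, not code from asak):

    // Minimal cpal output stream playing a 440 Hz sine for two seconds.
    use cpal::traits::{DeviceTrait, HostTrait, StreamTrait};

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let host = cpal::default_host();
        let device = host.default_output_device().ok_or("no output device")?;
        let config = device.default_output_config()?.config();
        let sample_rate = config.sample_rate.0 as f32;
        let channels = config.channels as usize;

        let mut phase = 0.0f32;
        let stream = device.build_output_stream(
            &config,
            move |data: &mut [f32], _: &cpal::OutputCallbackInfo| {
                // Real-time callback: no locks, no allocation, no I/O.
                for frame in data.chunks_mut(channels) {
                    let sample = (phase * std::f32::consts::TAU).sin() * 0.2;
                    phase = (phase + 440.0 / sample_rate).fract();
                    for out in frame {
                        *out = sample;
                    }
                }
            },
            |err| eprintln!("stream error: {err}"),
            None, // no callback timeout
        )?;
        stream.play()?;
        std::thread::sleep(std::time::Duration::from_secs(2));
        Ok(())
    }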
When developing Glicol (https://glicol.org), I documented my experience of "fighting" with real-time audio in the browser in this paper:
https://webaudioconf.com/_data/papers/pdf/2021/2021_8.pdf
Throughout the process, Paul Adenot's work was immensely helpful. I highly recommend his blog:
https://blog.paul.cx/post/profiling-firefox-real-time-media-...
I am currently writing a wasm audio module system, and hope to publish it here soon.
If your tempo drifts, then you're not going to hear the rhythm correctly. If you have a bit of latency on your instrument, it's like turning on a delay pedal where the only signal coming through is the delay.
One might assume that if you just follow audio programming guides you can do all this, but you still need to have your system set up to handle real-time audio, in addition to your program.
It's all noticeable.
It's worth noting that these live-performance scenarios are practically the only case where extreme real-time audio programming measures are necessary.
If you're making, for example, a video game, the requirements aren't actually that steep. You can trivially trade latency for consistency: you don't need to do all your audio processing inside a 5ms window, you just need to provide an audio buffer every 5 milliseconds, and you can easily queue up N buffers to smooth out any variance.
Highly optimized competitive video games average like ~100ms of audio latency [1]. Some slightly better. Some in the 150ms and even 200ms range. Input latency is hyper optimized, but people rarely pay attention to audio latency. My testing indicates that ~50ms is sufficient.
Audio programming is fun. But you can inject latency to smooth out jitter in almost all use cases that don't involve a live musical instrument.
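For the game case, here is a minimal single-threaded sketch of that latency-for-consistency trade (my illustration; a real engine would put a lock-free SPSC ring between the game thread and the device callback):

    use std::collections::VecDeque;

    const BUFFER_FRAMES: usize = 256; // about 5.3 ms at 48 kHz
    const QUEUE_DEPTH: usize = 4;     // about 21 ms of added latency

    struct AudioQueue {
        ready: VecDeque<Vec<f32>>,
    }

    impl AudioQueue {
        fn new() -> Self {
            Self { ready: VecDeque::with_capacity(QUEUE_DEPTH) }
        }

        // Game thread: render whenever there's spare time, keeping the
        // queue topped up. Jitter of up to (QUEUE_DEPTH - 1) buffer
        // periods is absorbed without an audible glitch.
        fn produce(&mut self, render: impl Fn(&mut [f32])) {
            while self.ready.len() < QUEUE_DEPTH {
                let mut buf = vec![0.0; BUFFER_FRAMES];
                render(&mut buf);
                self.ready.push_back(buf);
            }
        }

        // Device callback: needs a buffer every period; it only
        // underruns if the producer fell QUEUE_DEPTH buffers behind.
        fn consume(&mut self) -> Option<Vec<f32>> {
            self.ready.pop_front()
        }
    }

    fn main() {
        let mut q = AudioQueue::new();
        q.produce(|buf| buf.fill(0.1)); // queue several buffers ahead
        while let Some(buf) = q.consume() {
            assert_eq!(buf.len(), BUFFER_FRAMES);
        }
    }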
I would love to see a UI system with predictable low-latency real-time performance, so you could confidently achieve something like single-frame latency on a 144Hz display.