The Linux audio stack demystified (and more)
Digital audio processing in Linux is complex, involving ALSA, JACK, PulseAudio, and PipeWire. Understanding requires knowledge of sound basics, human perception, and digital workings, including sampling and quantization.
Digital audio processing in Linux involves a complex audio stack comprising ALSA, JACK, PulseAudio, and PipeWire. Understanding this system requires knowledge of sound basics, human sound perception, and how digital audio works. Sound is a vibration propagating as an acoustic wave through a medium like air, with period and amplitude as its key properties. Human hearing ranges from 20 Hz to 20 kHz and is most sensitive between 2,000 Hz and 5,000 Hz. Digital audio involves sampling analog sound waves into digital data, with the Nyquist-Shannon theorem dictating the minimum sampling rate for accurate reconstruction. Quantization limits precision in digital systems, with bit depth determining the number of discrete levels available to represent sound amplitude. Storing digital audio involves representing samples in a structured format like CSV. This overview aims to demystify the Linux audio stack by explaining its components and how they interact.
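The sampling and quantization points in the summary can be made concrete with a little arithmetic. The following is a minimal sketch, not code from the article: the function names are hypothetical, and the quantizer assumes samples normalized to the range [-1.0, 1.0] and a symmetric signed integer encoding.

```python
def nyquist_rate(max_frequency_hz: float) -> float:
    """Lower bound on the sampling rate for a signal band-limited to max_frequency_hz.

    Per the Nyquist-Shannon theorem, the actual rate must exceed this value.
    """
    return 2.0 * max_frequency_hz


def quantization_levels(bit_depth: int) -> int:
    """Number of discrete amplitude levels available at a given bit depth."""
    return 2 ** bit_depth


def quantize(sample: float, bit_depth: int) -> int:
    """Map a sample in [-1.0, 1.0] to a signed integer code (assumed encoding)."""
    max_code = 2 ** (bit_depth - 1) - 1          # e.g. 32767 for 16-bit audio
    clipped = max(-1.0, min(1.0, sample))        # clip out-of-range input
    return round(clipped * max_code)


if __name__ == "__main__":
    print(nyquist_rate(20_000))       # bound for the 20 kHz upper limit of hearing
    print(quantization_levels(16))    # levels at CD-style 16-bit depth
    print(quantize(1.0, 16))          # full-scale positive sample
```

This is why common rates such as 44.1 kHz and 48 kHz sit just above twice the 20 kHz upper limit of human hearing.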
Related
Real-time audio programming 101: time waits for nothing
Real-time audio programming demands adherence to principles like avoiding blocking operations and prioritizing worst-case execution time to prevent glitches. Buffer sizes of 1-5ms are typical for low-latency software.
Xwax Is an Open-Source Digital Vinyl System (DVS) for Linux
xwax is an open-source Digital Vinyl System for Linux, enabling DJs to play digital audio through turntables. It supports various file formats and features like needle drops and scratching. Updates enhance audio handling and compatibility. The project integrates with Raspberry Pi for DJ turntable use.
First, in the article, [1] shows in one single diagram where the complexity is coming from; the audio system has to handle a good deal of different hardware on many different systems and also provide extra functionality for multiplexing, network features, wireless headsets and their codecs, etc. All this: open source.
Second: Linux is the only platform where everything works flawlessly for me right now: My Bose joins without any problems, switches to headset mode during Zoom calls, and switches back to high-definition audio otherwise. I can select a different sink, even a networked one, whenever I want the output to appear on a different networked device. macOS sometimes needs a reboot before the Bluetooth subsystem works, what the hell.
And all this worked with PulseAudio, and now works with PipeWire, which is an even higher-quality iteration of PA.
I don't complain. I wish macOS/Windows had an audio system as versatile, configurable, and sane out of the box as an off-the-shelf Fedora, or even freaking Arch Linux, has.
HTH
[1]: https://blog.rtrace.io/images/linux-audio-stack-demystified/...
every other layer is a coping mechanism, and the plurality and divergence of the FOSS community responds in various ways: JACK, PulseAudio, PipeWire.
I am unclear why Jaroslav Kysela chose to make ALSA single-client, but Apple's Core Audio multi-client driver model is the right way to do digital audio on general-purpose computing devices running multitasking OSes on application processors, in my opinion.
Current issues this article does not address that actually constitute large parts of the "mess" of Linux Audio:
- channel mapping that is neither transparent nor clearly assigned anywhere in userspace. (aka, why does my computer insist that my multi-input pro-audio interface is a surround-sound interface? I don't WANT high-pass filters on the primary L/R pair of channels. I am not USING a subwoofer. WTF)
- the lack of a STANDARD for channel mapping, vs. the ALSA config standards, /etc/asound.conf etc.
- the lack of friendly nomenclature on hardware inputs/outputs for DAW software, whether on the ALSA layer, or some sound-server layer. (not to mention that ALSA calls an 8-channel audio-interface "4 stereo devices")
- probably more, but I can't remember. My current audio production systems have the DAW software directly opening an ALSA device. I cannot listen to audio elsewhere until I quit my DAW. This works and I can set my latency as low as the hardware will allow it.
This is the thing: more than about 10 ms of latency is unacceptable for multitrack audio recording.
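To put the 10 ms figure in perspective, buffer latency follows directly from buffer size and sample rate. Here is a rough sketch under stated assumptions: the function names are hypothetical, only buffering delay is modeled (converter and driver overhead are ignored), and `periods` stands for the number of equally sized buffer chunks queued ahead of the hardware.

```python
def buffer_latency_ms(frames: int, sample_rate_hz: int, periods: int = 1) -> float:
    """Delay contributed by `periods` buffers of `frames` frames each, in milliseconds."""
    return 1000.0 * frames * periods / sample_rate_hz


def max_frames_for_latency(budget_ms: float, sample_rate_hz: int) -> int:
    """Largest total number of buffered frames that fits in a latency budget."""
    return int(budget_ms * sample_rate_hz / 1000)


if __name__ == "__main__":
    for frames in (64, 128, 256, 512):
        latency = buffer_latency_ms(frames, 48_000, periods=2)
        print(f"{frames:4d} frames x 2 periods @ 48 kHz: {latency:.2f} ms")
    print(max_frames_for_latency(10, 48_000), "frames fit in a 10 ms budget")
```

At 48 kHz, a 10 ms budget leaves room for 480 frames in total, so a one-way path with two periods of 256 frames already exceeds it.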
1. I have to modify my audio settings every time I start a call in Teams on Linux because it keeps losing my audio device.
2. In my audio settings UI, half the time I switch my devices the speaker test doesn't work.
3. In my audio settings UI, whenever I switch my mic I hear myself. The mic feedback only disappears 30 seconds after I close the settings UI.
4. My work headsets have a robotic sound (likely caused by an incorrect bitrate or buffer size). I can only use work Bluetooth headsets via their dedicated dongle.
This was my default experience on a popular Debian-based distro. And it mirrors the general experience I see online. Things are unstable and a mess.
I started reading this article and it's embellished with phrases like: "is a professional-grade audio server", "widely used in professional audio production environments", and general language that sounds like a sales pitch. This does not fit with anything I'm familiar with.
I would have preferred a neutral and semi-technical approach, with 10% of the buzzwords. As written, I trust nothing.
Fact is, RT audio is hard, and the people behind JACK have cared about the underlying problems for a long time already.
Maybe PipeWire, but to be honest, it reminds me too much of PA.
I guess I will stay with plain JACK and SuperCollider as my toolbelt, and not care about PA or PW. Like the grumpy old hacker I am.
> ...
> Unlike PulseAudio and JACK, PipeWire does not require ALSA on a system, in fact if ALSA is installed the output of ALSA is very likely pushed through PipeWire
I don't get this part. If ALSA represents the kernel-level hardware drivers for audio, how does PipeWire bypass it? Does it implement an alternative set of kernel drivers? I assumed PipeWire still relies on ALSA as its base.
Also, given that PulseAudio and PipeWire both support ALSA clients, does it mean that the preferred API for applications should be ALSA? This way they can play sound on any system, even where there is no audio daemon.
(Bitwig, Ardour, Reaper, more? I would like to see "Input 1" or "Channel 1" and not some strange ciphers when trying to assign things in a little dropdown selector in a DAW)
That does not instill confidence that they have any idea what they are doing.
I remember that as far back as 2012 some Linux game ports from Steam (before Proton was a thing) failed to play sound.
Other than that, it took them a decade to figure out that sound should switch to HDMI when it is plugged in. It may still require arcane config changes and may break down.
I've opened the post to read about pipewire, and it seems that clicking an anchor does nothing. So it's not only sound they can't get right.
Not sure how well the NI Native Access crap is working on WINE either. Does anyone know?
The year of the Linux desktop never arrived but most of the world has Linux sitting in our palms. Let's build on that instead of the dead ends of Debian, Slackware, etc.