August 28th, 2024

Deterministic Replay of QEMU Emulation

QEMU's record/replay feature, described in the version 9.0.94 documentation, provides deterministic virtual machine execution. It supports multiple hardware platforms, snapshotting, automatic recording of audio and serial data, and reverse debugging with GDB commands.

The QEMU documentation for version 9.0.94 describes a record/replay feature for deterministic replay of virtual machine execution. Recording logs non-deterministic events, such as keyboard input and network packets; deterministic events, such as memory reads, are not logged and are simply simulated again during replay. The resulting execution log can be replayed any number of times, including on different machines, and the feature supports various hardware platforms, among them x86, ARM, and PowerPC.

To use the feature, QEMU must run in icount mode, which makes execution deterministic. The command lines for recording and replaying differ only in the value of the rr option passed to -icount (rr=record versus rr=replay).

Record/replay also supports snapshotting: users can create VM snapshots during replay and later restore that state. Network traffic is recorded through a replay filter, and audio and serial port data are recorded automatically. Reverse debugging is available as well, letting users step backward through execution with GDB commands; it requires at least one snapshot created during recording. Together, these capabilities strengthen debugging and testing with QEMU.
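
The record and replay invocations described above can be sketched as a small Python helper that builds both command lines. This is a minimal illustration, not a definitive setup: the helper name `build_qemu_cmd`, the file names `replay.bin` and `disk.qcow2`, and the choice of `qemu-system-i386` with an IDE disk and an rtl8139 network card are assumptions made for the example. The option names themselves follow QEMU's replay documentation.

```python
import shlex


def build_qemu_cmd(mode, log_file="replay.bin", disk="disk.qcow2"):
    """Build a QEMU argv for record or replay (hypothetical helper).

    Only the rr= value inside -icount changes between the two modes.
    """
    if mode not in ("record", "replay"):
        raise ValueError("mode must be 'record' or 'replay'")
    return [
        "qemu-system-i386",
        # icount mode makes execution deterministic; rr selects the mode
        "-icount", f"shift=auto,rr={mode},rrfile={log_file}",
        # Route disk I/O through the blkreplay driver so it is replayable
        "-drive", f"file={disk},if=none,snapshot,id=img-direct",
        "-drive", "driver=blkreplay,if=none,image=img-direct,id=img-blkreplay",
        "-device", "ide-hd,drive=img-blkreplay",
        # filter-replay logs network packets on this netdev
        "-netdev", "user,id=net1",
        "-device", "rtl8139,netdev=net1",
        "-object", "filter-replay,id=replay,netdev=net1",
    ]


record_cmd = build_qemu_cmd("record")
replay_cmd = build_qemu_cmd("replay")

# Collect the arguments that differ between the two command lines
differing = [(a, b) for a, b in zip(record_cmd, replay_cmd) if a != b]

print(shlex.join(record_cmd))
print(shlex.join(replay_cmd))
```

Running the script prints the two command lines; `differing` holds the single `-icount` argument pair that separates a recording run from a replay run.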

- QEMU's record/replay feature allows deterministic replay of VM execution.

- It supports various hardware platforms and requires icount mode for operation.

- Users can create snapshots during replay for state recovery.

- Network interactions and audio/serial data are automatically recorded.

- Reverse debugging works through GDB commands and requires at least one snapshot taken during recording.
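
The snapshot requirement for reverse debugging can be illustrated with a short Python sketch that assembles a recording command line, a matching replay command line with QEMU's GDB stub enabled, and a few GDB commands for stepping backward. It is a sketch under stated assumptions: the helper name `build_reverse_debug_cmds`, the file names `record.bin` and `empty.qcow2`, and the snapshot name `init` are chosen for the example, and the qcow2 image is assumed to exist already so the snapshot has somewhere to be stored.

```python
def build_reverse_debug_cmds(log_file="record.bin", disk="empty.qcow2"):
    """Return (record_argv, replay_argv) for a reverse-debugging session.

    Hypothetical helper; file and snapshot names are assumptions.
    """
    def icount(mode):
        # rrsnapshot=init creates a snapshot named "init" when recording
        # starts and loads it again when replay starts
        return f"shift=auto,rr={mode},rrfile={log_file},rrsnapshot=init"

    # A qcow2 drive gives QEMU a place to store the VM snapshot
    common = ["-net", "none", "-drive", f"file={disk},if=none,id=rr"]

    record = ["qemu-system-x86_64", "-icount", icount("record")] + common
    # -s opens the GDB stub on TCP port 1234; -S pauses the CPU at startup
    replay = ["qemu-system-x86_64", "-icount", icount("replay")] + common + ["-s", "-S"]
    return record, replay


# GDB commands to attach to the paused replay and move backward in time
GDB_SESSION = [
    "target remote localhost:1234",
    "reverse-stepi",
    "reverse-continue",
]

record_argv, replay_argv = build_reverse_debug_cmds()
print(" ".join(record_argv))
print(" ".join(replay_argv))
print("\n".join(GDB_SESSION))
```

The replay command starts paused, so GDB can attach before any guest code runs and then use the reverse commands against the recorded execution.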

8 comments
By @jester1337 - 5 months
I remember we were working on this exact topic at my university chair ~8-10 years ago. I think it never fully took off. Several master's students worked on it for a while. I like that it's now in QEMU!

By @waschl - 5 months
Tried to use it for debugging my own OS, but couldn’t get it running, even after several days of trial and error…

https://github.com/jbreu/jos?tab=readme-ov-file#reverse-debu...

By @majke - 5 months
This is a big deal. With some tooling around it, it can be amazing.

I can think of using this for testing, and as a vehicle to change the programming paradigm of existing/legacy software (run a thing, and roll it back aggressively from outside of a VM).

By @repelsteeltje - 5 months
I think this is awesome.

While it might seem like a small feature, it opens a huge door. It's similar to what reproducible build infrastructure has done for finding bugs, attestation that binary matches source, immutability, etc.

Can imagine this is useful for finding bugs in hardware designs too.

By @justinclift - 5 months
Anyone have clear ideas/guidelines for how much RAM/disk/etc. this is likely to need for a "reasonable" capture?

Say capturing a Qt application as it corrupts its internal state during startup, in order to work out what's corrupting its internal state?

By @moondev - 5 months
Such a casual and low-key introduction of what sounds like an incredible new capability.

1. Would something like this replace packer for creating machine images?

2. Curious how quickly the replay log grows and how it compares to a CoW snapshot.

3. It will be interesting to see what the log looks like, and what doors could open up by creating or generating it by other means.

By @vessenes - 5 months
Seems to me like one of the highest and best uses of this right now would be adding verifiable builds to … literally anything. You no longer need a verifiable-build-capable compiler or language — you can just run the compile and packaging step through a deterministic QEMU session.

Does this sound right? I’m trying to figure out where uncontrollable randomness would come in during a compile phase, and coming up blank.