June 19th, 2024

Chasing a Bug in a SAT Solver

Adolfo Ochagavía and Prefix.dev swiftly resolved a bug in the SAT-based dependency solver, resolvo, with community input. The incident emphasizes open-source collaboration and potential debugging tool enhancements for software quality.

Read original article

Adolfo Ochagavía collaborated with Prefix.dev to address a bug in their SAT-based dependency solver, resolvo. Despite initial concerns, the bug was identified and fixed within a few hours, thanks to a minimal reproducible example provided by a community member on GitHub. This example streamlined the debugging process by simplifying the solver logs, enabling a swift resolution. The incident highlights the collaborative nature of open source development, where community contributions play a crucial role in enhancing software quality. Ochagavía also reflects on potential improvements in debugging tools, such as recording and replaying the solving process or implementing a shrinking algorithm for simplifying inputs. Despite these ideas, the solver's core logic is considered stable, reducing the likelihood of future bugs. The experience underscores the importance of community engagement and knowledge sharing in software development, showcasing the power of collective problem-solving in the open-source ecosystem.

Spending 3 months investigating a 7-year old bug and fixing it in 1 line of code

A developer fixed a seven-year-old bug in an iPad accessory causing missed MIDI messages by optimizing a modulo operation. The bug's resolution improved the audio processor's efficiency significantly.

Solving puzzles faster than humanly possible

The Opus Magnum challenge tasks players with automating puzzle-solving to optimize Cost, Cycles, and Area metrics. Participants submit solutions for evaluation, exploring automated vs. human strategies, hybrid approaches, scoring systems, mods, and bots.

Software Engineering Practices (2022)

Gergely Orosz sparked a Twitter discussion on software engineering practices. Simon Willison elaborated on key practices in a blog post, emphasizing documentation, test data creation, database migrations, templates, code formatting, environment setup automation, and preview environments. Willison highlights the productivity and quality benefits of investing in these practices and recommends tools like Docker, Gitpod, and Codespaces for implementation.

I found an 8 years old bug in Xorg

An 8-year-old Xorg bug related to epoll misuse was found by a picom developer. The bug caused windows to disappear during server lock, traced to CloseDownClient events. Despite limited impact, the developer seeks alternative window tree updates, emphasizing testing and debugging tools.

Getting 100% code coverage doesn't eliminate bugs

Achieving 100% code coverage doesn't ensure bug-free software. A blog post illustrates this with a critical bug missed despite full coverage, leading to a rocket explosion. It suggests alternative approaches and a 20% coverage minimum.

9 comments

By @rgovostes - 10 months

If you have a large input that triggers a bug and you want to reduce it, it might be worth looking into the Delta Debugging[0] algorithm used by C-Reduce[1] to make minimal test cases for C-like languages. C-Reduce "happens to do a pretty good job reducing the size of programs in languages other than C/C++, such as JavaScript and Rust" so it might work out of the box for you. (Aside: there's also ddSMT for SMT solvers that take SMT-LIB2 input.)

0: https://www.st.cs.uni-saarland.de/dd/

1: https://github.com/csmith-project/creduce

By @JonChesterfield - 10 months

I wrote a CSP solver a while ago. The bugs were alarmingly prone to being incomplete enumeration of solutions. The solver would say "Yep, here's all 2412 solutions to that problem", but there were more solutions out there it missed. I didn't usually know how many solutions there would be so it's hard to distinguish between "all of them" and "95% of them". However that also threw a lot of doubt on when the solver returned unsat - was the problem unsat, or was the solver failing to look in part of the search space?

I didn't find a plausible testing strategy for convincing myself that the last of that class of error was gone and ultimately stopped working on it as a result.

By @droelf - 10 months

Resolvo (the SAT solver here) has been really good for us. It helped make some conda-forge bots up to 10x faster than the previous C-based solver (libsolv) while being memory safe.

The specific bot tests went from taking 60 minutes to ~6 minutes which is quite remarkable.

By @kragen - 10 months

i've used drmaciver's property testing library hypothesis to find bugs in c libraries before; it has shrinking. 'doesn't crash' is the easiest property to define, but a bit fiddly to connect to property testing libraries, because you have to spawn a subprocess. in one case i just used fork, _exit, and wait, but in some cases things get ugly then (posix threads guarantee you almost nothing, but fortunately i wasn't using threads)

his 'minithesis' is the best way i've found to learn about implementing property-based testing

sat and smt solvers have gotten astoundingly good, and most of the leading ones are free software. smt-comp shows off new gains every year and has a lineup of the leaders

z3 is a particularly easy smt solver to get started with, in part because of its convenient python binding

a thing sat/smt and property-based testing have in common is that they both solve inverse problems, allowing you to program at a more declarative level

By @pfdietz - 10 months

I don't see it mentioned here, but SAT solvers are also good targets for property based testing, providing failure cases for subsequent minimization. There are many properties that can be checked, for example that satisfaction-preserving transformations on an input do not alter the answer the solver returns.

By @rurban - 10 months

Ah, counting! One of the hardest problems, that's why I suck at probability and stats. Knuths Volume 4

By @stevemk14ebr - 10 months

this article has almost zero content on the bug, the debugging, or the fix. I expect more details

By @babel_ - 10 months

Having ended up with a critical bug in the SAT solver I wrote for my undergrad thesis, it really can be a challenge to fix without clear logs. So, always nice to see a little love for contribution through issues and finding minimal ways to reproduce edge cases.

While we do mention how good issue contributions are significant and meaningful, we often forget how there's often more to it than an initial filing, and may overlook the contributions from those that join lengthier issue threads later.

(Oh, and yes, that critical bug did impact the undergrad thesis, but it could be worked around, however meant I couldn't show the full benefits of the solver.)

By @alvincodes - 10 months

I thought this was about SAT physics (used in games, etc)

Chasing a Bug in a SAT Solver

Related

Spending 3 months investigating a 7-year old bug and fixing it in 1 line of code

Solving puzzles faster than humanly possible

Software Engineering Practices (2022)

I found an 8 years old bug in Xorg

Getting 100% code coverage doesn't eliminate bugs

Related

Spending 3 months investigating a 7-year old bug and fixing it in 1 line of code

Solving puzzles faster than humanly possible

Software Engineering Practices (2022)

I found an 8 years old bug in Xorg

Getting 100% code coverage doesn't eliminate bugs