October 16th, 2024

Be Suspicious of Success

The article argues for skepticism toward software that appears to work, since success can hide bugs; it advocates testing both success and error scenarios and recommends resources for further learning about algorithms.

Sentiments: Skepticism, Concern, Frustration

The article discusses the principle of "Be Suspicious of Success" (BSOS) in software development, emphasizing that successful software often contains hidden bugs. It references Leslie Lamport's insights on model checking, suggesting that a lack of errors in verification processes should raise suspicion. The author argues that code may appear to work for the wrong reasons, leading to potential failures in the future. Verification methods, while useful, cannot fully explain why code succeeds, making it essential to adopt practices like test-driven development and the "make it work, make it break" approach. The article also highlights the importance of testing both "happy paths" (successful scenarios) and "sad paths" (error handling scenarios), noting that many failures in distributed systems stem from trivial mistakes in error handling. The author concludes by recommending a blog focused on computer science algorithms, which provides valuable insights for programmers.

- Successful software may be buggy and should be approached with skepticism.

- Verification methods can indicate success but do not explain it.

- Testing should encompass both happy and sad paths to ensure robustness (see the sketch after this list).

- Many system failures arise from errors in error handling mechanisms.

- The article recommends resources for further learning in computer science algorithms.
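
As a rough illustration of the happy-path/sad-path point above (the function and tests here are made up for this summary, not taken from the article), a pair of pytest tests might look like:

    import pytest

    def parse_port(value: str) -> int:
        """Hypothetical function under test: parse a TCP port number."""
        port = int(value)  # raises ValueError on non-numeric input
        if not 0 < port < 65536:
            raise ValueError(f"port out of range: {port}")
        return port

    def test_parse_port_happy_path():
        assert parse_port("8080") == 8080

    def test_parse_port_sad_paths():
        with pytest.raises(ValueError):
            parse_port("not-a-number")  # malformed input
        with pytest.raises(ValueError):
            parse_port("70000")         # out of range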

AI: What people are saying
The comments reflect a range of perspectives on software testing and the reliability of successful software.
  • Many commenters emphasize the importance of thorough testing, including edge cases and mutation testing, to ensure software reliability.
  • There is skepticism about the notion that software is "good" simply because it works, with concerns about hidden bugs and the implications of relying on software that may not be thoroughly vetted.
  • Some discuss the role of code coverage tools and test-driven development as essential practices in improving software quality.
  • Several comments highlight the tension between software marketing and actual reliability, suggesting that popular software may often be more buggy due to prioritization of features over thorough testing.
  • There is a recognition that testing can only show the presence of bugs, not their absence, and that static analysis and type systems may play a crucial role in future software development.
18 comments
By @peterldowns - 6 months
Haven't seen it mentioned here in the comments so I'll throw in — this is one of the best uses for code coverage tooling. When I'm trying to make sure something really works, I'll start with a failing testcase, get it passing, and then also use coverage to make sure that the testcase is actually exercising the logic I expect. I'll also use the coverage measured when running the entire suite to make sure that I'm hitting all the corner cases or edges that I thought I was hitting.

I never measure coverage percentage as a goal (I don't even bother turning it on in CI), but I do use it locally as part of my regular debugging and hardening workflow. Strongly recommend doing this if you haven't before.

I'm spoiled in that the golang+vscode integration works really well and can highlight executed code in my editor in a fast cycle; if you're using different tools, it might be harder to try out and benefit from it.
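
A rough Python analogue of this workflow, as a sketch only (the commenter's setup is Go + VS Code, and mymodule / retry_with_backoff below are hypothetical names), using coverage.py's API:

    import coverage

    cov = coverage.Coverage()
    cov.start()

    # Exercise just the code path in question by calling it directly.
    from mymodule import retry_with_backoff       # hypothetical module under test
    retry_with_backoff(lambda: "ok", attempts=3)  # hypothetical call

    cov.stop()
    cov.report(show_missing=True)  # confirm the lines you expected actually ran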

By @maxbond - 6 months
> It's not enough for a program to work, it has to work for the right reasons. Code working for the wrong reasons is code that's going to break when you least expect it.

This reminds me of the recent discussion of gettiers [1][2]. That article focused on Gettier bugs, but this passage discusses what you might call Gettier features.

Something that's gotten me before is Python's willingness to interpret a comma as a tuple. So instead of:

    my_event.set()
I wrote:

    my_event,set()
Which was syntactically correct, equivalent to:

    _ = (my_event, set())
The auto formatter does insert a space, though, which helps. Maybe it could be made to rewrite it as I did above; that would make it screamingly obvious.

[1] https://jsomers.net/blog/gettiers

[2] https://news.ycombinator.com/item?id=41840390

By @BerislavLopac - 6 months
> How do I know whether my tests are passing because they're properly testing correct code or because they're failing to test incorrect code?

One way to verify that is to run a mutation testing [0] tool. They are available for many languages; mutmut [1] is a great example for Python.

[0] https://en.wikipedia.org/wiki/Mutation_testing

[1] https://mutmut.readthedocs.io
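
As a rough sketch of what such a tool does (the code below is made up, not taken from mutmut's documentation): it applies small edits, "mutants", to the code under test and expects at least one test to fail for each one.

    # Original code under test:
    def can_withdraw(balance, amount):
        return balance >= amount

    # A typical auto-generated mutant flips the operator:
    def can_withdraw_mutant(balance, amount):
        return balance > amount

    # A test that exercises the boundary "kills" the mutant; if no test
    # fails against the mutant, the balance == amount case is untested.
    def test_can_withdraw_boundary():
        assert can_withdraw(100, 100) is True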

By @dan-robertson - 6 months
One general way I like to think about this is that most software you use has passed through some filter – it needed to be complete enough for people to use it, people needed to find it somehow (e.g. through marketing), etc. If you have some fixed amount of resources to spend on making that software, there is a point where investing more of them in reducing bugs harms one's chances of passing the filter more than it helps. In particularly competitive markets you are likely to find that the most popular software is relatively buggy (because it won by spending more on marketing or features), and you are often more likely to be using that software anyway (e.g. for interoperability reasons).
By @lo_zamoyski - 6 months
As the Dijkstrian expression goes, testing shows the presence of bugs, not their absence. Unit tests can show that a bug exists, but they cannot show that there are no bugs, save for the particular cases tested, and even then only in a behaviorist sort of way (meaning your buggy code may still produce the expected output for the tested cases). For that, you need to be able to prove your code possesses certain properties.

Type systems and various forms of static analysis are going to increasingly shape the future of software development, I think. Large software systems especially become practically impossible to work with, verify, or test without types.
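
A toy sketch of the kind of property a checker can establish without running any test (the function and the call below are made up):

    def apply_discount(price: float, percent: float) -> float:
        return price * (1 - percent / 100)

    # The deliberate mistake below needs no test to be caught; a static
    # checker such as mypy reports it directly:
    #   error: Argument 1 to "apply_discount" has incompatible type "str"; expected "float"
    total = apply_discount("19.99", 10)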

By @foobar8495345 - 6 months
In my regressions, I make sure I include an "always fail" test, to make sure the test infrastructure is capable of correctly flagging it.
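
A minimal pytest version of that canary might look like this (the test name is made up):

    def test_canary_always_fails():
        # This is supposed to show up as a failure in every run; if it
        # ever "passes", the test runner or its reporting is broken.
        assert False, "canary: expected failure"
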
By @eternityforest - 6 months
I love unit tests, but I sometimes also manually step through code in the debugger, looking for anything out of place. If a variable does anything surprising, then I know I don't understand what I just wrote.
By @RangerScience - 6 months
Colleagues: If the code works, it’s good!

Me: Hmmm.

Managers, a week later: We’re starting everyone on a 50% on-call rotation because there’s so many bugs that the business is on fire.

Anyway, now I get upset and ask them to define “works”, which… they haven’t been able to do yet.

By @kazinator - 6 months
> If the code still works even after the change, my model of the code is wrong and it was succeeding for the wrong reasons.

If someone else wrote the code, then your model of why it works being wrong means only that your understanding is wrong, not the code.

Sometimes even if you wrote something that works and your own model is wrong, you don't necessarily have to fix anything: just learn the real reason the code works, go "oh", and leave it. :) (Revise some documentation and write some tests based on the new understanding.)

By @JohnMakin - 6 months
There are few things that terrify me more at this point in my career than spending a lot of time writing something and setting it up, only to turn it on for the first time and have it work without any issues.
By @krackers - 6 months
Is this not common practice? I'd expect a good engineer who cares about their work to be just as suspicious if something works _when it shouldn't_ as when something doesn't work when it should. Both indicate a mismatch between your mental model and what it's actually doing.
By @shahzaibmushtaq - 6 months
The author is simply talking about the most common testing types[0] but in a more philosophical way.

[0] https://www.perfecto.io/resources/types-of-testing

By @RajT88 - 6 months
I had a customer complain once about how great Akamai WAF was, because it never had false positives. (My company's WAF solution had many)

Is that actually desirable? This article articulates my exact gut feeling.

By @teddyh - 6 months
> This is why test-driven development gurus tell people to write a failing test first.

To be precise, it’s one of the big reasons, but it’s far from the only reason to write the test first.

By @computersuck - 6 months
Website not quite loading.. HN hug of death?
By @sciencesama - 6 months
Nvidia?
By @kellymore - 6 months
Site is buggy