August 11th, 2024

Generating Simpson's Paradox with Z3

The article illustrates Simpson's Paradox with two baseball players: Player A has the higher overall batting average even though Player B outperforms A against both left-handed and right-handed pitchers.

The article discusses an example of Simpson's Paradox involving two baseball players, A and B. Player A has a higher overall batting average than player B, yet player B outperforms A against both left-handed and right-handed pitchers. The article uses the Z3 Theorem Prover to generate a model that satisfies these conditions. In that model, A has an overall batting average of approximately 0.235 and B has an overall average of about 0.231. However, B bats 0.5 against left-handed pitchers and 0.182 against right-handed pitchers, compared to A's 0.4 and 0.167, respectively. The key takeaway is that the players faced different numbers of left-handed and right-handed pitchers (A batted against 5 lefties and 12 righties; B against 2 and 11), which produces the paradoxical result. The article concludes with a tabular representation of the batting averages for clarity.
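The figures in this summary can be checked with exact arithmetic. The sketch below is not the article's code: the at-bat counts are the ones quoted from the article (5 and 12 for A, 2 and 11 for B), while the hit counts are an assumption, inferred as the only whole numbers that reproduce the reported averages.

```python
from fractions import Fraction

# (hits, at_bats) per pitcher type.
# At-bat counts are quoted from the article; hit counts are inferred
# from the reported averages and are an assumption of this sketch.
RECORDS = {
    "A": {"left": (2, 5), "right": (2, 12)},
    "B": {"left": (1, 2), "right": (2, 11)},
}

def average(hits, at_bats):
    """Batting average as an exact fraction."""
    return Fraction(hits, at_bats)

def overall(player):
    """Overall average: total hits divided by total at-bats."""
    hits = sum(h for h, _ in RECORDS[player].values())
    at_bats = sum(n for _, n in RECORDS[player].values())
    return Fraction(hits, at_bats)

for player, splits in RECORDS.items():
    left = average(*splits["left"])
    right = average(*splits["right"])
    print(f"{player}: left={float(left):.3f} "
          f"right={float(right):.3f} overall={float(overall(player)):.3f}")
```

Under these assumed hit counts, B leads in each split while A leads overall, matching the averages reported above.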

- Player A has a higher overall batting average than player B.

- Player B has better averages against both left-handed and right-handed pitchers.

- The Z3 Theorem Prover is used to illustrate the paradox.

- The players faced different sets of pitchers, affecting their averages.

- The example highlights the complexities of statistical interpretation in sports.
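The conditions listed above are simple enough that small instances can also be found without a solver. The following is a minimal sketch that swaps in plain brute-force enumeration for Z3; it is not the article's method. The names `beats`, `is_simpson`, and `find_instance` and the `max_at_bats` bound are hypothetical, and zero hits are allowed here, which is an assumption of this sketch.

```python
from itertools import product

def beats(x, y):
    """True if average x is strictly higher than average y.

    Each argument is a (hits, at_bats) pair with at_bats > 0; the
    comparison is cross-multiplied so only integers are involved.
    """
    return x[0] * y[1] > y[0] * x[1]

def combine(x, y):
    """Pool two (hits, at_bats) records into one."""
    return (x[0] + y[0], x[1] + y[1])

def is_simpson(a_left, a_right, b_left, b_right):
    """B wins against each pitcher type, yet A wins overall."""
    return (beats(b_left, a_left)
            and beats(b_right, a_right)
            and beats(combine(a_left, a_right), combine(b_left, b_right)))

def find_instance(max_at_bats=6):
    """Return the first paradoxical assignment found, or None."""
    records = [(h, n) for n in range(1, max_at_bats + 1) for h in range(n + 1)]
    for combo in product(records, repeat=4):
        if is_simpson(*combo):
            return combo
    return None

# The instance from the article's model, with hit counts inferred
# from the reported averages (an assumption).
article = ((2, 5), (2, 12), (1, 2), (2, 11))
print(is_simpson(*article))
print(find_instance())
```

Enumeration only scales to small bounds; a solver such as Z3 states the same three inequalities declaratively and searches for a satisfying model itself.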

AI: What people are saying
The discussion surrounding Simpson's Paradox in the article elicits various perspectives and insights from commenters.
  • Many commenters emphasize the complexity and potential misleading nature of statistical interpretations, particularly with varying levels of granularity.
  • Several users express dissatisfaction with the baseball example used to illustrate the paradox, suggesting it may not be relatable to all audiences.
  • There is a consensus on the importance of visualizing data to better understand statistical phenomena, with references to related concepts like Anscombe's Quartet.
  • Some commenters highlight the need for multifactor analysis to avoid misinterpretations that can arise from examining single variables.
  • Questions about causation and the appropriateness of the term "paradox" are raised, indicating a desire for deeper exploration of the topic.
13 comments
By @Izkata - 7 months
This visualization on Wikipedia was what I needed to understand Simpson's Paradox; the descriptions never made a whole lot of sense to me until I saw it like this: https://en.wikipedia.org/wiki/Simpson%27s_paradox#/media/Fil...

Along the same lines of "visualize your data to see what's really going on" is Anscombe's Quartet: https://en.wikipedia.org/w/index.php?title=Anscombe%27s_quar...

And then there's the Datasaurus [Dozen], which has some fun with the idea behind Anscombe's Quartet: https://en.wikipedia.org/wiki/Datasaurus_dozen (you can see it animated here: https://blog.revolutionanalytics.com/2017/05/the-datasaurus-... )

By @TheMrZZ - 7 months
The biggest trap of Simpson's paradox is that the results can change with every level of granularity.

If you take the example of Treatment A vs Treatment B for tumors, you can get infinite layers of seemingly contradictory statements:

- Overall, Treatment A has better average results

- But if you add tumor size, Treatment B is always better

- But if you add gender to size, Treatment B is always better

- But if you add age category to gender and size, Treatment A is always better

- etc...

It totally contradicts our instincts, and shows statistics can be profoundly misleading (intentionally or not).

By @incognito124 - 7 months
I just love the napkin equation in the middle of [1], it really made it clear to me

[1]: https://robertheaton.com/2019/02/24/making-peace-with-simpso...

By @kqr - 6 months
Popular wisdom regarding experimentation has always been to "vary just one thing at a time, keeping the others as constant as possible". Fisher argued to the contrary, that we should (systematically) try as many variations as possible simultaneously. Simpson's paradox (and perhaps the similarly counter-intuitive Berkson's paradox) are the reason why: when analysing just one variate at a time, we risk seeing relationships that aren't there, or run counter to what we are trying.

Proper multifactor analysis that accounts for all variations simultaneously is required to learn about complex phenomena.

By @staplung - 7 months
Simpson's Paradox keeps experimenters up at night because it embodies the idea that although your data might say one thing, it's always possible that slicing the data along some unknown axis of finer granularity might paint a very different picture. It's hard to know if there is such an axis lurking in your data, let alone what it might be.

If you get paranoid about its presence it can lead you to second guess pretty much every statistic. "I know that 4 out of 5 dentists recommend chewing X Brand gum but what if I slice the dentists by number of eyes? Maybe both one-eyed dentists and two-eyed dentists aren't so enthusiastic."

By @tombert - 7 months
Z3 is kind of my new favorite thing right now. I have a problem that lends itself quite well to constraint-based reasoning, and I need it to be optimized. I'm sure I could have hacked something together using any number of programming languages, but after playing with Z3 for a bit, I realized that this could be easily done in around 100 lines of an SMT2 file, and probably be considerably faster.

Tools like this make me feel a lot better about all the time I wasted playing with predicate logic.

By @staplung - 7 months
I know essentially nothing about Z3 but it seems like there's a potential problem in the code. There's a section headed by the comment "All hits and miss counts must be positive" followed by a bunch of assertions that those numbers are greater than 0. Isn't it possible that you have 0 hits or misses? I mean, what if your season is short and you only face a couple of lefty pitchers and strike out both times?

In any case, I would explain the paradox differently than the author. The author says: "The key to understanding the paradox is that the players did not bat against the same set of pitchers. A batted against 5 lefties and 12 righties; B against 2 and 11."

I would say instead that the key to understanding the paradox is to observe that both players are much better when batting against lefties and that player A batted against lefties much more often, both in absolute and relative terms. In other words, A is not as good against lefties as B but he faced a lot more of these comparatively easy pitchers.
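This explanation can be made concrete by writing each overall average as a weighted mix of the two split averages, with the weight being the share of at-bats against lefties. The sketch below is only an illustration: the at-bat counts are the ones quoted from the article, the hit counts are inferred from the reported averages (an assumption), and `mix` is a hypothetical helper name.

```python
from fractions import Fraction

def mix(left, right):
    """Return (share of at-bats vs lefties, overall average).

    `left` and `right` are (hits, at_bats) pairs. The overall average
    is share_left * avg_left + (1 - share_left) * avg_right.
    """
    (hits_l, n_l), (hits_r, n_r) = left, right
    share_left = Fraction(n_l, n_l + n_r)
    avg_left = Fraction(hits_l, n_l)
    avg_right = Fraction(hits_r, n_r)
    overall = share_left * avg_left + (1 - share_left) * avg_right
    return share_left, overall

# Hit counts are inferred from the reported averages (an assumption).
a_share, a_overall = mix((2, 5), (2, 12))
b_share, b_overall = mix((1, 2), (2, 11))

print(f"A: {float(a_share):.3f} of at-bats vs lefties, overall {float(a_overall):.3f}")
print(f"B: {float(b_share):.3f} of at-bats vs lefties, overall {float(b_overall):.3f}")
```

With these numbers, A takes 5 of 17 at-bats against lefties and B only 2 of 13, so A's overall average leans more heavily on the matchup where both players hit better.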

By @cubefox - 6 months
A basic question I always had about Simpson's paradox: If X is positively correlated with Y, but X is also negatively correlated with its parts when Y is broken down (Simpson's paradox) – is it then more likely that X causes Y or that X causes not-Y?

This seems to be a pretty fundamental question but I have never seen it addressed.

By @hackandthink - 7 months
"The key to understanding the paradox is that the players did not bat against the same set of pitchers."

This is misleading. I am baseball ignorant but I feel this is a contrived and bad example for Simpson's paradox.

(UC Berkeley gender bias example is much better)

By @taeric - 6 months
I am not sure I agree with that conclusion. In this particular example, it is almost certainly more important to consider frequency and sample size? Could have been the exact same pitchers, per se.
By @zekrioca - 7 months
I dislike the baseball example, because it is too context-specific for the several billion other people who don’t follow the sport.
By @clircle - 7 months
Seems like an inappropriate use of the word "paradox". How about Simpson's intuitive situation?