Professional Poker Players Know the Optimal Strategy but Don't Always Use It
Professional poker players balance game theory optimal strategies with exploitative play to outsmart opponents. AI advancements challenge traditional approaches, pushing players to blend defensive and aggressive tactics for competitive success in evolving poker dynamics.
Professional poker players often know the optimal strategy, rooted in game theory, but may choose not to use it in favor of exploiting opponents' weaknesses. The concept of "game theory optimal versus exploitative play" is central in high-level poker discussions. Despite poker's basis in randomness and psychology, mathematicians have found ways to model optimal strategies, with John Nash's Nash equilibrium being a key concept. The AI revolution in poker has led to the development of superhuman algorithms that can play nearly perfectly, challenging traditional human strategies. Players now use software tools like "solvers" to study and improve their game, transitioning poker from an art to a science. While optimal play is defensive, exploitative play is more aggressive, and top players blend both strategies to stay competitive. The emergence of AI in poker has both detractors who feel it diminishes the game's magic and supporters who see it as adding a new layer of complexity. The poker landscape continues to evolve as players navigate between optimal and exploitative strategies to find success in the game.
Related
Analysing 16,625 papers to figure out where AI is headed next (2019)
MIT Technology Review analyzed 16,625 AI papers, noting deep learning's potential decline. Trends include shifts to machine learning, neural networks' rise, and reinforcement learning growth. AI techniques cycle, with future dominance uncertain.
'Superintelligence,' Ten Years On
Nick Bostrom's book "Superintelligence" from 2014 shaped the AI alignment debate, highlighting risks of artificial superintelligence surpassing human intellect. Concerns include misalignment with human values and skepticism about AI achieving sentience. Discussions emphasize safety in AI advancement.
Prepare for AI Hackers
DEF CON 2016 hosted the Cyber Grand Challenge where AI systems autonomously hacked programs. Bruce Schneier warns of AI hackers exploiting vulnerabilities rapidly, urging institutions to adapt to AI-devised attacks efficiently.
"Superhuman" Go AIs still have trouble defending against these simple exploits
Researchers at MIT and FAR AI found vulnerabilities in top AI Go algorithms, allowing humans to defeat AI with unorthodox strategies. Efforts to improve defenses show limited success, highlighting challenges in creating robust AI systems.
Game Designers Run Your Life
Games have a long history influencing human behavior, teaching lessons, and shaping interactions with technology. Designers use rewards to control player behavior, impacting society significantly.
It's interesting how this affects play. If a player is bluffing slightly less than they should be, the adjustment is drastic: you should never call with hands that do not beat their value hands. If they are bluffing optimally, you are supposed to call using what is known as "minimum defense frequency". What's interesting is that the minimum defense frequency is based only on the strongest hands you can possibly hold in that situation; the opponent's possible hands do not factor into it at all. It's required to prevent your opponent from profitably bluffing with any two cards.
To do the math, if you are on the river and the opponent bets 100 into 100, for this to be profitable for them they need to win 50% of the time or more. If your opponent is bluffing optimally, you need to call with 50% of the strongest hands you have in that specific situation (if you don't know what hands you have in a specific situation that's a problem) and sometimes they can be dogshit like King high.
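The arithmetic above fits in a few lines. This is just a sketch of the two fractions involved; the helper names are mine, not from any poker library:

```python
def bluff_breakeven_fold_frequency(bet: float, pot: float) -> float:
    """How often a pure bluff must get a fold to break even: bet / (bet + pot)."""
    return bet / (bet + pot)

def minimum_defense_frequency(bet: float, pot: float) -> float:
    """Fraction of your range you must continue with so that bluffing
    any two cards is not automatically profitable: pot / (pot + bet)."""
    return pot / (pot + bet)

# Betting 100 into a 100 pot: the bluff needs a fold half the time,
# so you must continue with the top 50% of the hands you can hold there.
print(bluff_breakeven_fold_frequency(100, 100))  # 0.5
print(minimum_defense_frequency(100, 100))       # 0.5

# A smaller bet needs folds less often, but must be defended more:
print(minimum_defense_frequency(50, 100))        # ~0.667
```

Note that the bet size alone pins down both numbers, which is exactly why the opponent's holdings don't enter into the minimum defense frequency.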
But, very important to note, very few players actually bluff enough and if they bluff less than they should you should only ever call when your hand actually beats their range of possible value hands. (value vs bluff is kind of a difficult thing to communicate, generally it's value if you want your opponent to call)
Most players don't bluff enough as a result of most players calling too much! When they call too much you should obviously not bluff! This leads to very boring games of poker.
I've actually been working slowly on https://github.com/krukah/robopoker, an open-source Rust implementation of Pluribus, the SOTA poker AI. What I've found interesting is the difference in how I approach actually playing poker versus how I approach building a solver. Playing the game naturally consists of reasoning about narratives and incorporating information like hand history, play style, live tells. Whereas solving the game is about evaluating tradeoffs between the guarantees of imperfect-information game theory and the constraints of Texas Hold'em, finding a balance between abstract and concrete reasoning.
Basically, play one level - exactly one level - beyond where you peg your opponents at.
Poker is not about playing cards. It's about playing people. Cards is just how we keep it civil and not too personal.
https://www.livepokertheory.com
I do personally dislike that GTO became the nomenclature, as I prefer "theory-based", since it causes this confusion, but trying to fight it at this point is hopeless because GTO is the search term people are using. And when people say they "play GTO" they usually mean "equilibrium" rather than "optimal against my specific opponents", which is "exploitative".
If you actually watch what the top players advocate, everyone suggests you want to play exploitatively. However, there's one equilibrium solution and effectively infinite exploitative solutions, so equilibrium is a reasonable starting point for developing a baseline understanding of the mechanics of the game. It's tough to know how much bluffing is "too much" unless you know the baseline.
Furthermore, if you "exploit" people by definition you are opening yourself up to being exploited so you need to be very careful your assumptions are true.
Also, with solvers like piosolver, you can "node lock" (tell a node in the game tree to play the way your opponent does, rather than the equilibrium way), but there are many pitfalls, such as the solver compensating in very unnatural ways at other nodes, and the impracticality of "locking" a strategy at every node in the tree. There's a newer idea called "incentives", which gives the solver an incentive to play more like a human would (e.g. calling too much), but these approaches are still being actively explored.
Rock paper scissors is frequently used to explain GTO but it's not the best example because equilibrium in rock paper scissors will break even against all opponents, but equilibrium poker strategy will actually beat most human poker players, albeit not as much as a maximally exploitative one.
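The rock-paper-scissors point is easy to verify numerically. This toy sketch (my own, not from the article) shows that the uniform mix has zero expected value against any opponent strategy, while a targeted exploit does better:

```python
# Payoff matrix: row = our throw, column = opponent's throw;
# +1 for a win, -1 for a loss, 0 for a tie. Order: rock, paper, scissors.
PAYOFF = [
    [ 0, -1,  1],  # rock
    [ 1,  0, -1],  # paper
    [-1,  1,  0],  # scissors
]

def ev(ours, theirs):
    """Expected payoff of our mixed strategy against theirs."""
    return sum(ours[i] * theirs[j] * PAYOFF[i][j]
               for i in range(3) for j in range(3))

uniform = [1/3, 1/3, 1/3]
print(ev(uniform, [1, 0, 0]))        # 0.0 — only breaks even vs always-rock
print(ev(uniform, [0.5, 0.3, 0.2]))  # ~0.0 — breaks even vs any mix
print(ev([0, 1, 0], [1, 0, 0]))      # 1 — always-paper exploits always-rock
```

In poker, by contrast, most humans play far enough from equilibrium that the equilibrium strategy itself shows a profit against them, which is why the RPS analogy understates its practical value.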
There's two other huge pieces this article glosses over:
1) It's as impossible for a human to play like a computer in poker as it is in chess, and in fact even more so, because in poker you need to implement mixed strategies. In chess there's usually a single best move, but in poker the optimal solution often involves doing one thing 30% of the time and something else 70% of the time. The problem is that not only are there too many situations to memorize all the solutions, but actually implementing the correct frequencies is impossible for a human. Some players like to use "randomizers" at the table, such as dice or glancing at a clock, but I find that somewhat silly since it's still so unlikely you are anywhere near equilibrium.
2) Reading someone's "tells" live is still a thing. While solvers and widespread "real time assistance" have contributed to online poker's decline, live poker is booming (the 2024 World Series of Poker Main Event just broke the record yet again), and in live poker people still give off information about their hand via body language. From the 70s to the early 2000s, people were somewhat obsessed with "tells" as a way to win at poker. Since computers have advanced so much, that has fallen out of favor, but the truth is, both are useful. It's totally mistaken to think that advances in poker AI, GTO, and solvers have rendered live reads obsolete. In fact, in 2023, Tom Dwan won the biggest pot in televised poker history ($3.1 million) and credited a live read for his decision, in a spot where the solver would randomize between a call and a fold.
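The mixing problem in point 1 is trivial for a machine and hard for a person. A toy randomizer (names and frequencies are illustrative, not from any solver) that plays a 30/70 mix might look like:

```python
import random

def mixed_action(strategy: dict, rng: random.Random) -> str:
    """Sample one action from solver-style frequencies, e.g. {'bluff': 0.3, 'check': 0.7}."""
    actions, weights = zip(*strategy.items())
    return rng.choices(actions, weights=weights, k=1)[0]

rng = random.Random(7)  # seeded only so the demo is reproducible
picks = [mixed_action({"bluff": 0.3, "check": 0.7}, rng) for _ in range(10_000)]
print(picks.count("bluff") / len(picks))  # close to 0.3 over many samples
```

The machine hits the target frequency effortlessly over thousands of hands; a human sampling by feel drifts toward patterns, which is exactly the gap exploitative opponents look for.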