Getting the World Record in Hatetris (2022)
David and Felipe set a world record in HATETRIS, a notoriously difficult Tetris variant. They used Rust, Monte-Carlo Tree Search, and AlphaZero concepts to surpass the previous record of 66 points set by knewjade in 2021.
The article discusses the journey of David and Felipe in achieving the world record in HATETRIS, a challenging version of Tetris created by Sam Hughes. HATETRIS uses a deterministic algorithm that always selects the worst possible piece, making gameplay extremely difficult. The record progression involved human players until a breakthrough in 2021, when knewjade achieved a score of 66 points using machine learning. The team experimented with different programming languages and settled on Rust for its speed. They implemented a Monte-Carlo Tree Search (MCTS) algorithm, transitioning from tree searches to directed-acyclic-graph (DAG) searches to optimize memory usage. The article highlights the complexities and challenges faced in developing a winning strategy, emphasizing the importance of academic research and adaptation in the project. The team's efforts culminated in using MCTS and AlphaZero concepts to train a model for gameplay, leading to significant advancements in their HATETRIS performance.
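For readers unfamiliar with MCTS, here is a minimal sketch of the four classic phases (select, expand, roll out, back-propagate) on a toy game. This is an illustration of the general technique only, not the authors' Rust implementation; the toy game (add 1 or 2 to a running total for three moves) is an assumption chosen to keep the example self-contained.

```python
import math
import random

ACTIONS = (1, 2)   # toy game: repeatedly add 1 or 2 to a running total
HORIZON = 3        # episode ends after three moves; reward = final total

class Node:
    def __init__(self, total, depth, parent=None):
        self.total, self.depth, self.parent = total, depth, parent
        self.children = {}               # action -> Node
        self.visits, self.value = 0, 0.0

def ucb1(child, parent, c=1.4):
    """Selection score: average reward plus an exploration bonus."""
    if child.visits == 0:
        return float("inf")
    return (child.value / child.visits
            + c * math.sqrt(math.log(parent.visits) / child.visits))

def mcts(iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node(0, 0)
    for _ in range(iterations):
        # 1. Select: walk down fully expanded nodes via UCB1.
        node = root
        while node.depth < HORIZON and len(node.children) == len(ACTIONS):
            node = max(node.children.values(), key=lambda ch: ucb1(ch, node))
        # 2. Expand: add one untried child, if the node isn't terminal.
        if node.depth < HORIZON:
            action = rng.choice([a for a in ACTIONS if a not in node.children])
            node.children[action] = Node(node.total + action, node.depth + 1, node)
            node = node.children[action]
        # 3. Roll out: play random moves to the horizon.
        total, depth = node.total, node.depth
        while depth < HORIZON:
            total += rng.choice(ACTIONS)
            depth += 1
        # 4. Back-propagate the result to the root.
        while node is not None:
            node.visits += 1
            node.value += total
            node = node.parent
    # Final move choice: the most-visited root child.
    return max(root.children, key=lambda a: root.children[a].visits)
```

With enough iterations the visit counts concentrate on the better opening move; the AlphaZero twist the article describes replaces the random rollout with a learned evaluation.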
Related
Tetris Font (2020)
The Tetris Font, designed by Erik and Martin Demaine, features letters made of Tetris pieces, challenging users with puzzle elements. Created in 2020, it showcases the complexity of Tetris in a unique typographic experience.
Eight million pixels and counting: improving texture atlas allocation in Firefox (2021)
Improving texture atlas allocation in WebRender with the guillotiere crate reduces texture memory usage. The guillotine algorithm was replaced due to fragmentation issues, leading to a more efficient allocator. Visualizing the atlas in SVG aids debugging. Rust's simplicity and Cargo fuzz testing are praised for code development and robustness. Enhancements in draw call batching and texture upload aim to boost performance on low-end Intel GPUs by optimizing texture atlases.
Solving puzzles faster than humanly possible
The Opus Magnum challenge tasks players with automating puzzle-solving to optimize Cost, Cycles, and Area metrics. Participants submit solutions for evaluation, exploring automated vs. human strategies, hybrid approaches, scoring systems, mods, and bots.
Homegrown Rendering with Rust
Embark Studios develops a creative platform for user-generated content, emphasizing gameplay over graphics. They leverage Rust for 3D rendering, introducing the experimental "kajiya" renderer for learning purposes. The team aims to simplify rendering for user-generated content, utilizing Vulkan API and Rust's versatility for GPU programming. They seek to enhance Rust's ecosystem for GPU programming.
Spending too much time optimizing for loops
Researcher Octave Larose shared insights on optimizing Rust interpreters, focusing on improving performance for the SOM language. By enhancing loop handling and addressing challenges, significant speedups were achieved, balancing code elegance with efficiency.
Losing the World Record in Hatetris (2023) - https://news.ycombinator.com/item?id=36558013 - July 2023 (1 comment)
Getting the World Record in Hatetris - https://news.ycombinator.com/item?id=32495842 - Aug 2022 (1 comment)
Hatetris – Tetris which always gives you the worst piece - https://news.ycombinator.com/item?id=27063894 - May 2021 (245 comments)
Hatetris - https://news.ycombinator.com/item?id=4846607 - Nov 2012 (2 comments)
Hatetris: Tetris That Hates You - https://news.ycombinator.com/item?id=1253492 - Apr 2010 (19 comments)
In addition to the impressive technical details, this is some really beautiful writing
I simmered on the latter overnight, and a few thoughts occurred to me:
- a parameter for fullness (holes + filled blocks that are below the "surface"). It might be its own parameter, or used to augment other parameters to discourage behavior that might lead to the end game.
- reachable surface height: rather than taking the lowest height of the surface, compute the reachable height of the following piece
- an alternate definition of reachable surface height: the lowest point any of the seven pieces can reach (perhaps augmented by the number of pieces that can reach it)
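The "fullness" idea above can be made concrete. This is a sketch under an assumed representation (the well as a 0/1 grid, top row first), not code from the project; note that fullness as defined collapses to the sum of column heights, since every cell below a column's surface is either a filled block or a covered hole.

```python
def column_heights(well):
    """Height of the topmost filled cell per column, measured from
    the floor (0 = empty column). `well` is rows of 0/1, top first."""
    rows = len(well)
    heights = []
    for c in range(len(well[0])):
        heights.append(next((rows - r for r in range(rows) if well[r][c]), 0))
    return heights

def fullness(well):
    """Holes + filled blocks below the surface. Every cell under a
    column's surface is either filled or a covered hole, so this is
    exactly the sum of the column heights."""
    return sum(column_heights(well))
```

For example, a column whose surface sits three rows high contributes 3 to fullness regardless of how many of those cells are holes, which is what makes the parameter punish burying empty cells.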
"We’re not researchers at all. We’re just two people who became obsessed with a problem and put their meager knowledge to use, beating rocks against rocks in different configurations until something resembling a spearhead came out. ... You, too, can do something like this. Find a problem. Become obsessed with it. Learn everything you can about it. Fall down dead ends. Give up, and then keep thinking about the problem at night. Have Eureka moments that lead you down other dead ends. ... "
I thought I’d work on it further and maybe write some blog or put some dev vlogs on YouTube but never got around to doing it. Might do it some day.
Till then I’ll just post some links to my Github if anyone wants to check out what I was doing:
1. https://github.com/krkartikay/AlphaZero-proto
Sure, new methods will always replace old methods, just like CNNs replaced SIFT for image processing. However, I feel that beam search is one of those elementary methods that you'd always want to check first, similar to A* and quicksort. Even though there are fancy neural networks for game optimization, pathfinding and sorting, it's easier to get started with elementary methods because they're simple and tractable.
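The simplicity being praised here is real: a generic beam search is only a few lines. The sketch below is a minimal illustration on a toy problem (building the largest number digit by digit), not tied to HATETRIS or any particular codebase.

```python
import heapq

def beam_search(start, expand, score, width=3, steps=5):
    """Keep only the `width` best states at each depth, instead of
    all of them as exhaustive search would."""
    beam = [start]
    for _ in range(steps):
        candidates = [s for state in beam for s in expand(state)]
        if not candidates:
            break
        beam = heapq.nlargest(width, candidates, key=score)
    return max(beam, key=score)

# Toy example: append one digit per step, scoring by numeric value.
best = beam_search(
    0,
    expand=lambda n: [n * 10 + d for d in range(10)],
    score=lambda n: n,
    width=2,
    steps=3,
)
```

The whole algorithm is the `width` parameter: width 1 is greedy search, width infinity is breadth-first search, and everything in between trades memory for solution quality.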
It's horrible; this is what hell will look like.
Forever trapped waiting for a piece that will never come - by design.
I've been doing a fair bit with Monte Carlo search lately to explore different spaces, but mine has been in the context of Llama.cpp and Magic: The Gathering. [1] I'm going to compare and contrast my approaches with the approach used by the authors.
First, I want to ask a question (either to the authors, or to the general audience), about trees vs. DAGs. The author wrote:
> It was only after the initial tree implementation that we considered that there would be a lot of repeated wells: after all, in this game you can reach the same position in a number of different ways. A lot of deliberation and a re-factored codebase later, we opted for a directed acyclic graph (DAG) instead, taking identical positions and merging them into the same entry in a graph, rather than making them distinct entries on a tree. This complicated things significantly, but reduced our memory needs by an order of magnitude, at least. Refactoring from tree searches to DAG searches was more work than we’d expected to put in to the project, but was yielding promising results.
I'm curious about why this was such a large refactor. I've made this change in two different projects now, and in both cases, I was able to change from an exhaustive tree structure to what is essentially a DAG by simply searching for duplicate states and culling any branches that are equivalent. For instance, I implemented this change in Llama.cpp and got a >400% speedup -- and it was essentially a two-line change -- no drastic refactoring needed. [2]
Am I missing something here? How is a culled tree structure fundamentally different from a "true" DAG -- and more importantly -- would I stand to gain even more performance by switching to something else? I don't see what's so complicated about this, but I feel like I must be missing something.
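The culling approach described above can be sketched in a few lines. This toy (a walker on the integer line, where many move sequences reach the same position) is my own illustrative assumption, but it shows why merging duplicates collapses exponential work to linear: the `seen` check is effectively the "two-line change."

```python
def count_states(start, expand, depth, dedupe):
    """Breadth-first expansion, counting every generated child state.

    With dedupe=True, a state reached by a second path is merged
    (skipped), turning the implicit search tree into a DAG."""
    frontier, seen = [start], {start}
    generated = 0
    for _ in range(depth):
        nxt = []
        for state in frontier:
            for child in expand(state):
                generated += 1
                if dedupe:
                    if child in seen:
                        continue   # duplicate position: merge, don't re-expand
                    seen.add(child)
                nxt.append(child)
        frontier = nxt
    return generated

# Toy state space: a position on the integer line, moves of -1 / +1.
moves = lambda x: (x - 1, x + 1)
```

Ten plies deep, the undeduplicated tree generates 2,046 states while the deduplicated version generates 38, since most paths collide.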
> we discovered that the majority of our time was now spent accessing the hash of positions, since whenever a new position was explored, we had to check if it already existed. A bit of a further dig and we discovered that most of that time was spent running a hashing algorithm on the position data for comparison. Which again, made sense…but we knew all our positions were unique among each other. We didn’t need to do any hashing, we could just use the position’s representation as a key directly, if we could find a reasonable way to encode it. So we replaced the hashing algorithm with our own custom version that encoded the position as a binary representation of position, rotation and piece type. This was much faster than having to hash a position, and we knew it guaranteed uniqueness. This doubled the speed of our move finder.
In the case of my Magic: The Gathering searcher, I am using a text summary of the game state (essentially `gameState.toString()`) -- what's on the battlefield, what's in the hand, what's in the graveyard, how much mana is available, etc -- as the key. I sort the cards in each location alphabetically, so that it doesn't matter which order they were put there -- only what's actually there. This GREATLY increased the speed of execution, but again -- it was just a simple change.
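The canonicalization trick described above, with hypothetical card names and zone structure standing in for a real game state, looks roughly like this:

```python
def state_key(battlefield, hand, graveyard, mana):
    """Canonical game-state key: sort each zone so that the order
    cards arrived in doesn't produce distinct keys for what is
    actually the same state."""
    zones = (sorted(battlefield), sorted(hand), sorted(graveyard))
    return "|".join(",".join(z) for z in zones) + f"|mana={mana}"
```

Two lines of play that end with the same cards in the same zones now hash to the same key, so the second one can be culled.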
That said, the fire graph showing how much time is spent in string processing is eye-opening, and it makes me think that I should probably profile my Python code with a similar tool, because a binary representation might be worth it.
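For comparison, the binary encoding the authors describe amounts to bit-packing the fields into one integer. The field widths below are my assumptions for illustration (a well about 10 wide and 20 tall, 4 rotations, 7 piece types), not the authors' actual Rust encoding:

```python
def encode(x, y, rotation, piece):
    """Bit-pack a move state into one integer. Because each field gets
    its own bit range, the key is collision-free by construction and
    no hash function needs to run at all."""
    assert 0 <= x < 16 and 0 <= y < 32 and 0 <= rotation < 4 and 0 <= piece < 7
    return x | (y << 4) | (rotation << 9) | (piece << 11)

def decode(key):
    """Recover (x, y, rotation, piece) from a packed key."""
    return key & 0xF, (key >> 4) & 0x1F, (key >> 9) & 0x3, key >> 11
```

Using such an integer as a dict key skips string building and hashing of large objects entirely, which is presumably where the doubled speed came from.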
It's also fascinating to me that the author attempted to train an ML model to play the game. My initial thought was that an exhaustive search was the only way to approach this (because it's perhaps the only way to arrive at a max score that is provably correct), but I think I wasn't truly appreciating the vastness of the search space at play here: the space of all possible inputs may simply be too large to enumerate.
All that said, this read has kept me on the edge of my seat. Kudos to the developers and authors for writing this up for us -- and CONGRATULATIONS ON CAPTURING THE RECORD!!! Even though it was later surpassed, that should in no way diminish the satisfaction of the accomplishments here.
It's an absolute blast reading this. I've enjoyed it, I've learned a ton, and I'm going to try implementing some of their suggestions in my own projects!! Thank you!! It's articles like this that keep me coming back to Hacker News day after day. :)