July 2nd, 2024

The History of Machine Learning in Trackmania

Trackmania, a competitive racing game, has sparked interest in machine learning applications. The goal is to train a neural network to achieve superhuman performance in the game's Cup of the Day event. Various players and groups have developed autonomous programs to play Trackmania, such as Rottaca's supervised learning approach and TMRL's reinforcement learning pipeline. Laurens Nienders improved performance by incorporating track curvature lookahead, while AndrejGobeX experimented with different learning algorithms. Results varied, with some programs showing promising advances but still falling short of human-level performance. The ultimate aim is a program that can excel at Trackmania using machine learning techniques on a single desktop PC. The intersection of gaming and machine learning presents exciting challenges and opportunities for innovation in the field.

10 comments
By @donadigo - 6 months
This is an awesome overview, and if you want more, most of these projects are documented in an approachable way on YouTube.

Just wanted to provide some perspective here on how many things these projects need to take care of in order to get a training setup going.

I'm the developer behind TMInterface [1], mentioned in this post, which is a TAS tool for the older TrackMania game (Nations Forever). For Linesight (the last project in this post), I recently ended up working with its developers to provide the APIs they need from the game. There are a lot of things RL projects usually want to do: speed up the game (one of the most important), deterministically control the vehicle, get simulation information, navigate menus, skip cutscenes, make save states, capture screenshots, etc. Having each of those implemented natively greatly impacts the stability and performance of training/inference for an RL agent. For example, the latest version of the project uses a direct capture of the surface that's rendered to the game window, instead of an external Python library (DxCam). This is faster, doesn't require any additional setup, and also allows training even if the game window is completely occluded by other windows.

There are also many other smaller annoying things: many games throttle FPS when the window is unfocused, which is the case here too, so the tool patches out that behaviour for the project, and there's a lot more like this. The newest release of Linesight V3 [2] can reliably approach world records, and it's being trained and experimented with by quite a few people. The developers made it easy to set up and documented a lot of the process [3].
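
To give a flavor of what a training loop drives through such an API, here's a rough sketch of a minimal client built on the TMInterface Python package. The callback and method names follow my memory of its docs and may differ between versions, so treat them as approximate rather than authoritative:

    # Rough sketch of a minimal TMInterface client (tminterface package).
    # Names are from memory of the docs and may vary by version.
    from tminterface.client import Client, run_client
    from tminterface.interface import TMInterface

    class RLBridge(Client):
        def on_registered(self, iface: TMInterface):
            print(f'Connected to {iface.server_name}')
            iface.set_speed(8.0)  # run the game faster than real time

        def on_run_step(self, iface: TMInterface, _time: int):
            state = iface.get_simulation_state()  # position, velocity, ...
            # ...feed `state` to the agent, then inject its action:
            iface.set_input_state(accelerate=True, steer=-12000)

    run_client(RLBridge())

Everything else the comment lists (save states, menu navigation, screenshot capture) rides on the same connection, which is why having it native matters so much for throughput.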

[1] https://donadigo.com/tminterface/

[2] https://youtu.be/cUojVsCJ51I

[3] https://linesight-rl.github.io/linesight/build/html/

By @squigz - 6 months
I just got into Trackmania recently. Very difficult game, especially on a keyboard, but fun! It's crazy to see how dedicated the pros are. I got into it after watching the streamer Wirtual try to beat the hardest map the game has seen (Deep Dip 2), for a prize pool of something like $30,000. It's an insanely hard tower climb map, where if you fall, you have to start completely over. A 1-2 hour run could just disappear. Anyway, over just a few weeks, Wirtual put several hundred hours into the map, with over 1,500 falls... and then gave up, understandably :P
By @Macuyiko - 6 months
I follow RL from the sidelines (I've dabbled with it myself), and have seen some of the cool videos the article also lists. I think one of the key points the article makes (and a bit of a personal nitpick of mine) is this:

> Thus far, every attempt at training a Trackmania-playing program has trained the program on one map at a time. As a result, no matter how well the network did on one track, it would have to be retrained - probably significantly retrained

This is a crucial aspect when talking about RL. Most Trackmania AI attempts focus on one track at a time, which is not really a problem here, since the goal is to outperform the best human racers on a given individual track.

However, it is this nuance that a lot of more business-oriented users don't get when being sold on some fancy new RL project. In the real world (think self-driving cars), we typically want agents that can generalize far more.

Most of the RL techniques we have do rather well in these kinds of constrained environments (in a sense, they eventually start overfitting to the given environment), but making them behave well in more varied environments is much harder. A lot of beginner RL tutorials also fail to make this explicit, and will e.g. show how to train an agent to find the exit of a maze without ever trying it on a newly generated maze :).
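
To make the maze example concrete, here's a toy sketch of my own (using gymnasium's FrozenLake with randomly generated maps, nothing from the article): tabular Q-learning trained on one fixed maze, then evaluated on a freshly generated one, where the memorized policy usually falls apart.

    # Toy illustration: a tabular agent overfits one maze and fails on a
    # newly generated one. Assumes gymnasium; illustrative, not tuned.
    import numpy as np
    import gymnasium as gym
    from gymnasium.envs.toy_text.frozen_lake import generate_random_map

    def train(env, episodes=5000, alpha=0.1, gamma=0.99, eps=0.1):
        q = np.zeros((env.observation_space.n, env.action_space.n))
        for _ in range(episodes):
            s, _ = env.reset()
            done = False
            while not done:
                if np.random.rand() < eps:
                    a = env.action_space.sample()
                else:
                    a = int(np.argmax(q[s]))
                s2, r, term, trunc, _ = env.step(a)
                q[s, a] += alpha * (r + gamma * np.max(q[s2]) - q[s, a])
                s, done = s2, term or trunc
        return q

    def success_rate(env, q, episodes=100):
        wins = 0
        for _ in range(episodes):
            s, _ = env.reset()
            done, r = False, 0.0
            while not done:
                s, r, term, trunc, _ = env.step(int(np.argmax(q[s])))
                done = term or trunc
            wins += r > 0  # FrozenLake pays 1.0 only on reaching the goal
        return wins / episodes

    train_env = gym.make("FrozenLake-v1", desc=generate_random_map(4), is_slippery=False)
    q = train(train_env)
    print("maze seen in training:", success_rate(train_env, q))

    fresh_env = gym.make("FrozenLake-v1", desc=generate_random_map(4), is_slippery=False)
    print("freshly generated maze:", success_rate(fresh_env, q))

The first number approaches 1.0; the second is usually near 0. That gap is exactly what the beginner tutorials gloss over.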

By @yuriks - 6 months
Wanted to point out that Linesight, the final project described in the article, has since released a new update last month, and it now beats world records on about a dozen maps, official and user-made: https://www.youtube.com/watch?v=cUojVsCJ51I It's some really impressive stuff.
By @programd - 6 months
Make sure to read the followup post linked at the bottom of this one. It's vastly entertaining, in a watching-an-open-source-train-wreck kind of way. You have to admire the persistence.

Tangentially related: is anybody besides the autonomous-car folks developing games or virtual environments designed from the ground up to export machine learning APIs? By this I mean exporting game state and accepting game controls over the network, without going through adapter contortions.
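
As a sketch of what I mean — entirely hypothetical, with a made-up port and made-up field names — the game side would just speak something like JSON lines over a socket, and the agent side would be this small:

    # Purely hypothetical sketch of an "ML-native" game API: the game
    # exposes state and accepts controls as JSON lines over TCP, so no
    # screen capture or input-injection adapters are needed. The address,
    # port, and field names here are all invented for illustration.
    import json
    import socket

    sock = socket.create_connection(("localhost", 9000))
    rfile = sock.makefile("r")

    def step(controls: dict) -> dict:
        """Send one control frame; block until the next state frame."""
        sock.sendall((json.dumps(controls) + "\n").encode())
        return json.loads(rfile.readline())

    state = step({"throttle": 1.0, "steer": -0.2})
    print(state.get("speed"), state.get("checkpoint"))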

By @emporas - 6 months
> Nienders concluded that this was due to the difference in the information available. Sophy had information about the track curvature of the upcoming 6 seconds of track, based on the current speed. TMRL, however, only had distance measurements from the LIDAR. While the TMRL program could plan for the next turn, it could not plan two turns ahead, and this fundamentally limited the program to mere safe driving, avoiding walls and crashes, but never optimizing.

I think that point is an important one. ML algorithms work better when they are given better context. Especially in programming, it is clear the models are trained on code rather than on repositories. They know about files and repositories, but I always get the impression that they are totally clueless about whole programs.

What could be done better for code is to provide more data during training about where each function is located in the project, which other files define or call similar functions, and so on. In general, before code is fed into training, do a little data mining on the project, the kind of thing the tree-hugger project [1] enables. Tree-hugger is somewhat older code, however, and tree-sitter has advanced a lot over the last 4 years.
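
As a sketch of the kind of mining I mean (my own example; the py-tree-sitter binding API has shifted between versions, so the exact calls are approximate):

    # Walk one file with tree-sitter and record where each function lives:
    # the sort of location metadata that could accompany code in training.
    # Assumes the tree_sitter and tree_sitter_python packages; the binding
    # API has changed across versions, so treat these calls as approximate.
    import tree_sitter_python
    from tree_sitter import Language, Parser

    parser = Parser(Language(tree_sitter_python.language()))
    source = open("example.py", "rb").read()
    tree = parser.parse(source)

    def functions(node, path="example.py"):
        if node.type == "function_definition":
            name = node.child_by_field_name("name")
            yield (path, name.text.decode(), node.start_point[0] + 1)
        for child in node.children:
            yield from functions(child, path)

    for path, name, line in functions(tree.root_node):
        print(f"{path}:{line} def {name}")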

In my opinion, a 5x to 10x improvement on code is within reach, with no need to increase GPU compute or electricity.

[1] https://github.com/autosoft-dev/tree-hugger

By @msephton - 6 months
Nice work! An enjoyable read. Edit: And the newer post!

I do love Trackmania. I'd like to play 2020 but alas I do not have a compatible computer. I mostly play the Wii version.

By @smokel - 6 months
For those who only read the comments, be sure to check out the videos by Yosh. They're amazing, and do a great job explaining how reinforcement learning works in practice:

https://youtube.com/watch?v=Dw3BZ6O_8LY

https://youtube.com/watch?v=kojH8a7BW04

By @jamesrom - 6 months
It never made sense to me why they raycast from the car. Humans don't play this way. The car is an abstraction; the model doesn't need to care about it.

It literally doesn't matter whether it's 1 pixel or 100 meters to the wall; just learn not to hit it.

Instead, measure from the _camera_. That's all that matters. That's what humans do when we play.

Bonus: with this added perspective you'll be able to drive maps with hills and jumps, not just flat maps.
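
A toy sketch of the difference (my own, not from any of these projects): build the ray fan in the camera frame, so the rays pitch with the view instead of staying glued to the car's horizontal plane.

    # Generate ray directions across the camera's horizontal FOV. Because
    # the camera pitches with the view, the rays can see over crests and
    # down ramps, unlike a flat fan anchored to the car body.
    import numpy as np

    def camera_rays(cam_pos, cam_forward, cam_up, n=16, fov_deg=90.0):
        right = np.cross(cam_forward, cam_up)
        angles = np.radians(np.linspace(-fov_deg / 2, fov_deg / 2, n))
        dirs = [np.cos(a) * cam_forward + np.sin(a) * right for a in angles]
        return [(np.asarray(cam_pos), d / np.linalg.norm(d)) for d in dirs]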

By @budududuroiu - 6 months
Always wondered if something similar would be possible with milsim games like DCS World.

E.g., could you improve or replicate missile intercept algorithms?