November 1st, 2024

Oasis: A Universe in a Transformer

Oasis is an open-world AI model from Decart and Etched that generates real-time gameplay at 20 frames per second; the release includes a playable demo, with model scaling and performance optimization planned next.

Oasis is an open-world AI model developed by Decart and Etched, designed to generate real-time gameplay entirely through AI, without a traditional game engine. It responds to user inputs, allowing actions like moving, jumping, and interacting with objects in a dynamic environment. The model uses a 500M-parameter architecture featuring a spatial autoencoder and a latent diffusion backbone, both based on Transformers. Oasis achieves real-time output at 20 frames per second, significantly faster than existing text-to-video models, which can take up to 20 seconds to produce a single frame. Its efficiency is enhanced by Decart's inference engine and the upcoming Sohu ASIC, which is expected to support larger models at higher resolutions. Despite its impressive capabilities, Oasis faces challenges such as maintaining temporal stability, generalizing across domains, and providing precise control over game mechanics. Future development will focus on scaling the model and optimizing performance to address these issues. The release includes the model's code, weights, and a playable demo, marking a significant step toward more complex, interactive, AI-driven worlds.

- Oasis is the first playable, real-time, open-world AI model.

- It generates gameplay based on user inputs without a traditional game engine.

- The model operates at 20 frames per second, outperforming existing text-to-video models.

- Future improvements aim to enhance model scaling and address current limitations.

- The project includes a live demo and the release of the model's code and weights.
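The autoregressive loop implied by the summary (encode past frames into latents with the spatial autoencoder, predict the next latent with the diffusion backbone conditioned on the player's input, then decode back to pixels) can be sketched roughly as follows. All function bodies, names, and dimensions here are illustrative stand-ins, not Oasis's actual implementation:

```python
# Minimal sketch of an autoregressive latent-video loop, with toy
# stand-ins for the encoder, diffusion backbone, and decoder.
import numpy as np

LATENT_DIM = 16

def encode(frame: np.ndarray) -> np.ndarray:
    """Stand-in for the spatial autoencoder's encoder (frame -> latent)."""
    return frame.reshape(-1)[:LATENT_DIM]

def denoise(history: list, action: int) -> np.ndarray:
    """Stand-in for the latent-diffusion backbone: predicts the next
    latent from the latent history plus the player's input."""
    ctx = np.mean(history, axis=0)
    return ctx + 0.1 * action  # toy "dynamics", not a real denoiser

def decode(latent: np.ndarray) -> np.ndarray:
    """Stand-in for the decoder half of the autoencoder (latent -> frame)."""
    return np.tile(latent, 4)

def play(initial_frame: np.ndarray, actions: list) -> list:
    """One iteration per displayed frame; at 20 fps the whole
    encode/denoise/decode pass must finish in under 50 ms."""
    history = [encode(initial_frame)]
    frames = []
    for a in actions:
        nxt = denoise(history, a)
        history.append(nxt)
        frames.append(decode(nxt))
    return frames
```

The key property this sketch captures is that each frame is conditioned on previously *generated* latents, which is why small errors can compound over time.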

38 comments
By @redblacktree - 6 months
"If you were dreaming in Minecraft" is the impression that I get. It feels very much like a dream with the lack of object permanence. Also interesting is light level. If you stare at something dark for a while or go "underwater" and get to the point where the screen is black, it's difficult to get back to anything but a black screen. (I didn't manage it in my one playthrough)

Very odd sensation indeed.

By @robotresearcher - 6 months
I don't see how you design and ship a game like this. You can't design a game by setting model weights directly. I do see how you might eventually clone a game, minus the missing pieces like object permanence and other long-term state. But the inference engine is probably more expensive to run than the game engine it (somewhat) emulates.

What is this tech useful for? Genuine question from a long-time AI person.

By @thrance - 6 months
> It's a video game, but entirely generated by AI

I ctrl-F'ed the webpage and saw 0 occurrences of "Minecraft". Why? This isn't a video game; it's a poor copy of a real video game you didn't even bother to name, let alone credit.

By @blixt - 6 months
Super cool, and really nice to see the continuous rapid progress of these models! I have to wonder how long-term state (building a base and coming back later) as well as potentially guided state (e.g. game rules that are enforced in traditional code, or multiplayer, or loading saved games, etc) will work.

It's probably not by just extending the context window or making the model larger, though that will of course help, because fundamentally external state and memory/simulation are two different things (right?).

Either way it seems natural that these models will soon be used for goal-oriented imagination of a task – e.g. imagine a computer agent that needs to find a particular image on a computer, it would continuously imagine the path between what it currently sees and its desired state, and unlike this model which takes user input, it would imagine that too. In some ways, to the best of my understanding, this already happens with some robot control networks, except without pixels.

By @duendefm - 6 months
It's not a video game, it's a fast Minecraft screenshot simulator where the prompt between each frame is the state of the input plus the previous frames, with some semblance of coherence.
By @jiwidi - 6 months
So basically they trained a model on Minecraft. This is not general at all. It's not like the game comes from a prompt; it probably comes from a lot of fine-tuning on enormous datasets of Minecraft gameplay.

Would love to see some work like this but with world/games coming from a prompt.

By @whism - 6 months
Allow the user to draw into the frame buffer during play and feed that back, and you could have something very interesting.
By @brap - 6 months
Waiting line is too long so I gave up. Can anyone tell me, are the pixels themselves generated by the model, or does it just generate the environment which is rendered by “classical” means?
By @xyzal - 6 months
Maybe we should train models on Mario games to make Nintendo fight for the "Good Cause".
By @gessha - 6 months
I find this extremely disappointing. A diffusion transformer trained on Minecraft frames and accelerated on an ASIC... Okay?

From the demo (which doesn't work on Firefox) you can see that it's overfit to the training set and doesn't have consistent state transitions.

If you define it as a Markov decision process, with states being images, actions being keyboard/mouse inputs, and the transition probability being the transformer model, the model is a very poor one. Turning the mouse around shouldn't result in a completely different world; it should result in the exact same point in space from a different camera orientation. You can fake it by fudging the training data, augmenting it with walking a bit, doing a 360-degree camera rotation, and continuing the exploration, but that will just overfit to that specific seed.
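The camera-consistency property this MDP framing demands can be stated as a tiny toy transition function; this is purely illustrative and unrelated to Oasis's code:

```python
# Toy deterministic MDP: states are (position, heading) pairs, actions
# are camera turns. A consistent world must satisfy "four 90-degree
# turns are the identity", the property the commenter says Oasis violates.

def turn(state, action):
    """Transition function over state = (position, heading in degrees)."""
    pos, heading = state
    if action == "turn_right":
        heading = (heading + 90) % 360
    elif action == "turn_left":
        heading = (heading - 90) % 360
    return (pos, heading)

def consistent_full_rotation(state):
    """True iff a full 360-degree rotation returns the original view."""
    s = state
    for _ in range(4):
        s = turn(s, "turn_right")
    return s == state
```

A learned transition model that generates pixels directly has no such invariant built in; it has to be learned from (or faked in) the training data.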

The page says their ASIC's model inference supports 60+ players. Where are they shown playing together? What's the point of touting multiplayer performance when, realistically, the poor state transitions mean those 60+ players are playing single-player DeepDream Minecraft?

By @jmartin2683 - 6 months
Why? Seems like a very expensive way to vaguely clone a game.
By @piperly - 6 months
From a research perspective, this approach isn’t new; David Ha and Danijar Hafner explored similar ideas years ago. However, the technique itself and the achievement of deploying it for testing by hundreds of users is commendable. It feels more like an experimental prototype than a viable replacement for mainstream gaming.
By @shanim_ - 6 months
Could you explain how the interaction between the spatial autoencoder (ViT-based) and the latent diffusion backbone (DiT-based) enables both rapid response to real-time input and maintains temporal stability across long gameplay sequences? Specifically, how does dynamic noising integrate with these components to mitigate error compounding over time in an autoregressive setup?
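One plausible reading of "dynamic noising" (an assumption on my part, based on noise-augmentation ideas used in other autoregressive diffusion work, not a description of Oasis's method) is that the conditioning frames are partially re-noised during training so the backbone learns to tolerate its own imperfect outputs at inference time:

```python
# Hypothetical sketch of noise augmentation on context frames.
# The idea: train the denoiser on corrupted context so that, when it
# later conditions on its own (imperfect) generations, errors compound
# more slowly. Function name and parameters are illustrative.
import random

def noisy_context(frames, max_sigma=0.3):
    """Re-noise conditioning frames with a randomly drawn strength."""
    sigma = random.uniform(0.0, max_sigma)
    return [f + random.gauss(0.0, sigma) for f in frames]
```

Whether this matches the paper's actual dynamic-noising schedule would need to be checked against the released code.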
By @vannevar - 6 months
If anyone has ever read Tad Williams' Otherland series, this is basically the core idea. "The dream that is dreaming us."
By @djhworld - 6 months
I think this is really cool as a sort of art piece? It's very dreamlike and unsettling, especially with the music
By @0xTJ - 6 months
Seems like a neat idea, but too bad the demo doesn't work on Firefox.
By @amiramer - 6 months
So cool! Curious to see how it evolves. Seems like a portal into fully generated content, zero applications yet. So exciting. Will it also be promptable at some point?
By @joshdavham - 6 months
Incredible work! I think once we’re able to solidly emulate these tiny universes, we can then train agents within them to make even more intelligent AI.
By @aaladdin - 6 months
How would you verify that real world physics actually hold here? Otherwise, such breaches could be maliciously and unfairly exploited.
By @mrtnl - 6 months
Very cool tech demo! Curious to see whether we continue to generate environments at this level or move more toward generating the physics.
By @GaggiX - 6 months
Kinda hyped to see how this model (or a much bigger one) will run on Etched's transformer ASIC, Sohu, if it ever comes out.
By @th0ma5 - 6 months
This feels like a nice preview at the bottom of the kinds of unsolvable issues these things will always have to some degree.
By @TalAvner - 6 months
This is next level! I can't believe it's all AI generated in real time. Can't wait to see what's next.
By @goranim - 6 months
Love it! This virtual world looks so good, and it's also changing really fast, so it seems like a very powerful model!
By @Daroar - 6 months
I can see where they are going with it and wow! Truly the proof that we are all indeed in a simulation.
By @drdeca - 6 months
This apparently currently only supports chrome. I hope it will support non-chrome browsers in the future.
By @therein - 6 months
Queue makes it untestable. It isn't running client-side? What's with the queueing?
By @pka - 6 months
Negative comments are so weird; it's like people forgot what GPT-2 was like. I know this isn't completely new, but it's a world simulation inside a goddamn LLM. Not perfect, not coherent over longer time periods, but still insane. I swear, if tomorrow magic turned out to be real and wizards started controlling the literal fabric of the universe, people would be like "meh" before the week ends :D
By @gunalx - 6 months
Really cool tech demo. What impressed me most is the inference speed. But I don't really see any use for this unless there's a way to store world state, to avoid the issue of it forgetting what it just generated.
By @petersonh - 6 months
Very cool - has a very dreamlike quality to it
By @jhonj - 6 months
tried their not-a-game and it was SICK to play knowing it's not a game engine. really sick. When did these Decart ppl start working on this? must be f genius ppl
By @duan2112 - 6 months
Love it!!!
By @keidartom - 6 months
So cool!
By @robblbobbl - 6 months
Me gusta!
By @hesyechter - 6 months
Very very cool, I love it. Good luck!