The Matrix: Infinite-Horizon World Generation with Real-Time Interaction
The Matrix project aims to create an immersive digital universe with real-time interactions, featuring AAA-level graphics, frame-level precision, and open-sourced data to promote further research in world simulation technology.
The Matrix project represents a significant advance in real-time world simulation, aiming to create an immersive digital universe reminiscent of the film "The Matrix." Developed by researchers from Alibaba Group, the University of Hong Kong, and the University of Waterloo, the system achieves frame-level precision in user interactions and delivers visuals that are nearly indistinguishable from reality. It claims infinite generative capacity, allowing endless exploration of diverse environments including deserts, cities, and forests. The Matrix distinguishes itself from other generative models through AAA-level graphics, high resolution, and robust domain generalization, letting users navigate dynamic landscapes seamlessly. The project uses a dedicated GameData Platform to collect high-quality, precisely aligned action-frame pairs from various games, which will be open-sourced to foster further research and innovation. The Matrix operates at 16 frames per second and demonstrates strong adaptability from virtual to real-world settings. This work showcases the potential of AI in crafting interactive worlds and lays a foundation for future developments in immersive technology.
- The Matrix project aims to create a fully immersive digital universe with real-time interaction.
- It features frame-level precision and AAA-level visuals, allowing for seamless exploration of diverse environments.
- The project utilizes a unique data collection platform to ensure high-quality datasets for training.
- It operates at 16 FPS, showcasing adaptability from virtual to real-world scenarios (see the interaction-loop sketch after this list).
- The open-sourcing of data aims to encourage further research and innovation in world simulation technology.
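To make the interaction model concrete, here is a minimal sketch of a frame-level, action-conditioned generation loop of the kind the list above describes. This is not The Matrix's actual code: `WorldModel`, `Action`, and `run` are illustrative stand-ins, and the generation step is stubbed so the loop runs. The only details taken from the article are the 16 FPS rate and the idea that exactly one user action is consumed per generated frame.

```python
import time
from dataclasses import dataclass
from typing import Callable, List

import numpy as np

FPS = 16                  # generation rate reported by the project
FRAME_BUDGET = 1.0 / FPS  # ~62.5 ms per frame

@dataclass
class Action:
    """A single user input, captured at frame granularity."""
    move: str  # e.g. "forward", "left", "right", "stop"

class WorldModel:
    """Stand-in for an action-conditioned video model. A real system would
    run an autoregressive generation step here; this stub returns a blank
    720p frame so the loop is runnable."""
    def step(self, history: List[np.ndarray], action: Action) -> np.ndarray:
        return np.zeros((720, 1280, 3), dtype=np.uint8)

def run(model: WorldModel,
        get_action: Callable[[], Action],
        render: Callable[[np.ndarray], None],
        seconds: float = 2.0) -> None:
    """Frame-level interaction: exactly one action is consumed per frame."""
    history: List[np.ndarray] = []
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        start = time.monotonic()
        frame = model.step(history, get_action())
        history.append(frame)
        history = history[-32:]  # bounded context enables unbounded rollouts
        render(frame)
        # sleep off whatever remains of the per-frame budget
        time.sleep(max(0.0, FRAME_BUDGET - (time.monotonic() - start)))

run(WorldModel(), lambda: Action("forward"), lambda frame: None)
```

The bounded-history trick in the loop is one plausible way an "infinite-horizon" rollout could stay within fixed memory; the article does not specify the mechanism.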
Related
Diffusion Models Are Real-Time Game Engines
GameNGen, developed by Google and Tel Aviv University, simulates DOOM in real-time at over 20 frames per second using a two-phase training process, highlighting the potential of neural models in gaming.
New AI model can hallucinate a game of 1993's Doom in real time
Researchers from Google and Tel Aviv University developed GameNGen, an AI model that simulates Doom in real time, generating over 20 frames per second, but faces challenges with graphical glitches and visual consistency.
Oasis: A Universe in a Transformer
Oasis is an innovative AI model for real-time, open-world gameplay, generating interactions based on user inputs at 20 frames per second, with future enhancements planned for clarity and control.
Oasis: A Universe in a Transformer
Decart will launch Oasis, a real-time AI model for interactive gaming, on October 31, 2024. It generates gameplay based on user inputs, simulating physics and graphics with advanced techniques.
Oasis: A Universe in a Transformer
Oasis is an innovative open-world AI model by Decart and Etched, generating real-time gameplay at 20 frames per second, with plans for scaling and performance optimization, including a demo release.
Discussion
- Concerns about the definition of "world" and the need for spatial consistency in simulations.
- Skepticism regarding the project's viability, with some labeling it as potential vaporware.
- Interest in the technological advancements that could make immersive experiences more feasible.
- Suggestions for using traditional game engines for stability and physics instead of generating everything from scratch.
- Excitement about the future possibilities of immersive technology and its applications in various fields.
Wouldn't a workable approach be to create a really low-resolution 3D world, in the traditional "3D game world" sense, to get the spatial consistency, and then feed this crude map with attributes into frame generation to create the resulting world (sketched below)? It wouldn't be infinite, but then no one really needs an infinite world either. A spherical world solves the border issue pretty handily. As I understood it, there was some element of that in the new FS2024 (discussed yesterday on HN).
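For illustration, here is a minimal sketch of the hybrid this commenter describes: a persistent low-resolution 3D world provides the spatial consistency, and a generative model (stubbed here) turns each crude rasterized view into a final frame. The names `ProxyWorld` and `neural_upscale` are hypothetical, and the conditioning scheme (a ControlNet-style guide image) is one plausible reading of the suggestion, not anything from the project.

```python
import numpy as np

class ProxyWorld:
    """Crude persistent 3D world: a low-res heightmap with tile attributes.

    This is the 'traditional game world' the commenter suggests; it is the
    single source of truth for geometry, so spatial consistency is free.
    """
    def __init__(self, size: int = 256, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.height = rng.random((size, size)).astype(np.float32)
        self.attrs = rng.integers(0, 4, (size, size))  # e.g. sand/grass/rock/water

    def render_crude(self, x: int, y: int, fov: int = 64) -> np.ndarray:
        """Rasterize the tiles around the camera into a low-res guide image."""
        h = self.height[y:y + fov, x:x + fov]
        a = self.attrs[y:y + fov, x:x + fov]
        return np.stack([h, a / 3.0, np.zeros_like(h)], axis=-1)

def neural_upscale(guide: np.ndarray) -> np.ndarray:
    """Stand-in for the generative model that turns the crude guide frame
    into a photoreal one (e.g. a guide-image-conditioned generator)."""
    return (np.kron(guide, np.ones((16, 16, 1))) * 255).astype(np.uint8)

world = ProxyWorld()
for x in range(0, 128, 8):  # camera pans; geometry stays coherent
    frame = neural_upscale(world.render_crude(x, 64))
```

The key property is that geometry lives in one authoritative data structure, so revisiting a location reproduces the same scene by construction rather than by the model's memory.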
I am guessing the main thing holding this back in terms of fidelity, consistency, and generalization is just compute. But the new techniques here have dramatically lowered the compute costs and increased generalization.
Maybe something like the giant Cerebras SRAM chips will deliver the next 10x in scale that smooths this out and pushes it closer to Star Trek. Or maybe some new paradigm like memristors.
But I'm looking forward to, within just a few years, being able to put on some fairly comfortable mixed-reality glasses and simply ask for whatever or whoever I want to appear in my home (for example), according to my whim.
Or train it on lots of how-to videos, such as cooking: it just materializes an example of someone showing you exactly what you need to do, right in your kitchen.
Here's another crazy idea: train on videos of interactions with productivity applications rather than games. In the future, for small businesses, we skip having the AI generate source code and just describe how the application works. The data and program state are stored in a giant context window, and the application's functionality changes the instant you make a request (a rough sketch of that loop follows).
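Sketching that idea under stated assumptions: the entire "application" is a natural-language spec plus serialized state, and every user event is answered by one model call that returns the next state and UI. `model_step` is a hypothetical stand-in, stubbed so the example runs; a real version would be a generative-model call with the spec and state in its context window.

```python
import json

def model_step(spec: str, state: dict, event: str) -> dict:
    """Stand-in for the generative model the commenter imagines: given the
    natural-language app description, the current state, and a user event,
    it returns the next state plus a rendered UI. Stubbed so the loop runs."""
    if event.startswith("add "):
        state.setdefault("invoices", []).append(event[4:])
    return {"state": state, "ui": f"{spec}: {json.dumps(state)}"}

SPEC = "a tiny invoicing app: 'add <name>' records an invoice"
state: dict = {}
for event in ["add acme", "add globex"]:
    out = model_step(SPEC, state, event)  # whole app = spec + state in context
    state = out["state"]
    print(out["ui"])
```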
Though we were using map tiles at the time, we were developing a model that took photos and a GPS track and added information to better match environmental conditions (cloud cover, better lighting, etc.).
People still ask me to open-source the code, but the code was acquired, so that isn't possible. But I do regularly say that if I were to rebuild Ayvri today, I'd build it as interactive video rather than loading tiles.
Clicking around the page, nothing works.
Could be total vaporware for all we know.
Is it an ad, a statement of achievement in case someone else states it first, or what?
Seems like it would work better as a YouTube video; the page really doesn't offer much of use right now.