September 17th, 2024

WonderWorld: Interactive 3D Scene Generation from a Single Image

WonderWorld is a novel framework from Stanford and MIT that generates interactive 3D scenes from a single image in under 10 seconds, allowing user-defined content and real-time navigation.


WonderWorld is a novel framework developed by researchers from Stanford University and MIT for interactive 3D scene generation using a single image as input. The system allows users to specify scene contents and layouts through text and navigate the generated scenes in real-time. Utilizing a technique called Fast LAyered Gaussian Surfels (FLAGS), WonderWorld can create connected and diverse 3D scenes in under 10 seconds on a single A6000 GPU. This approach overcomes limitations of existing methods that typically require multiple views and extensive optimization processes. The FLAGS representation enables faster scene generation by using a geometry-based initialization, which streamlines the optimization process. Additionally, the system incorporates guided depth diffusion to ensure coherent geometry across generated scenes. Users can interact with the virtual environment using keyboard controls or touch screen gestures, enhancing the experience of content creation and exploration. WonderWorld demonstrates significant potential for user-driven applications in virtual environments, making it a promising tool for various creative and educational purposes.
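The "geometry-based initialization" mentioned above can be pictured as unprojecting a predicted depth map into oriented surfels so optimization starts from plausible geometry rather than random points. The sketch below is illustrative only, assuming a pinhole camera and normals from finite differences; function and parameter names are hypothetical, not from the WonderWorld codebase.

```python
# Hypothetical sketch: initialize surfel centers and orientations from a
# monocular depth map. NOT the actual FLAGS implementation.
import numpy as np

def init_surfels(depth, fx, fy, cx, cy):
    """Unproject a depth map (H, W) into per-pixel 3D centers and normals."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project each pixel into camera space using pinhole intrinsics.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    pts = np.stack([x, y, depth], axis=-1)            # (H, W, 3) point grid
    # Estimate surface normals from finite differences of the point grid;
    # each surfel (flattened Gaussian) is oriented along its normal.
    du = np.gradient(pts, axis=1)
    dv = np.gradient(pts, axis=0)
    n = np.cross(du, dv)
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + 1e-8
    return pts.reshape(-1, 3), n.reshape(-1, 3)

# A flat constant-depth map should yield surfels all facing the camera axis.
centers, normals = init_surfels(np.ones((4, 4)), fx=1.0, fy=1.0, cx=2.0, cy=2.0)
```

Starting from geometry like this, rather than a random point cloud, is what lets the per-scene optimization converge in seconds instead of minutes.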

- WonderWorld generates interactive 3D scenes from a single image in under 10 seconds.

- The system allows users to specify scene contents and layouts via text.

- It employs Fast LAyered Gaussian Surfels (FLAGS) for efficient scene representation.

- Users can navigate and explore generated scenes in real-time.

- The framework supports various camera movement styles for scene generation.

AI: What people are saying
The comments on the WonderWorld framework reflect excitement and curiosity about its potential applications and capabilities.
  • Many users express enthusiasm for the technology, calling it "amazing" and "incredible."
  • There are suggestions for creative uses, such as creating interactive experiences and virtual environments.
  • Some commenters inquire about technical aspects, like the possibility of voxel output.
  • Users envision combining the technology with existing data, like Google Street View, for expansive applications.
  • Overall, there is a strong desire for public access to the technology for personal experimentation.
12 comments
By @stephen_cagle - 7 months
If you click on the image of "Link" (I know it's not really him) in the "Interactive Viewing" section, you can see that in front of him (out of view) is a bunch of noise. I think it is interesting that it predicts randomness rather than predicting nothing being there.

This is awesome tech.

By @opdahl - 7 months
Super impressive, and I can see it being useful in many cases already. Especially making interactive experiences in combination with position tracker of a user in a room. As you move around the room your perspective changes.

Taking it further, I could imagine creating fake windows using flat-screen TVs with this approach. As you move around the room, the perspective would change too, giving the illusion that the windows are real. Of course this would only work for a single person at a time, but it would be quite interesting to experience. It should not be too difficult to hack together as a solo dev.
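The "fake window" trick this comment describes is off-axis projection: as the tracked viewer's head moves, the render frustum is recomputed so the flat screen behaves like a window into the scene. A minimal sketch, assuming a wall-aligned screen at z=0 and a head position from some external tracker (all names here are illustrative):

```python
# Rough sketch of head-tracked off-axis projection for a "fake window".
# Assumes a rectangular screen in the z=0 plane; eye has positive z.
import numpy as np

def window_frustum(eye, screen_lo, screen_hi, near=0.1):
    """Asymmetric frustum bounds (left, right, bottom, top at the near
    plane) for a screen spanning screen_lo..screen_hi, seen from `eye`."""
    d = eye[2]                      # eye-to-screen distance
    scale = near / d                # similar triangles: screen -> near plane
    left   = (screen_lo[0] - eye[0]) * scale
    right  = (screen_hi[0] - eye[0]) * scale
    bottom = (screen_lo[1] - eye[1]) * scale
    top    = (screen_hi[1] - eye[1]) * scale
    return left, right, bottom, top

# Centered eye 1 m from a 1.0 x 0.6 m screen -> symmetric frustum.
# Feed these bounds to glFrustum or an equivalent projection matrix.
l, r, b, t = window_frustum(np.array([0.0, 0.0, 1.0]),
                            np.array([-0.5, -0.3]), np.array([0.5, 0.3]))
```

As the eye moves off-center, the frustum becomes asymmetric in the opposite direction, which is exactly what sells the parallax illusion for that one viewer.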

By @anthk - 7 months
This is like 1997's Blade Runner game camera (and from the movie too):

https://youtu.be/DRx2Leb2yDE?t=1680

By @ghayes - 7 months
Does anyone know if there are variants of this that output voxels? It feels like a more concrete representation of the space versus Gaussian splats.
By @jayantbhawal - 7 months
This is AMAZING!

I hope this is released for public use at some point. I'd love to run it through some of my older photos to see what it does with them.

By @robertclaus - 7 months
It feels like "3D" is a stretch given the approach they're using. Obviously the result is pretty cool, but I suspect anything built using this tech is going to have a very distinct feel (almost like sprite based video games).
By @tetris11 - 7 months
This is incredible. You could build entire games this way.
By @owenpalmer - 7 months
Imagine Google street view data put to use in combination with this. You would essentially have an open world game of any city on earth.
By @android521 - 7 months
Can't wait for the code/API.
By @LarsDu88 - 7 months
Very cool!
By @fnordpiglet - 7 months
This is the future I was promised. Take my money please.
By @deathsentience - 7 months
How very supercalifragilisticexpialidocious!