VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models
VFusion3D is a project developing scalable 3D generative models using video diffusion, to be presented at ECCV 2024. It offers pretrained models and a Gradio application for user interaction.
VFusion3D is a project aimed at developing scalable 3D generative models from video diffusion models, authored by Junlin Han, Filippos Kokkinos, and Philip Torr. It will be presented at the European Conference on Computer Vision (ECCV) in 2024. The project leverages a minimal amount of 3D data combined with extensive synthetic multi-view data to train a large, feed-forward 3D generative model, aiming to advance 3D generative and reconstruction models and contribute to a robust 3D foundation model. Users can clone the repository, install the necessary dependencies, and use the pretrained models for inference tasks such as rendering videos and exporting meshes. The project also includes a local Gradio application for interactive use. The inference code is derived from the OpenLRM project, and the licensing is primarily CC-BY-NC, with some components under different licenses. For further details, users can visit the project page and the GitHub repository.
- VFusion3D focuses on scalable 3D generative models from video diffusion models.
- The project will be presented at ECCV 2024.
- Users can clone the repository and set up a conda environment for installation.
- Pretrained models are available for various inference tasks.
- The project is licensed under CC-BY-NC with some components under different licenses.
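The setup workflow described above can be sketched as a short shell session. This is a hedged sketch, not the project's verbatim instructions: the repository URL, environment name, and inference script names below are assumptions inferred from the summary and typical OpenLRM-style repositories, so the actual commands in the VFusion3D README may differ.

```shell
# Clone the repository (URL assumed; check the official GitHub page)
git clone https://github.com/facebookresearch/vfusion3d.git
cd vfusion3d

# Create and activate an isolated conda environment (name is illustrative)
conda create -n vfusion3d python=3.10 -y
conda activate vfusion3d

# Install dependencies listed by the project
pip install -r requirements.txt

# Run inference with a pretrained model (script name and flags are
# hypothetical placeholders; consult the repository for the real entry point)
python run_inference.py --input example_image.png --export-mesh --render-video

# Launch the local Gradio demo (script name assumed)
python app.py
```

Because the inference code derives from OpenLRM, the actual entry points likely follow that project's module layout rather than the placeholder names used here.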
Related
Unique3D: Image-to-3D Generation from a Single Image
The GitHub repository hosts Unique3D, offering efficient 3D mesh generation from a single image. It includes author details, project specifics, setup guides for Linux and Windows, an interactive demo, ComfyUI, tips, acknowledgements, collaborations, and citations.
AuraFlow v0.1: an open-source alternative to Stable Diffusion 3
AuraFlow v0.1 is an open-source large rectified flow model for text-to-image generation. Developed to boost transparency and collaboration in AI, it optimizes training efficiency and achieves notable advancements.
3D visualization brings nuclear fusion to life
EPFL partners with EUROfusion to develop a cutting-edge 3D visualization system for nuclear fusion processes. The project enhances research and public understanding, led by Paolo Ricci at EPFL's Swiss Plasma Center.
Black Forest Labs – FLUX.1 open weights SOTA text to image model
Black Forest Labs has launched with the goal of developing generative deep learning models for media, securing $31 million in funding. Their FLUX.1 suite includes three model variants, outperforming competitors in image synthesis.
The open weight Flux text to image model is next level
Black Forest Labs has launched Flux, the largest open-source text-to-image model with 12 billion parameters, available in three versions. It features enhanced image quality and speed, alongside the release of AuraSR V2.