June 22nd, 2024

SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code

SceneCraft is a Large Language Model (LLM) agent that converts text descriptions into 3D scenes in Blender. It handles spatial planning, asset arrangement, and scene refinement, and is reported to outperform other LLM agents in constraint adherence and human evaluation.


SceneCraft is a Large Language Model (LLM) agent that converts text descriptions into Blender-executable Python scripts, which render complex 3D scenes with up to a hundred assets. Handling that many assets requires spatial planning and arrangement, which SceneCraft approaches through a combination of abstraction, strategic planning, and library learning. It first builds a scene graph that defines the spatial relationships among assets, then generates Python scripts from this graph, and finally refines the scene using vision-language models such as GPT-V. A library learning mechanism collects reusable script functions so the agent keeps improving without expensive LLM parameter tuning. In the reported evaluation, SceneCraft outperforms other LLM-based agents both in adhering to constraints and in human assessments. The system is also demonstrated by reconstructing detailed 3D scenes from the Sintel movie and by using those scenes as control signals to guide a video generative model.
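To make the scene-graph-to-script step concrete, here is a minimal Python sketch. It is not SceneCraft's implementation, which is not available here. The `SceneGraph` class, the relation names (`right_of`, `behind`, `on_top_of`), the fixed axis offsets, and the use of placeholder cubes in place of real assets are all assumptions made for illustration. The only Blender calls used are `bpy.ops.mesh.primitive_cube_add` and `bpy.context.object`, and they appear only inside the generated script text, so the sketch itself runs without Blender installed.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Hypothetical scene graph: asset names plus spatial relation edges."""
    assets: list = field(default_factory=list)
    # Each edge is (subject, relation, anchor, distance).
    relations: list = field(default_factory=list)


# Assumed relation vocabulary, expressed as unit offsets in Blender's
# Z-up coordinate frame. SceneCraft's real relations may differ.
OFFSETS = {
    "right_of": (1.0, 0.0, 0.0),
    "behind": (0.0, 1.0, 0.0),
    "on_top_of": (0.0, 0.0, 1.0),
}


def solve_positions(graph):
    """Place the first asset at the origin, then resolve each edge
    once its anchor already has a position."""
    positions = {graph.assets[0]: (0.0, 0.0, 0.0)}
    pending = list(graph.relations)
    while pending:
        progressed = False
        for edge in list(pending):
            subject, relation, anchor, distance = edge
            if anchor in positions:
                ax, ay, az = positions[anchor]
                dx, dy, dz = OFFSETS[relation]
                positions[subject] = (
                    ax + dx * distance,
                    ay + dy * distance,
                    az + dz * distance,
                )
                pending.remove(edge)
                progressed = True
        if not progressed:
            # No remaining edge has a placed anchor.
            raise ValueError(f"cannot resolve relations: {pending}")
    return positions


def to_blender_script(positions):
    """Emit Blender Python source that adds one placeholder cube per asset."""
    lines = ["import bpy", ""]
    for name, (x, y, z) in positions.items():
        lines.append(
            f"bpy.ops.mesh.primitive_cube_add(location=({x}, {y}, {z}))"
        )
        # The newly added cube is the active object, so it can be renamed.
        lines.append(f"bpy.context.object.name = {name!r}")
    return "\n".join(lines)


graph = SceneGraph(
    assets=["table", "lamp", "chair"],
    relations=[
        ("lamp", "on_top_of", "table", 0.75),
        ("chair", "right_of", "table", 1.5),
    ],
)
script = to_blender_script(solve_positions(graph))
print(script)
```

The printed text could be pasted into Blender's scripting editor. A real agent would replace the fixed offsets with constraints checked against the rendered scene, which is where the vision-language refinement described above would come in.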

Related

AI-powered conversion from Enzyme to React Testing Library

Slack engineers transitioned from Enzyme to React Testing Library due to React 18 compatibility issues. They used AST transformations and LLMs for automated conversion, achieving an 80% success rate.

We no longer use LangChain for building our AI agents

Octomind switched from LangChain due to its inflexibility and excessive abstractions, opting for modular building blocks instead. This change simplified their codebase, increased productivity, and emphasized the importance of well-designed abstractions in AI development.

GitHub – Karpathy/LLM101n: LLM101n: Let's Build a Storyteller

The GitHub repository "LLM101n: Let's build a Storyteller" offers a course on creating a Storyteller AI Large Language Model using Python, C, and CUDA. It caters to beginners, covering language modeling, deployment, programming, data types, deep learning, and neural nets. Additional chapters and appendices are available for further exploration.

Francois Chollet – LLMs won't lead to AGI – $1M Prize to find solution [video]

The video discusses limitations of large language models in AI, emphasizing genuine understanding and problem-solving skills. A prize incentivizes AI systems showcasing these abilities. Adaptability and knowledge acquisition are highlighted as crucial for true intelligence.

Homegrown Rendering with Rust

Embark Studios develops a creative platform for user-generated content, emphasizing gameplay over graphics. They leverage Rust for 3D rendering, introducing the experimental "kajiya" renderer for learning purposes. The team aims to simplify rendering for user-generated content, utilizing Vulkan API and Rust's versatility for GPU programming. They seek to enhance Rust's ecosystem for GPU programming.

1 comment
By @HanClinto - 4 months
Looks cool, but no demo or code that I can find available.