April 2nd, 2025

How Google built its Gemini robotics models

Google DeepMind has launched Gemini Robotics models that help robots learn complex tasks. The models emphasize embodied reasoning, so that robots can interact effectively with their surroundings, and adaptability, with future applications across various industries in mind.

Google DeepMind has introduced a new family of Gemini Robotics models designed to enhance the capabilities of robots. These models enable robots to learn and perform complex tasks, such as preparing salads, playing games like Tic-Tac-Toe, and folding origami. The head of robotics, Carolina Parada, highlighted a significant moment when a bi-arm ALOHA robot successfully executed a "slam dunk" with a toy basketball, showing that the model could understand and perform an action it had never encountered before.

The Gemini Robotics models are multimodal, combining physical actions with outputs such as text and audio, which allows robots to adapt to new objects and environments without additional training. The Gemini Robotics-ER model focuses on embodied reasoning, enabling robots to recognize and interact with their surroundings effectively. Traditional methods train a robot for a single task; the Gemini models are instead trained on a wide range of tasks to promote generalization. This adaptability is crucial for future applications across various industries, in both complex environments and human-centric spaces. Google aims to develop robots that can assist with everyday tasks, moving closer to a future where robots are integral to daily life.

- Google DeepMind has launched Gemini Robotics models for advanced robotic capabilities.

- The models enable robots to learn and perform complex tasks, including actions they have not encountered before.

- Gemini Robotics-ER focuses on embodied reasoning for effective interaction with environments.

- The training approach emphasizes broad task learning over single-task training.

- Future applications include assisting in complex industries and human-centric environments.

9 comments
By @lima - 4 days
They can do that, yet somehow, Gemini Assistant on Pixel phones still fails to reliably set timers or add shopping list items :-)

(which worked fine with Google Assistant)

By @dachworker - 4 days
The "how" is completely missing, but if they can get this to work semi reliably it will be ChatGPT x100 in terms of impact.
By @harmmonica - 4 days
Even if Google's robotics technology (software and hardware) is leading edge does anyone think they'll actually be able to productize it? Seems similar to how they were the pre-product leaders in transformers and then fumbled any advantage they had to ChatGPT. It seems like something's missing from Google where they can't get from research to product effectively. Waymo perhaps a good counterexample if you think where they are today is product/market fit, but I can't shake the feeling that Google more often than not can't seem to get things to market or even if they do they give up on them before they take hold.

Just wondering if anyone has a strong feeling or, better yet, insight on this regarding their robotics efforts.

By @abidhusain - 3 days
The advancements in AI and robotics are incredibly exciting! With complex systems like Gemini, companies will need to rely on specialized teams to bring these innovations to life.

Outsourcing specific roles such as AI research or robotics engineers can help companies bring top-tier talent into the fold without the burden of full-time recruitment. It's fascinating to see how outsourcing can complement R&D in cutting-edge industries like robotics.

Curious to see how this shifts the industry, especially in terms of scalability and speed to market

By @otherayden - 3 days
It's terrifying to think that robots like this will probably be used in the defense industry at some point. If the robot understands something as general as "put the erasers away", imagine "kill all enemies".
By @hansmayer - 3 days
"Pick up the basketball and slam-dunk it". The killer use-case we've been waiting on for so long :)
By @barbazoo - 3 days
> Sounds like someone will get some help with those chores — eventually.

Aaaaw that's nice. Except it's all military under the hood but nice that they try to make us think they'll fold our laundry instead.

By @cozyman - 4 days
just curious, what would it do if you asked it to kill someone? does it follow the laws of robotics?
By @free652 - 3 days
April 1st!