Google Unveils RT-2: A Glimpse into the Next Generation of Robotic Intelligence

The latest advancement holds the potential to revolutionize the capabilities of robotic systems.

For years, the idea of robots seamlessly integrating into our daily lives has been a staple of science fiction. Google has now taken a significant step towards making this a reality with the introduction of Robotics Transformer 2 (RT-2), a pioneering vision-language-action (VLA) model. Trained on text and images from the internet, the model is designed to translate that knowledge into robotic actions, paving the way for more informed and helpful robots. In the tech giant’s words, “RT-2 can speak robot.”

RT-2 goes beyond recognizing objects and their properties; it aims at contextual understanding. For instance, while many robots can be trained to recognize an apple based on its properties, a robot running RT-2 can identify an apple in its environment, differentiate it from similar objects and know how to handle it.

Understanding RT-2’s capabilities

Recent advances have strengthened robots’ reasoning and problem-solving abilities. Techniques like chain-of-thought prompting allow robots to break down multistep tasks, while vision-language models like PaLM-E have helped robots better interpret their environment. Additionally, the success of RT-1 demonstrated that Transformers, a deep learning architecture known for its ability to generalize, could facilitate knowledge transfer between different types of robots.

Historically, robots ran on complex stacks of systems, with separate reasoning and action systems that had to pass information back and forth. RT-2 streamlines this, allowing a single model to handle intricate reasoning and directly produce robotic actions. Notably, RT-2 needs only a small amount of robot training data and can apply concepts from its internet-based training to guide robotic actions, even for unfamiliar tasks. For instance, while traditional systems needed explicit training to dispose of trash, RT-2 already has a concept of what trash is and can act accordingly.

Advancing robotic learning with knowledge transfer

RT-2 shows a promising ability to transfer knowledge into actions, suggesting that robots could adapt quickly to new and unfamiliar situations and surroundings. In more than 6,000 robotic trials, RT-2 performed as well as its predecessor, RT-1, on tasks from its training data (referred to as “seen” tasks).

What sets the new model apart is its improvement on novel, previously unseen scenarios: it achieved a success rate of 62 percent, nearly double RT-1’s 32 percent. This result points to RT-2’s potential to help robots learn and apply knowledge to new challenges.

Header image courtesy of Pexels

Press release link: https://www.blog.google/technology/ai/google-deepmind-rt2-robotics-vla-model/
