Ted Xiao Profile
Ted Xiao

@xiao_ted

Followers
13K
Following
6K
Media
213
Statuses
1K

Robotics and Gemini @GoogleDeepMind. Posts about large models, robot learning, and scaling. Opinions my own.

San Francisco
Joined October 2013
@xiao_ted
Ted Xiao
2 months
Robotics + AI has completely transformed: a perfect storm of AI breakthroughs, hardware innovation, and capital inflow. But are general robot foundation models truly just around the corner? At recent talks, I took an honest look at hype vs. reality to share what’s missing 🧵👇
5
29
176
@xiao_ted
Ted Xiao
1 year
There will be 3-4 massive news items coming out in the next few weeks that will rock the robotics + AI space. Adjust your timelines, it will be a crazy 2024 📈.
91
232
2K
@xiao_ted
Ted Xiao
1 year
Both MSFT and OpenAI have won massively from capital injections, but Microsoft Research (MSR) has been majorly screwed over. World-class researchers trained and hired for innovative fundamental research — now relegated to “ChatGPT for X” papers because leadership literally does.
26
47
688
@xiao_ted
Ted Xiao
2 years
🚨New RL impact just dropped🚨 1) My friend is a high-level Rocket League player and just alerted me that an open-sourced agent trained with reinforcement learning + self-play has been steamrolling on public servers! It's in the top 0.5% ELO bracket.
11
92
611
@xiao_ted
Ted Xiao
1 year
🚨Big things are happening in humanoid robotics!🚨. As we saw with drones and quadruped robots, it can take a mere decade for bleeding edge R&D areas to become "solved" platforms for commercial consumer use cases. Once open-ended research questions around reliability, dexterity,
Tweet media one
Tweet media two
Tweet media three
12
87
435
@xiao_ted
Ted Xiao
30 days
New phenomenon appearing: the latest generation of foundation models often switch to Chinese in the middle of hard CoT thinking traces. Why? AGI labs like OpenAI and Anthropic utilize 3P data labeling services for PhD-level reasoning data for science, math, and coding; for.
@RishabJainK
Rishab Jain
1 month
Why did o1 pro randomly start thinking in Chinese? No part of the conversation (5+ messages) was in Chinese. very interesting. training data influence
Tweet media one
25
45
439
@xiao_ted
Ted Xiao
2 years
@aidangomezzz you, a heathen, cheering mindlessly when loss go down. me, an X-risk chad, carefully measuring each gradient by hand to make sure it's not over the proscribed limit to prevent FOOM. we are not the same.
3
14
376
@xiao_ted
Ted Xiao
1 year
I can’t emphasize enough how mind-blowing extremely long token context windows are. For both AI researchers and practitioners, massive context windows will have transformative long-term impact, beyond one or two flashy news cycles. ↔️. “More is different”: Just as we saw emergent
Tweet media one
6
58
298
@xiao_ted
Ted Xiao
2 years
The golden days of internet-scale models achieving unprecedented zero-shot results seem to be waning. The new Big Thing is subsequent fine-tuning with humans increasingly out of the loop. How does this work? Let’s explore *Prior Amplification* 🔎. (1/N)
Tweet media one
4
41
301
@xiao_ted
Ted Xiao
2 years
The optimism in robotics research is absolutely incredible these days! I believe all the pieces we need for a “modern attempt at embodied intelligence” are ready. At recent talks, I pitched a potential recipe, and I’d like to share it with you. Let’s break down the key points 🔑
Tweet media one
10
49
261
@xiao_ted
Ted Xiao
1 year
“Sora doesn’t show any new technical innovations” - this take misses the point. 🎯. Sora (and other great polished OAI releases) are not *meant* to provide new knowledge to the research community. They are not papers, careful scientific experiments, or theory that adds to the sum.
6
34
252
@xiao_ted
Ted Xiao
1 year
Instead of just telling robots “what to do”, can we also guide robots by telling them “how to do” tasks?. Unveiling RT-Trajectory, our new work which introduces trajectory conditioned robot policies. These coarse trajectory sketches help robots generalize to novel tasks! 🧵⬇️
3
46
251
@xiao_ted
Ted Xiao
3 years
Our project SayCan wins the Best Paper Award at the RSS Workshop on Scaling Robot Learning! Thanks to all who came to the poster session with great questions and feedback 😁. The future of "LLMs for Robotics and Robotics for LLMs" is brighter than ever!
Tweet media one
4
25
219
@xiao_ted
Ted Xiao
1 year
(Not AGI 😅).
20
6
199
@xiao_ted
Ted Xiao
1 month
I’ve gradually come around to two paths to embodied AGI that I was very skeptical of before:
1️⃣ solving robotics via reasoning
2️⃣ solving robotics via world modeling
I was previously doubtful not of these approaches themselves, but of timelines and efficiency; the “optimal”
8
25
210
@xiao_ted
Ted Xiao
9 months
Open X-Embodiment wins the Best Paper Award at #ICRA2024 🎉🤖! An unprecedented 170+ author list (most didn’t fit on the slide) may be a record for ICRA! So amazing to see what a collaborative community effort can accomplish in pushing robotics + AI forward 🚀
Tweet media one
5
31
187
@xiao_ted
Ted Xiao
1 year
Announcing Open X-Embodiment! One of the most ambitious and large-scale robot learning collaborations to date. Intelligent robots have seen tremendous progress the past years by leveraging big real-world datasets, high capacity network architectures, and
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
40
183
@xiao_ted
Ted Xiao
2 years
Robot learning systems have scaled to many complex tasks, but generalization to novel objects is still challenging. Can we leverage pre-trained VLMs to expand the open-world capabilities of robotic manipulation policies? Introducing MOO: Manipulation of Open-World Objects!🐮🧵👇
Tweet media one
3
37
178
@xiao_ted
Ted Xiao
2 years
My team is looking for outstanding PhD students (graduating after 2023) who are interested in a student researcher internship exploring how simulation can improve real robot manipulation capabilities!
5
33
172
@xiao_ted
Ted Xiao
11 months
Our team is hiring strong research engineers to work on scaling robotics + AI!. Apply here:
7
40
164
@xiao_ted
Ted Xiao
6 months
🚨In case you missed it: it’s been non-stop technical progress updates from Chinese humanoid companies the past 48 hours! 🚨 In just the last 2 days, batched updates from numerous exciting Chinese humanoid companies all dropped at once:
1) @UnitreeRobotics showcases technical
Tweet media one
Tweet media two
Tweet media three
5
32
161
@xiao_ted
Ted Xiao
2 years
I'll be @Stanford tomorrow for a guest lecture for CS25 (organized by @DivGarg9 ). If you're interested in how internet-scale models can dramatically accelerate scaling robot learning in the real world, come say hi!
2
13
157
@xiao_ted
Ted Xiao
2 years
SayCan wins the Special Innovation Award at #CoRL2022!!
✅ LLMs for robotics
✅ real world tasks at scale
✅ 13 robots, MANY collaborators, 17 months
✅ 3 live demos from across the world
Tweet media one
3
12
156
@xiao_ted
Ted Xiao
6 months
The field of Robotics + AI is at a very interesting moment in time right now because we’re seeing the evolution of how optimism from the last 10 years of research is carrying over to new frontiers: towards commercialization, towards more complex morphologies and applications,.
2
18
111
@xiao_ted
Ted Xiao
2 years
What if you used LLMs for… all parts of RL?
GPT-4 as the Actor and Critic
GPT-4 as the Bellman Update
GPT-4 as the reward function
Vector DB as the replay buffer
GPT-4 as the importance sampling logic
@DrJimFan
Jim Fan
2 years
What if we set GPT-4 free in Minecraft? ⛏️. I’m excited to announce Voyager, the first lifelong learning agent that plays Minecraft purely in-context. Voyager continuously improves itself by writing, refining, committing, and retrieving *code* from a skill library. GPT-4 unlocks
9
17
147
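A toy sketch of the thought experiment above, with every classical RL component swapped for an LLM call or a vector store. Everything here is hypothetical illustration: `llm` is a stub standing in for a GPT-4 API call, and `VectorDBReplayBuffer`, `actor`, and `critic` are invented names, not any real system's API.

```python
import random

def llm(prompt: str) -> str:
    """Stand-in for a GPT-4 call; a real system would hit a model API here."""
    # Toy heuristic so the sketch runs end-to-end.
    return "action:right" if "goal" in prompt else "action:left"

class VectorDBReplayBuffer:
    """Replay buffer backed by a (here: in-memory) vector store."""
    def __init__(self):
        self.transitions = []

    def add(self, transition):
        self.transitions.append(transition)

    def sample(self, k=1):
        # An LLM-as-importance-sampler would rank these; we sample uniformly.
        return random.sample(self.transitions, min(k, len(self.transitions)))

def actor(state: str) -> str:
    """LLM as the policy."""
    return llm(f"You are the policy. State: {state}. Choose an action.")

def critic(state: str, action: str) -> float:
    """LLM as the reward function: parse a scalar judgment out of text."""
    verdict = llm(f"Rate taking {action} in {state} toward the goal.")
    return 1.0 if "right" in verdict else 0.0
```

The Bellman update and importance sampling would be further prompts over sampled transitions; the sketch only shows the shape of the loop, not a working agent.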
@xiao_ted
Ted Xiao
4 years
Differentiable physics engines show *huge* wall time speed ups vs. standard MuJoCo 🤯
Tweet media one
@hardmaru
hardmaru
4 years
Speeding Up Reinforcement Learning with a New Physics Simulation Engine. Blog post by @bucketofkets on Brax, a new physics simulation engine that matches the performance of a large compute cluster with just a single TPU or GPU.
2
22
137
@xiao_ted
Ted Xiao
1 year
New @GoogleDeepMind blog post covering 3 recent works on AI + robotics!
1. AutoRT scales data collection with LLMs and VLMs
2. SARA-RT significantly speeds up inference for Robot Transformers
3. RT-Trajectory introduces motion-centric goals for robot generalization
@GoogleDeepMind
Google DeepMind
1 year
How could robotics soon help us in our daily lives? 🤖. Today, we’re announcing a suite of research advances that enable robots to make decisions faster as well as better understand and navigate their environments. Here's a snapshot of the work. 🧵
6
22
134
@xiao_ted
Ted Xiao
2 years
Some recent news: 5 projects to appear at #RSS2023 and 1 at #ICML2023! 🥳🤖
1) RT-1:
2) DIAL:
3) ROSIE:
4) RLS:
5) JSRL:
6) LLM + Robotics Demos, TBA!
2
14
134
@xiao_ted
Ted Xiao
7 months
Action-packed day of robotics + AI at #RSS2024 tomorrow on Monday, July 15! Hope to see you:
- I'll be giving a talk at the GenAI-HRI workshop at 11AM
- I'll be at the Data Gen workshop from 2PM - 4PM
Tweet media one
1
20
133
@xiao_ted
Ted Xiao
2 years
Robot-language datasets have enabled tremendous progress in robotics🤖. However, semantic concepts may not be fully captured by existing language labels, which are often expensive to collect. In our new paper, we study how we can get more mileage out of existing datasets! 🧵👇
1
27
129
@xiao_ted
Ted Xiao
7 months
Announcing one of the most unique and ambitious robot competitions ever: 🏁The Earth Rover Challenge at #IROS2024!
✅Globally distributed in-the-wild evaluation
✅Real world navigation task settings
✅Large training dataset provided
Details below 👇
3
25
132
@xiao_ted
Ted Xiao
2 years
Excited to share 3 papers recently accepted at @corl_conf 🤖: 1) SayCan: Robots ground language model plans with what’s achievable and contextually appropriate. 1.5 year effort with tons of collaborators! Paper: Paper sites:
8
8
127
@xiao_ted
Ted Xiao
3 years
no good deed unpunished 😂
Tweet media one
@xiao_ted
Ted Xiao
3 years
Paper reviewing is not only an honorable civic duty but is also very rewarding (very +EV). It’s the Barry’s Bootcamp of AI research.
3
0
120
@xiao_ted
Ted Xiao
11 months
The writing is clearly on the wall now: a path to embodied intelligence is clearer than ever! Congrats to OpenAI and Figure on the exciting partnership, and welcome to the intelligent robotics game 📈.
@Figure_robot
Figure
1 year
Last month we demonstrated Figure 01 making coffee only using neural networks. This is a fully learned, end-to-end visuomotor policy mapping onboard images to low level actions at 200hz. Next up: excited to push the boundaries on AI learning with OpenAI.
4
11
122
@xiao_ted
Ted Xiao
1 year
6 years of paradigm shifts in robot learning! What's next? 📈
Tweet media one
4
18
122
@xiao_ted
Ted Xiao
1 year
I am convinced that the amount of robot data collected for learning will be larger in 2024-2025 than all previous years combined. Amazing progress in creative teleop (gloves, puppeteering), in cross-embodiment learning, and in large scale industry data engines (humanoids) 🚀.
@chichengcc
Cheng Chi
1 year
Can we collect robot data without any robots?. Introducing Universal Manipulation Interface (UMI). An open-source $400 system from @Stanford designed to democratize robot data collection. 0 teleop -> autonomously wash dishes (precise), toss (dynamic), and fold clothes (bimanual)
2
21
116
@xiao_ted
Ted Xiao
1 year
We’re living through the golden age of AI Research:
- Highly leveraged impact thanks to successful methods scaling and transferring easily to other domains (i.e. generative modeling ideas landing in language, robotics, representation learning)
- Relatively low ramp-up cost due.
@_aidan_clark_
Aidan Clark
1 year
Seeing people tweet about doing PhD apps and I’ll just say I got into 0/8 of the programs I applied to and things turned out great. There are lots of opportunities for research, don’t stress :).
4
8
118
@xiao_ted
Ted Xiao
10 months
Extremely thought-provoking work that essentially says the quiet part out loud: general foundation models for robotic reasoning may already exist *today*. LLMs aren’t just about language-specific capabilities, but rather about vast and general world understanding.
@Ed__Johns
Edward Johns
10 months
Very excited to announce: Keypoint Action Tokens!. We found that LLMs can be repurposed as "imitation learning engines" for robots, by representing both observations & actions as 3D keypoints, and feeding into an LLM for in-context learning. See: More 👇
1
17
117
@xiao_ted
Ted Xiao
8 months
> year is 2031
> be me, token farmer in the slop mines
> bring in 20k novel tokens for weighing at the Token Utility station
> my token haul is two stddevs over the compressibility threshold, phew
> get my daily blueprint nutrition allocation
> trickle down tokenomics utopia
@khoomeik
Rohan Pandey
9 months
📢 Excited to finally be releasing my NeurIPS 2024 submission! Is Chinchilla universal? No! We find that:
1. language model scaling laws depend on data complexity
2. gzip effectively predicts scaling properties from training data
As compressibility 📉, data preference 📈. 🧵⬇️
Tweet media one
8
8
116
@xiao_ted
Ted Xiao
2 years
How do we connect the huge diverse world of bits (digital domain) with the physical world of atoms (robotics domain)?. Our team has begun exploring how to bridge this gap with a variety of different approaches! The projects in blue were announced in just the past 2 weeks:
Tweet media one
3
12
112
@xiao_ted
Ted Xiao
2 years
Real robot data is expensive to collect; can diffusion models perform meaningful and diverse visual data augmentation?. Presenting "Scaling Robot Learning with Semantically Imagined Experience" at #RSS2023 next week!. Website: Session: Tues 7/12, 3-5PM
2
14
108
@xiao_ted
Ted Xiao
5 months
Promising progress from 1X on learned world models which improve with more experience and physical interaction data. What I'm excited about:
- World models are likely the only path forward for reproducible and scalable evaluations in *multi-agent settings*; see success of world.
@1x_tech
1X
5 months
hello, world model.
3
10
104
@xiao_ted
Ted Xiao
2 years
While LLMs have been amazing for high-level reasoning, low-level policy learning is still the bottleneck in robotics. Over 17 months (!!), our team has scaled a massively multitask Transformer-based model to over 700 tasks using 130k real-world episodes. Check out the 🧵 👇.
@hausman_k
Karol Hausman
2 years
Introducing RT-1, a robotic model that can execute over 700 instructions in the real world at 97% success rate!
Generalizes to new tasks ✅
Robust to new environments and objects ✅
Fast inference for real time control ✅
Can absorb multi-robot data ✅
Powers SayCan ✅
🧵👇
3
5
104
@xiao_ted
Ted Xiao
2 years
Reinforcement learning (!)
at scale in the real world (!!)
for useful robotics tasks (!!!)
in multiple "in-the-wild" offices (!!!!)
A culmination of years of effort -- so excited to finally be able to share this publicly! 🤖
@hausman_k
Karol Hausman
2 years
Very excited to announce our largest deep RL deployment to date: robots sorting trash end-to-end in real offices! (aka RLS). This project took a long time (started before SayCan/RT-1/other newer works) but the learnings from it have been really valuable.🧵
3
15
94
@xiao_ted
Ted Xiao
1 year
Happy to share that RT-Trajectory has been accepted as a Spotlight (Top 5%) at #ICLR2024! This was my first last-author project, it was a ton of fun collaborating with a strong team led by @Jiayuan_Gu 🥳. Blogpost: Website: 🧵⬇️
@xiao_ted
Ted Xiao
1 year
Instead of just telling robots “what to do”, can we also guide robots by telling them “how to do” tasks?. Unveiling RT-Trajectory, our new work which introduces trajectory conditioned robot policies. These coarse trajectory sketches help robots generalize to novel tasks! 🧵⬇️
5
14
96
@xiao_ted
Ted Xiao
3 years
my team is still using TF2, and in my humble opinion we are still on the bleeding edge of robot learning 🤷🏻‍♂️.
@ylecun
Yann LeCun
3 years
The great competition between Deep Learning frameworks enters a new phase. Now that Google's TensorFlow has lost to Meta's PyTorch, Google is internally switching to JAX.
5
6
90
@xiao_ted
Ted Xiao
1 year
Made a new friend today, Unitree H1!. In 2019, Unitree visited to showcase an early version of the Go1, which has become a stable and affordable market leader. It’ll be fun to look back in 2027 on today and see how much progress has been made in humanoid robotics!
Tweet media one
1
2
90
@xiao_ted
Ted Xiao
1 year
Foundation models have shown impressive results on robots via language or code -- but these may not be good fits for grounded robotics problems. Our method PIVOT poses robot control as a VQA problem and introduces *iterative* visual optimization over robot actions! 🧵
1
19
91
@xiao_ted
Ted Xiao
1 year
AutoRT is now out on arXiv! Check out how we set up fleet-scale data collection by leveraging Foundation Models for robot orchestration 📈. Website: Paper: Original Thread:
@_akhaliq
AK
1 year
Google presents AutoRT. Embodied Foundation Models for Large Scale Orchestration of Robotic Agents. paper page: demonstrate AutoRT proposing instructions to over 20 robots across multiple buildings and collecting 77k real robot episodes via both
2
19
85
@xiao_ted
Ted Xiao
1 month
Extremely saddened to hear about Felix’s passing. His thoughtful research contributions played a major part of my own development as a scientist. Most recently, his poignant essay on AI and stress humanized the pressure cooker that is today’s AI environment. It resonated deeply.
@FelixHill84
Felix Hill
4 months
Do you work in AI?
Do you find things uniquely stressful right now, like never before?
Have you ever suffered from a mental illness?
Read my personal experience of those challenges here:
3
6
89
@xiao_ted
Ted Xiao
10 months
There is increased decoupling between academia and real-world impacts, especially in robotics. Reviewing an impressive demo paper (end-to-end system deployed in production warehouses on 100ks of wild objects) and it's getting slaughtered by other reviewers. Many such cases.🤡.
5
3
86
@xiao_ted
Ted Xiao
5 months
Action-conditioned world models are one step closer! Neural simulations hold a lot of promise for scaling up real-world interaction data, especially for domains which physics-based simulators struggle at. 🚀.
@_akhaliq
AK
5 months
Google presents Diffusion Models Are Real-Time Game Engines. discuss: We present GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality.
0
14
84
@xiao_ted
Ted Xiao
5 months
“yeah that’s definitely just a dude in a suit” - the highest compliment a humanoid company can get. Congrats to @ericjang11 @BerntBornich and the team at 1X for a new SOTA humanoid!.
@1x_tech
1X
5 months
Introducing NEO Beta. Designed for humans. Built for the home.
3
2
81
@xiao_ted
Ted Xiao
5 months
Announcing the 3rd Workshop on Language and Robot Learning at #CoRL2024 in Munich! This year's theme is "Language as an Interface". Featuring a great speaker lineup and two panels 🤖. Website: CfP: Deadline: October 2
Tweet media one
2
11
83
@xiao_ted
Ted Xiao
2 years
Introducing RT-2, representing the culmination of two trends:
- Tokenize and train everything together: web-scale text, images, and robot data
- VLMs are not just representations, big VLMs *are policies*
Sounds subtle, but we’ll look back on this as an inflection point! 📈
@GoogleDeepMind
Google DeepMind
2 years
Today, we announced 𝗥𝗧-𝟮: a first of its kind vision-language-action model to control robots. 🤖. It learns from both web and robotics data and translates this knowledge into generalised instructions. Find out more:
1
17
82
@xiao_ted
Ted Xiao
1 year
Excited to share that 4 of our recent works will appear at #ICRA 2024 in Yokohama! (1) RT-X: Open X-Embodiment and (2) GenGap expand and study large generalist datasets. (3) PromptBook and (4) PG-VLM improve the robotic reasoning capabilities of frontier LLMs and VLMs. 🧵👇
5
13
82
@xiao_ted
Ted Xiao
2 years
Looking forward to showcasing one of the first foundation models for robotics at #RSS2023 next week! Presenting "RT-1: Robotics Transformer for Real-world Control at Scale" from the Google DeepMind robotics team. Website: Session: Tuesday 7/12, 3PM-5PM
1
20
79
@xiao_ted
Ted Xiao
1 year
Update: Our recent work "Manipulation of Open-World Objects" (MOO) has been accepted to #CoRL2023! Extremely simple object-centric representations (literally just a single pixel) can significantly boost robot generalization! Check out the original thread + new updates below⬇️
@xiao_ted
Ted Xiao
2 years
Robot learning systems have scaled to many complex tasks, but generalization to novel objects is still challenging. Can we leverage pre-trained VLMs to expand the open-world capabilities of robotic manipulation policies? Introducing MOO: Manipulation of Open-World Objects!🐮🧵👇
Tweet media one
2
20
78
@xiao_ted
Ted Xiao
11 months
The secret ingredient behind robot foundation models has been compositional generalization, which enables positive transfer across different axes of generalization. I'm excited to share our new project where we use compositional generalization to inform data collection! 🧵⬇️
@jensen_gao
Jensen Gao
11 months
How should we efficiently collect robot data for generalization? We propose data collection procedures guided by the abilities of policies to compose environmental factors in their data. Policies trained with data from our procedures can transfer to entirely new settings. (1/8)
1
10
75
@xiao_ted
Ted Xiao
2 years
@DrJimFan Agree 100%. Feels extremely lucky to be a witness and participant in this era of history.
@xiao_ted
Ted Xiao
2 years
Our grandchildren will look back at our timeline as a step function - a binary before and after of amazing technologies. But I’m so grateful to be part of this most magical time period where we are living through and defining history. Cheers to 2022 and an even better 2023! 🎊
Tweet media one
3
4
75
@xiao_ted
Ted Xiao
9 months
And that’s a wrap for a fantastic week of robots, AI, and sushi at #ICRA2024! Great to catch up with old friends and meet new ones. And as always, a fun time spreading the gospel of big data and generalist models 🚀
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
4
77
@xiao_ted
Ted Xiao
1 year
Announcing the 2nd Workshop on Language and Robot Learning at #CoRL2023 on November 6th, 2023! This year's theme is "Language as Grounding". Featuring a great speaker lineup and two panels! Website: CfP: Deadline: October 1
Tweet media one
2
12
74
@xiao_ted
Ted Xiao
9 months
Great debate today at #ICRA2024 on “Generative AI will make a lot of traditional robotics approaches obsolete”! But I suspect 57% of the room will be very shocked/unhappy over the next 5 years 🙃
Tweet media one
9
14
74
@xiao_ted
Ted Xiao
2 years
Congrats to Ding Liren for becoming the World Chess Champion! 👑♟️. What an exhilarating series and an amazing Game 4 of the Rapid Tiebreakers, where Ding’s mental game shined through. Well deserved victory!.
@chesscom
Chess.com
2 years
Ding Liren wins the 2023 FIDE World Championship 🏆. Congratulations to Ding on becoming the new FIDE World Champion, and cementing his place in chess history after a thrilling match! 👏
Tweet media one
1
8
70
@xiao_ted
Ted Xiao
1 year
It's clear now that the "one robot to many robots" revolution is going to be momentous for scaling smart robots. Happy to fill in the next milestone on the "paradigm shift" timeline! Open X-Embodiment + RT-X: Website: Arxiv:
Tweet media one
1
10
72
@xiao_ted
Ted Xiao
2 years
Robotics is the answer to the not-so-secret fact that many generative modeling domains have saturated existing public datasets (HQ art, code, even some types of text). If you need to keep growing # tokens for Chinchilla-optimal scaling, then interaction (robotics!) gets attractive.
6
5
70
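Back-of-envelope version of this argument, using the widely cited ~20 training tokens per parameter heuristic from the Chinchilla paper (a rough approximation, not an exact law; the function name is illustrative):

```python
def chinchilla_optimal_tokens(params: float) -> float:
    """Approximate compute-optimal training token count (~20 tokens/param)."""
    return 20.0 * params

# A 70B-parameter model would want on the order of 1.4 trillion training
# tokens -- beyond what many curated public corpora supply, which is the
# saturation pressure the tweet points at.
tokens_needed = chinchilla_optimal_tokens(70e9)  # 1.4e12
```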
@xiao_ted
Ted Xiao
9 months
Robotics progress is unbelievably fast these days🚀. Excited to share a few items on my agenda this week at a jam-packed #ICRA2024, covering numerous works exploring the intersection of foundation models and robotics. 🧵👇
Tweet media one
1
11
72
@xiao_ted
Ted Xiao
2 years
Wow, 4 years later, the MineRL Diamond challenge has been solved *without demonstrations* 🤯! In 2019, the best solutions used many priors and demos but still couldn’t solve the task. In 2022, VPT was the 1st method to collect diamonds, but with demos + IDM. Congrats DreamerV3!
@GoogleDeepMind
Google DeepMind
2 years
Introducing DreamerV3: the first general algorithm to collect diamonds in Minecraft from scratch - solving an important challenge in AI. 💎. It learns to master many domains without tuning, making reinforcement learning broadly applicable. Find out more:
1
17
69
@xiao_ted
Ted Xiao
11 months
Amazing unveil from @DrJimFan and @yukez: a cross-embodiment *humanoid* foundation model project in just three months (!!). Exciting to see long-term seemingly unrelated bets by NVIDIA pay off: extensive sim2real, multimodal robot policies, and top-tier accelerators! 👏.
@DrJimFan
Jim Fan
11 months
Today is the beginning of our moonshot to solve embodied AGI in the physical world. I’m so excited to announce Project GR00T, our new initiative to create a general-purpose foundation model for humanoid robot learning. The GR00T model will enable a robot to understand multimodal
4
10
67
@xiao_ted
Ted Xiao
2 months
🚨 New Model Alert! 🚨. Gemini 2.0 Flash is an extremely strong multimodal model which showcases impressive spatial reasoning capabilities: understanding that we live in a 3D world with consistent rules of physics and geometric/semantic relationships. 🌎. Examples below! 🧵
Tweet media one
2
12
69
@xiao_ted
Ted Xiao
11 months
A major debate in robot learning is whether language is only a good modality for abstract high-level semantics but not low-level motion. Our recent work RT-Hierarchy shows that granular *language motions* can go surprisingly far! Thread below ⬇️
@suneel_belkhale
Suneel Belkhale
11 months
Is language capable of representing low-level *motions* of a robot?. RT-Hierarchy learns an action hierarchy using motions described in language, like “move arm forward” or “close gripper” to improve policy learning. 📜: 🏠: (1/10)
1
10
64
@xiao_ted
Ted Xiao
1 year
Nice work from Berkeley showing how motion-centric information is useful for robot policies to exhibit cross-embodiment transfer! Brings together some ideas from coarse egocentric trajectories and dense point tracking flow 🌊.
@Xingyu2017
Xingyu Lin
1 year
What state representation should robots have? 🤖 I’m thrilled to present an Any-point Trajectory Model (ATM), which models physical motions from videos without additional assumptions and shows significant positive transfer from cross-embodiment human and robot videos! 🧵👇
2
9
66
@xiao_ted
Ted Xiao
1 year
Day 3 of #CoRL2023! We are presenting three works bridging internet-scale foundation models with robotics. See you at Poster Session 4 at 5:15PM!
- Language to Reward:
- RT-2:
- MOO:
Tweet media one
0
10
64
@xiao_ted
Ted Xiao
8 months
Very nice new post from @natolambert on the vibe shift in robotics foundation models!
Tweet media one
3
4
29
@xiao_ted
Ted Xiao
7 months
Interested in how VLMs can *already* understand embodied reasoning without any finetuning? Our method PIVOT shows this can work by reasoning about actions as visual annotations! Check out our poster at #ICML2024 today, at 1:30PM Hall C 4-9 #109!
Tweet media one
5
10
65
@xiao_ted
Ted Xiao
2 years
Generalization is notoriously hard, but especially so in difficult settings like vision-based robot manipulation. Our recent work *GenGap* decomposes this complex problem into different *generalization axes*! (1/5). Website: Paper:
1
11
64
@xiao_ted
Ted Xiao
1 year
To scale robot data collection effectively 🚀, it's clear that we need to go beyond the 1 human : 1 robot ratio. Towards this, we introduce AutoRT: leveraging foundation model reasoning and planning for robot orchestration at scale! Check out the thread from Keerthana below 🧵
@keerthanpg
Keerthana Gopalakrishnan
1 year
In the last two years, large foundation models have proven capable of perceiving and reasoning about the world around us, unlocking a key possibility for scaling robotics. We introduce AutoRT, a framework for orchestrating robotic agents in the wild using foundation models!
0
6
64
@xiao_ted
Ted Xiao
2 years
Our grandchildren will look back at our timeline as a step function - a binary before and after of amazing technologies. But I’m so grateful to be part of this most magical time period where we are living through and defining history. Cheers to 2022 and an even better 2023! 🎊
Tweet media one
0
12
62
@xiao_ted
Ted Xiao
9 months
Scalable evaluation is a major bottleneck for generalist real-world robot policies. Projects like RT-1 and RT-2 required *thousands* of evaluation trials in the real world 😱. In our new work SIMPLER, we evaluate robot policies in simulation to predict real world performance! 👇
@kylehkhsu
Kyle Hsu
9 months
[1/14] Real robot rollouts are the gold standard for evaluating generalist manipulation policies, but is there a less painful way to get good signal for iterating on your design decisions? Let’s take a deep dive on SIMPLER 🧵👇 (or see quoted video)!.
2
10
61
@xiao_ted
Ted Xiao
5 months
Molmo is a very exciting multimodal foundation model release, especially for robotics. The emphasis on pointing data makes it the first open VLM optimized for visual grounding — and you can see this clearly with impressive performance on RealworldQA or OOD robotics perception!
Tweet media one
@ehsanik
Kiana Ehsani
5 months
Try out Molmo on your application! This is a great example by @DJiafei! We have a few videos describing Molmo's different capabilities on our blog! This one is me trying it out on a bunch of tasks and images from RT-X:
0
9
60
@xiao_ted
Ted Xiao
1 year
In contrast, I’m grateful to be part of a growing all-star team at GDM Robotics pushing forwards on the frontier of Embodied AI 🙌. I’m very optimistic about real-world tokens being indispensable for AGI!
@pmddomingos
Pedro Domingos
1 year
What do Google, Microsoft and OpenAI have in common? They all had robotics projects and gave up on them.
2
3
59
@xiao_ted
Ted Xiao
9 months
Hello Japan🗾🇯🇵, I’m in Yokohama for #ICRA2024 the next 10 days. Looking forward to sharing some of our team's recent work on robotics + AI. Say hi if you're interested in robot learning, scaling, or foundation models!
2
2
58
@xiao_ted
Ted Xiao
1 year
Can "teachability" be a core LLM capability that can be learned via finetuning? Introducing Language Model Predictive Control (LMPC): distilling entire in-context teaching sessions via in-weight finetuning improves the teachability of LLMs! Website: 🧵👇
Tweet media one
@jackyliang42
Jacky Liang
1 year
We can teach LLMs to write better robot code through natural language feedback. But can LLMs remember what they were taught and improve their teachability over time?. Introducing our latest work, Learning to Learn Faster from Human Feedback with Language Model Predictive Control
1
12
57
@xiao_ted
Ted Xiao
5 months
Ilya once said that when super-human world modeling is achieved, robotics would be solved 6 months later: just solve AGI, and then ask the AGI to solve robotics. But is this plausible given current trends? 🤔 . Today’s SOTA frontier models soak up human-centric priors
Tweet media one
6
8
59
@xiao_ted
Ted Xiao
10 months
Emergent RL capabilities have never looked so cute before 🥹. Check out an amazing effort from colleagues on scaling RL + self-play to real world football!.
@GoogleDeepMind
Google DeepMind
10 months
Soccer players have to master a range of dynamic skills, from turning and kicking to chasing a ball. How could robots do the same? ⚽. We trained our AI agents to demonstrate a range of agile behaviors using reinforcement learning. Here’s how. 🧵
0
11
58
@xiao_ted
Ted Xiao
2 years
Q: What happens when you combine LLMs, general robot manipulation policies, and a live remote demo from halfway across the world?. A: The Google DeepMind demo at #RSS2023 on Tuesday. Join us at the 2:30PM Demo Session on 7/11!. w/ @andyzeng_ @hausman_k @brian_ichter and🤖
Tweet media one
3
7
58
@xiao_ted
Ted Xiao
1 year
Tons of work left on the path to Embodied AI, but the puzzle pieces are coming together: improving the physical intelligence of foundation models, scaling up data collection systems, and focusing on more diverse + general capabilities. Quite a few new works coming out soon 👀.
2
6
56
@xiao_ted
Ted Xiao
7 months
Robotics will increasingly resemble the field of foundation models: scaling, pretraining, and post-training generalist models. But the opposite is also true. Challenges in large-scale foundation modeling have started looking more and more like perennial problems in robotics.
Tweet media one
2
6
56
@xiao_ted
Ted Xiao
3 months
VLMs can express their temporal and semantic knowledge of robotics via *predicting value functions of robot videos*! Check out our recent work on leveraging VLMs like Gemini for Generative Value Learning 🧵👇.
@JasonMa2020
Jason Ma
3 months
Excited to finally share Generative Value Learning (GVL), my @GoogleDeepMind project on extracting universal value functions from long-context VLMs via in-context learning!. We discovered a simple method to generate zero-shot and few-shot values for 300+ robot tasks and 50+
0
7
57
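The core GVL trick described above, asking a VLM for per-frame task-completion values over a *shuffled* video so it judges visual content rather than frame position, can be sketched as follows. The VLM call is stubbed out here (a real system would prompt a long-context model like Gemini); the stub and data shapes are assumptions for illustration.

```python
import random

def vlm_completion_score(frame):
    """Stub for a long-context VLM call (hypothetical API).
    A real implementation would prompt the model for a 0-100
    task-completion estimate given the frame image."""
    return frame["true_progress"]  # placeholder stand-in

def generative_value_learning(frames, seed=0):
    """Score frames in shuffled order, then restore temporal order.
    Shuffling breaks the monotonic-position shortcut, forcing the
    model to ground its value estimate in the frame itself."""
    order = list(range(len(frames)))
    random.Random(seed).shuffle(order)
    scores = {i: vlm_completion_score(frames[i]) for i in order}
    return [scores[i] for i in range(len(frames))]

# Toy episode: progress rises monotonically from 0% to 100%.
episode = [{"true_progress": p} for p in (0, 25, 50, 75, 100)]
values = generative_value_learning(episode)
```

With a real VLM in place of the stub, the recovered value curve can be compared against ground-truth progress to evaluate the model's temporal understanding.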
@xiao_ted
Ted Xiao
1 year
The major thing going for IL is that it turns open-ended research risks (RL tuning, GAIL instabilities, reward shaping) into a data engineering problem, which is much more tractable and easier to make consistent progress on. Intellectually disappointing but effective 🫠.
@EugeneVinitsky
Eugene Vinitsky 🍒
1 year
Huge fraction of all written text as its training data and GPT-4 still makes basic reasoning mistakes; kinda makes you suspicious about imitation learning as a robust approach in other settings e.g. driving, robotics.
4
3
54
@xiao_ted
Ted Xiao
1 year
Had a great time today with @YevgenChebotar and @QuanVng visiting @USCViterbi to give a talk on “Robot Learning in the Era of Foundation Models”. Slides out soon, packed with works from *just the past 5 months* 🤯. Thanks to @daniel_t_seita for hosting!
Tweet media one
1
3
57
@xiao_ted
Ted Xiao
1 year
@DrJimFan Nice detective work! Agree with architectural guesses (1), (3), (5). I think (2) and (4) are a bit less obvious; very high DoF + long horizon + high frequency control means that standard design decisions in manipulation may not be enough. But video-level tokenization is hard.
1
2
56
@xiao_ted
Ted Xiao
4 months
Cool to see the long-awaited Tesla humanoid update! Nice progress on a new high-DoF hand and better locomotion. It’s true that most “mind-blowing” behaviors yesterday were teleoperated, and many researchers are understandably not happy about the marketing first approach. But
Tweet media one
5
6
56
@xiao_ted
Ted Xiao
12 days
Excited to share that 3 works exploring the frontier of foundation models + robotics are accepted to ICRA and ICLR 2025!
- RT-Affordance: visual affordances for VLA training
- STEER: how VLMs can orchestrate control via dense language
- GVL: in-context value functions
Links👇.
1
7
57
@xiao_ted
Ted Xiao
2 years
Excited to showcase how generative models can be used for semantically relevant image augmentation for robotics. ROSIE uses a diffusion model to produce semantically relevant visual augmentations on existing datasets -- unlocking new skills and more robust policies 🖌️🎨.
@xf1280
Fei Xia
2 years
Text-to-image generative models, meet robotics! . We present ROSIE: Scaling RObot Learning with Semantically Imagined Experience, where we augment real robotics data with semantically imagined scenarios for downstream manipulation learning. Website: 🧵👇
1
7
51
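The ROSIE recipe above, inpainting semantically new objects into existing robot episodes and relabeling the instruction to match, can be sketched with the diffusion call stubbed out. A real system would invoke a text-guided inpainting model; the stub, data shapes, and field names here are hypothetical.

```python
def inpaint(image, region, prompt):
    """Stub for a text-guided diffusion inpainting call (hypothetical
    API). A real system would mask the region and synthesize the
    prompted object into the masked pixels."""
    out = dict(image)
    out["objects"] = image["objects"].copy()
    out["objects"][region] = prompt
    return out

def rosie_augment(sample, region, new_object):
    """Produce an augmented (image, instruction) pair by imagining a
    new object into an existing episode, ROSIE-style: the inpainted
    image and the rewritten instruction stay consistent."""
    old_object = sample["image"]["objects"][region]
    new_image = inpaint(sample["image"], region, new_object)
    new_instruction = sample["instruction"].replace(old_object, new_object)
    return {"image": new_image, "instruction": new_instruction}

# Toy episode augmented with a semantically imagined object.
orig = {"image": {"objects": ["coke can", "table"]},
        "instruction": "pick up the coke can"}
aug = rosie_augment(orig, 0, "green chip bag")
```

Because actions are unchanged, the augmented pair can be added directly to the training set, which is what lets the policy generalize to objects it never physically manipulated.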
@xiao_ted
Ted Xiao
5 months
How can we connect advances in video generation foundation models to low-level robot actions? We propose a simple but powerful idea: condition robot policies directly on generated human videos!. Video Generation 🤝 Robot Actions. Check out Homanga’s thread:.
@mangahomanga
Homanga Bharadhwaj
5 months
Gen2Act: Casting language-conditioned manipulation as *human video generation* followed by *closed-loop policy execution conditioned on the generated video* enables solving diverse real-world tasks unseen in the robot dataset!. 1/n
1
9
54
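The two-stage Gen2Act idea above (first generate a human video from the language instruction, then run a closed-loop policy conditioned on that video) can be sketched with both models stubbed out. The generator and policy here are hypothetical placeholders, not the paper's architecture.

```python
def generate_human_video(instruction, n_frames=4):
    """Stub for a video generation model (hypothetical API):
    returns predicted human-demonstration frames for the task."""
    return [f"{instruction}-frame-{t}" for t in range(n_frames)]

def policy(observation, generated_video):
    """Stub closed-loop policy conditioned on the generated video.
    A real policy would attend over video features at every control
    step; here we just track progress through the generated plan."""
    step = observation["t"] % len(generated_video)
    return {"target_frame": generated_video[step]}

# Generate once, then condition every control step on the same video.
video = generate_human_video("open the drawer")
actions = [policy({"t": t}, video) for t in range(4)]
```

The separation matters: the video model carries the web-scale visual knowledge of how humans do tasks, while the policy only has to learn to track the generated video, a much narrower problem.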
@xiao_ted
Ted Xiao
11 months
Large scale cross-embodied language-conditioned agents in a variety of video game domains! What’s particularly exciting is seeing positive transfer: generalist agents outperform specialist agents.
@GoogleDeepMind
Google DeepMind
11 months
Introducing SIMA: the first generalist AI agent to follow natural-language instructions in a broad range of 3D virtual environments and video games. 🕹️. It can complete tasks similar to a human, and outperforms an agent trained in just one setting. 🧵
0
4
50
@xiao_ted
Ted Xiao
1 year
Announcing our recent work RT-Sketch!
❌ Goal images contain a lot of useful information, but perhaps too much.
❌ Language instructions may not provide enough information.
✅ Goal *sketches* focus on the important details, and are easy to specify!
Check out the thread:
@priyasun_
Priya Sundaresan
1 year
We can tell our robots what we want them to do, but language can be underspecified. Goal images are worth 1,000 words, but can be overspecified. Hand-drawn sketches are a happy medium for communicating goals to robots!. 🤖✏️Introducing RT-Sketch: 🧵1/11
2
7
51
@xiao_ted
Ted Xiao
2 years
6) This is the highest praise possible if you’re in ML research. This is the peak 😎
Tweet media one
15
3
46
@xiao_ted
Ted Xiao
20 days
A lot of interesting discussion recently on the implications of OpenAI being a major sponsor of the FrontierMath benchmark. There are two extremes of what this implies:
1) omg OAI is benchmark hacking and leaking test data into training!
2) OAI uses questions in the private
Tweet media one
6
7
51