The Internet is too fast; I’m still crafting my catchy tweets, and word is already out 😂 Well then, now you have it:
RoboNinja🥷: Learning an Adaptive Cutting Policy for Multi-Material Objects
🧵👇 for a few interesting details you might have missed
How to precisely swing an *unknown* rope to hit a target? It is a challenging task even for us humans, due to complex system dynamics introduced by object deformation and high-speed dynamic actions.
Iterative Residual Policy (IRP) is our attempt; details 🧵⬇️ 1/n
Check out UMI! 3 things I learned in this project:
1. Wrist-mounted cameras can be sufficient for challenging manipulation tasks with the right hardware design.
2. Cross-embodiment policy is possible with the right policy interface.
3. BC can generalize if the data is right.
Can we collect robot data without any robots?
Introducing Universal Manipulation Interface (UMI)
An open-source $400 system from
@Stanford
designed to democratize robot data collection
0 teleop -> autonomously wash dishes (precise), toss (dynamic), and fold clothes (bimanual)
More robots do not always lead to higher productivity if they don’t collaborate ;) Check out our latest work in
#CORL2020
.
Despite being trained on static tasks with 1-4 arms, the system generalizes to 5-10 arms with dynamic targets
w/ Huy Ha, Jingxi Xu
Diffusion Policy for robots!
The most impressive thing to me is how fast we can deploy a new skill with this framework -- and we just keep adding more and more.
Cheng has made the framework really easy to use, so you can try it out too. Colab & GitHub:
What if the form of visuomotor policy has been the bottleneck for robotic manipulation all along? Diffusion Policy achieves a 46.9% improvement vs. prior SotA on 11 tasks from 4 benchmarks + 4 real-world tasks! (1/7)
website:
paper:
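For intuition, here is a minimal sketch of the inference loop behind a diffusion-based visuomotor policy: an action sequence starts as Gaussian noise and is iteratively denoised, conditioned on the observation. The network, noise schedule, and dimensions below are placeholder assumptions, not the released implementation; in practice only the first few denoised actions are executed before re-planning.

```python
import torch

def sample_actions(noise_pred_net, obs_emb, horizon=16, action_dim=7, n_steps=50):
    """Minimal DDPM-style sampler for an action sequence (illustrative only).

    noise_pred_net(actions, obs_emb, t) should predict the noise that was
    added to the action sequence at diffusion step t (placeholder model).
    """
    betas = torch.linspace(1e-4, 0.02, n_steps)        # assumed noise schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    actions = torch.randn(1, horizon, action_dim)      # start from pure noise
    for t in reversed(range(n_steps)):
        eps = noise_pred_net(actions, obs_emb, t)      # predicted noise
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        actions = (actions - coef * eps) / torch.sqrt(alphas[t])
        if t > 0:                                      # no noise on the final step
            actions = actions + torch.sqrt(betas[t]) * torch.randn_like(actions)
    return actions                                     # denoised action sequence
```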
Montessori busy boards for robots! We're open-sourcing a toy-inspired robot learning environment for developing essential interaction, reasoning, and planning skills.
Let's give our robot toddlers toys to play with before asking them for help in the kitchen ;) (1/n)
Universal Manipulation Policy Network – a single policy learns to manipulate a diverse set of articulated objects (e.g., fridge, laptop, or drawers) regardless of their joint types or # links.
w.
@zhenjia
@zhanpeng_he
Things we learned 🧵⬇️1/n
Congratulations to the 2021 Microsoft Research Faculty Fellows! This fellowship recognizes innovative, promising new faculty whose exceptional talent for innovation identifies them as emerging leaders in their fields. Learn about their research interests:
Excited to receive the NSF CAREER award! I'm grateful to all my students
@CAIRLab
, mentors, and collaborators for making this possible 😊
and thank you, Holly and Bernadette, for writing this nice article that summarizes our research. 🤖
Embodiment is such a critical component of Embodied Intelligence but often gets overlooked.
Can robots learn to generate different embodiments (i.e., hardware designs) for different tasks that drastically simplify perception, planning, and control?
Check it out ⬇️
Can we automate task-specific mechanical design without task-specific training?
Introducing Dynamics-Guided Diffusion Model for Robot Manipulator Design, a data-driven framework for generating manipulator geometry designs for given manipulation tasks.
w. Huy Ha,
@SongShuran
Dynamic manipulation turns out to be so much more effective for cloth unfolding! Check out FlingBot -- unfold your shirt in 3 steps! 😉
Code for both simulation & real robots is available!
#CORL2021
w/ Huy Ha
Have to share these epic fails ...
"We've broken 3 legs, fried 1 Jetson, and ripped one pair of pants, so you don't have to" 😅
Check here for details: 😉
Real2Code -- translating real-world articulated objects to sim using code generation! With the code representation, this method scales well wrt the number of object parts; check out the 10-drawer table it reconstructed 😉
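To make "code as a scene representation" concrete, here is a hypothetical sketch of the kind of program such a pipeline might emit for the 10-drawer example; the classes and fields are invented for illustration, not Real2Code's actual output format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Joint:
    child: "Body"
    joint_type: str                     # "prismatic" or "revolute"
    axis: Tuple[float, float, float]    # joint axis in the parent frame
    limits: Tuple[float, float]         # joint range (meters or radians)

@dataclass
class Body:
    mesh: str
    joints: List[Joint] = field(default_factory=list)

    def add_joint(self, child, joint_type, axis, limits):
        self.joints.append(Joint(child, joint_type, axis, limits))

# A 10-drawer table as generated code: each extra part is just another loop
# iteration, which is why a code representation scales with part count.
table = Body(mesh="table_base.obj")
for i in range(10):
    drawer = Body(mesh=f"drawer_{i}.obj")
    table.add_joint(drawer, "prismatic", axis=(0.0, 1.0, 0.0), limits=(0.0, 0.4))
```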
Don't want to collect hundreds of demonstrations for every object and scenario? Check out EquiBot from
@yjy0625
-- Leveraging equivariance in diffusion policy to make it sample-efficient and generalizable!
Want a robot that learns household tasks by watching you?
EquiBot is a ✨ generalizable and 🚰 data-efficient method for visuomotor policy learning, robust to changes in object shapes, lighting, and scene makeup, even from just 5 mins of human videos.
🧵↓
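Here, equivariance means that transforming the scene transforms the predicted action in lockstep. A toy numpy check of the property (not EquiBot's actual architecture, which builds this into a diffusion policy):

```python
import numpy as np

def equivariant_action(points, gripper):
    """Toy 'policy' built only from relative vectors: move toward the object
    centroid. Rotating the whole scene rotates the output the same way, so
    nothing needs to be re-learned for new orientations."""
    return points.mean(axis=0) - gripper

theta = 0.7                                    # rotate the scene about z
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

pts, grip = np.random.rand(100, 3), np.random.rand(3)
rotated_out = equivariant_action(pts @ R.T, R @ grip)
assert np.allclose(rotated_out, R @ equivariant_action(pts, grip))  # equivariance holds
```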
By plugging a $5 contact microphone 🎤into UMI, we can now "hear" 👂all the critical contact events during manipulation and "feel" ☝️the subtle differences on the contact surface.
Check out
@Liu_Zeyi_
's new work on ManiWAV: Manipulation from In-the-Wild Audio-Visual Data!
🔊 Audio signals contain rich information about daily interactions. Can our robots learn from videos with sound?
Introducing ManiWAV, a robotic system that learns contact-rich manipulation skills from in-the-wild audio-visual data. See thread for more details (1/4) 👇
Struggling with your 2D visual predictive models that keep losing track of objects? Time to try out this 3D dynamic scene representation (DSR)
at
#CORL2020
.
w. @Zhenjia_Xu
@zhanpeng_he
@jiajunwu_cs
One of the common questions I get for UMI is how to apply it to mobile robots, esp. when we don't have a precise IK solver. Check out UMI on Legs!
With a manipulation-centric whole-body controller, we can put any UMI skills on a legged robot🐕
Video:
I’ve been training dogs since middle school. It’s about time I train robot dogs too 😛
Introducing UMI on Legs, an approach for scaling manipulation skills on robot dogs 🐶 It can toss, push heavy weights, and make your ~existing~ visuomotor policies mobile!
This robot is having a lot of fun!
Check out
@ruoshi_liu
's PaperBot, a robot that learns to design, fold, and throw a paper airplane 😊✈️, and many other things!
Humans can design tools to solve various real-world tasks, and so should embodied agents. We introduce PaperBot, a framework for learning to create and utilize paper-based tools directly in the real world.
Introducing 𝐌𝐨𝐛𝐢𝐥𝐞 𝐀𝐋𝐎𝐇𝐀🏄 -- Hardware!
A low-cost, open-source, mobile manipulator.
One of the highest-effort projects of my past 5 years! Not possible without co-lead
@zipengfu
and
@chelseabfinn
.
In the end, what's better than cooking yourself a meal with the 🤖🧑🍳
Position control can only go so far. For contact-rich tasks, robots must master both position and force – that’s where compliance comes in!
But what’s the right compliance? 🤔Hint: being always compliant in all directions won’t cut it.
Check out
@YifanHou2
’s solution 😉⤵️
Can robots learn to manipulate with both care and precision? Introducing Adaptive Compliance Policy, a framework to dynamically adjust robot compliance both spatially and temporally for given manipulation tasks, learned from human demonstrations. Full details at
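As a rough illustration of the underlying control idea (a generic Cartesian impedance law, not the paper's actual policy): letting the policy modulate per-axis stiffness over time is what makes compliance spatially and temporally adaptive.

```python
import numpy as np

def impedance_force(x, v, x_des, stiffness, damping_ratio=1.0):
    """Cartesian impedance law: f = K (x_des - x) - D v.

    `stiffness` is a length-3 array a policy could re-predict at every step,
    so the robot can be stiff along one axis (precision) and soft along
    another (compliance), and change that choice over time.
    """
    K = np.diag(stiffness)
    D = np.diag(2.0 * damping_ratio * np.sqrt(stiffness))  # critical damping (unit mass)
    return K @ (x_des - x) - D @ v

# Stiff in x/y for precise alignment, compliant in z to absorb contact.
f = impedance_force(
    x=np.zeros(3), v=np.zeros(3), x_des=np.array([0.0, 0.0, 0.01]),
    stiffness=np.array([800.0, 800.0, 50.0]),
)
```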
What if your robot hand suddenly lost a finger? 🤕🤖
Wouldn’t it be great if the same policy could still be effective?
Check out "Get-Zero"— by representing the embodiment as a directed grasp, the single trained policy can generalize across new designs without retraining 🪄
What if you could control new hand designs without a new policy?
Introducing GET-Zero, an embodiment-aware policy that can zero-shot control a wide range of hand designs with a single set of network weights.
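For intuition, an embodiment can be serialized as a directed graph of joints that an embodiment-aware policy consumes; the toy encoding below is an assumption for illustration, not GET-Zero's actual format.

```python
# Toy embodiment encoding: each joint is a node; edges follow the kinematic
# chain from palm to fingertip. (Illustrative only, not GET-Zero's format.)
hand = {
    "nodes": {
        "palm":      {"type": "base"},
        "index_mcp": {"type": "revolute", "axis": (0, 0, 1)},
        "index_pip": {"type": "revolute", "axis": (0, 0, 1)},
        "thumb_cmc": {"type": "revolute", "axis": (0, 1, 0)},
    },
    "edges": [  # (parent -> child), defining the directed kinematic graph
        ("palm", "index_mcp"),
        ("index_mcp", "index_pip"),
        ("palm", "thumb_cmc"),
    ],
}

# Losing a finger just removes its nodes/edges: a graph-conditioned policy
# sees a smaller graph, instead of a fixed-size vector that no longer fits.
hand["nodes"].pop("index_pip")
hand["edges"].remove(("index_mcp", "index_pip"))
```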
UMI's pretrained weights are released.
We have tested the policy on three different robots: UR5, Franka, and ARX. Time to try it on your robot!!
Buy any "espresso cup with saucer" on Amazon, and it should work -- or let
@chichengcc
know if it doesn't 😉
Weights drop ⚠️
We released our pre-trained model for the cup arrangement task trained on 1400 demos! We aim to enable anyone to deploy UMI on their robot to arrange any "espresso cup with saucer" they buy on Amazon.
Amazing work on collaborative cooking. The interaction between humans and robots is so natural and smooth; see the subtle things, like how the robot pauses and waits for the human to pour soup. Very impressive!
Cooking in kitchens is fun. BUT doing it collaboratively with two robots is even more satisfying!
We introduce MOSAIC, a modular framework that coordinates multiple robots to closely collaborate and cook with humans via natural language interaction and a repository of skills.
One grasping policy for many (and new!) grippers. Code is available here:
Try it out, and let us know if your favorite gripper is missing!
w. Zhenjia, Beichun,
@submagr
#RSS2022
is happening next week
@Columbia
!
@chichengcc
and
@Zhenjia_Xu
are presenting Iterative Residual Policy
and DextAIRity
Join us for a tour of our lab on Thursday! Our robots are getting dressed for demos 😜
@haqhuy
’s new project:
*Scaling up* robot data collection using LLMs for ✅ task decomposition ✅ reward formulation
*Distilling down* into visuomotor policies that ✅ operate from raw sensory input ✅ improve over time.
Check out the engaging Q&A here 😉
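A schematic of the scale-up/distill-down split, with all helper names invented for illustration:

```python
def scale_up(task, llm, planner, compile_reward):
    """Schematic of the 'scaling up' stage (all helpers are hypothetical):
    an LLM decomposes the task and writes a reward for each subtask, and a
    planner uses those rewards to collect successful rollouts."""
    subtasks = llm(f"Decompose into subtasks: {task}")          # task decomposition
    rollouts = []
    for sub in subtasks:
        reward_fn = compile_reward(llm(f"Reward function for: {sub}"))  # reward formulation
        rollouts += planner.collect_successes(sub, reward_fn)
    return rollouts  # 'distilling down' = behavior cloning a visuomotor policy on these
```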
Also, forgot to mention, UMI is always evolving! If you're adding new sensors or making hardware tweaks, please share it as well! 🙌
Even when the data is not directly transferable to the current UMI, it can still power pretraining or other creative applications. 🚀
We recently launched as a community-driven effort to pool UMI-related data together. 🦾
If you are using a UMI-like system, please consider adding your data here. 🤩🤝
No dataset is too small; small data WILL add up!📈
Just like for us humans, failures are inevitable for robots as well, and it is important to "REFLECT" on them!
Check out
@Liu_Zeyi_
and
@ArpitBahety
's new project on failure reasoning for robots.
The new dataset (RoboFail) and code are out too!
🤖 Can robots reason about their mistakes by reflecting on past experiences?
(1/n) We introduce REFLECT, a framework that leverages Large Language Models for robot failure explanation and correction, based on a summary of multi-sensory data. See below for details and links👇
Can robots learn how to improve their tools (i.e., grippers) to better accomplish a given task? Check out our work “Fit2Form: 3D Generative Model for Robot Gripper Form Design.” at
#CORL2020
w. Huy Ha,
@submagr
Deformable objects are common in household, industrial and healthcare settings. Tracking them would unlock many applications in robotics, gen-AI, and AR.
How? Check out MD-Splatting: a method for dense 3D tracking and dynamic novel view synthesis on deformable cloths. 1/6🧵
From our demo floor at AI@, check out Code as Policies at work. This helper robot is able to compute and execute a task given via natural language. Read more →
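The core idea behind Code as Policies is that an LLM writes executable policy code against a small robot API provided in its prompt. A hypothetical sketch of the kind of snippet it might produce (the `detect`/`pick`/`place` helpers are invented for illustration, not the system's actual API):

```python
# Hypothetical LLM-generated policy code for "move the apple into the bowl".
# The helpers (detect, pick, place) are invented stand-ins for the robot API
# that the real system exposes to the language model in its prompt.
def move_apple_into_bowl(detect, pick, place):
    apple_pose = detect("apple")   # query a perception model for the object pose
    bowl_pose = detect("bowl")
    pick(apple_pose)               # grasp the apple
    place(bowl_pose)               # release it over the bowl
```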
🚨 🚨 Another new work showcasing bitter lesson 2.0 🚨 🚨
Introducing MOO:
We leverage vision-language models (VLMs) to allow robots to manipulate objects they've never interacted with, and in new environments, while learning end-to-end policies. 🧵
#CVPR2022
We are looking for volunteers from the CVPR community (graduate students, university faculty, and researchers) to help us organize *in-person* outreach events!
Excited to share our latest progress on legged manipulation with humanoids. We created a VR interface to remotely control the Draco-3 robot 🤖, which cooks ramen for hungry graduate students at night. We can't wait for the day it will help us at home in the real world!
#humanoid
Apart from rope swinging, IRP is a general formulation that could work for other dynamic manipulation tasks with deformable objects, like swinging a tablecloth. 5/n
@haqhuy
We spent a lot of effort on the documentation and hope that people can easily reproduce our work (including hardware!). We discussed our hardware choices and how we fixed all kinds of hardware problems so you don't have to. Please check it out!
TOMORROW: Spring 2022 GRASP SFI: Shuran Song (
@SongShuran
) Columbia University, “The Reasonable Effectiveness of Dynamic Manipulation for Deformable Objects” 3/16 @ 3:00 - 4:00pm - Levine 512 & Zoom. See you there!
RL policies tend to be specific to the robot they are trained on. Can a single policy be trained to control many agents?
Turns out, training a (shared) policy for each motor instead of the whole robot not only achieves SOTA at training time but also transfers to unseen agents w/o fine-tuning!
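A minimal sketch of the weight-sharing idea: one network is applied independently to each motor's local observation, so the same weights drive robots with any number of joints (the full method also passes messages between neighboring motors, omitted here).

```python
import torch
import torch.nn as nn

class SharedMotorPolicy(nn.Module):
    """One set of weights, applied per motor (message passing omitted)."""
    def __init__(self, local_obs_dim=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(local_obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),            # one torque per motor
        )

    def forward(self, local_obs):            # (num_motors, local_obs_dim)
        return self.net(local_obs).squeeze(-1)

policy = SharedMotorPolicy()
torques_6 = policy(torch.randn(6, 8))        # works for a 6-motor robot...
torques_12 = policy(torch.randn(12, 8))      # ...and, unchanged, for 12 motors
```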
Stress test on robustness-- we interrupt the system by randomly tying a few knots on the rope after the policy converges on a given goal. Thanks to its iterative formulation, IRP can quickly adapt and regain good performance. 4/n
Can audio help robots perform tasks better?
ManiWAV is a framework for leveraging contact audio to improve performance in various robotic tasks. It consists of a modified Universal Manipulation Interface, an audio augmentation strategy, and a neural network architecture that …
2/5 It is important to have an adaptive policy for handling out-of-distribution scenarios. Otherwise, the policy can easily get stuck. See the comparison below with the non-adaptive policy.
I had the pleasure of speaking at
@Columbia
’s vision seminar, kindly hosted by
@SongShuran
,
@sy_gadre
.
My talk focused on using Transformer explainability algorithms to improve the performance of downstream tasks (e.g., image editing).
Check it out :)
3/5 Since we only need the binary contact information, it is possible to implement the real-world system with a low-cost weight sensor ($10) instead of an expensive force-torque sensor ($5,000+), though a precise force sensor could potentially enable more complex behaviors.
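The sensing trick is simple enough to sketch: since the policy only needs to know whether contact occurred, a cheap load-cell reading can be thresholded into a binary signal. The reading values and threshold below are placeholders.

```python
def binary_contact(delta_grams: float, threshold_grams: float = 5.0) -> bool:
    """Collapse a noisy low-cost scale reading into a binary contact event.

    The policy never sees an accurate force magnitude, only whether the
    measured weight change exceeds a small threshold -- which is why a $10
    weight sensor can stand in for a force-torque sensor here. The exact
    threshold is a placeholder assumption.
    """
    return abs(delta_grams) > threshold_grams

assert binary_contact(12.3)        # large reading change -> contact event
assert not binary_contact(0.8)     # sensor noise -> no contact yet
```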
4/n Performing large-scale real-world training is still very challenging. While the policy only takes images as input, the reward used for training still depends on the joint state, which can be hard to get from real-world videos ☹️ Ideas are welcome!
The policy is trained only in simulation and directly tested with different real-world ropes. Despite the large sim2real gaps, IRP can adjust its action based on visual feedback. 3/n
Instead of learning the direct mapping from action to trajectory, IRP learns to predict the effect of a delta action on the previously observed trajectory -- e.g., swinging faster will reach higher. It then uses that prediction to iteratively adjust its action and get closer to the goal. 2/n
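In pseudocode terms, this loop is tiny. A minimal sketch, assuming a learned `predict_trajectory(traj, delta)` model and a set of candidate delta actions (both stand-ins for the paper's actual components):

```python
import numpy as np

def irp_step(observed_traj, goal, predict_trajectory, candidate_deltas):
    """One IRP iteration: score candidate action corrections by their
    *predicted* effect on the last observed trajectory, then return the best.

    predict_trajectory(traj, delta) -> predicted trajectory after applying
    the delta action (a learned model; placeholder here).
    """
    def distance_to_goal(traj):
        return float(np.min(np.linalg.norm(traj - goal, axis=-1)))

    scores = [distance_to_goal(predict_trajectory(observed_traj, d))
              for d in candidate_deltas]
    return candidate_deltas[int(np.argmin(scores))]  # execute, observe, repeat
```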
Not only is this amazing research, it has all the elements of a great robotics (manipulation) paper that I wish was common practice in the field.
Quick Thread:
3/n In fact, interactions could help in understanding objects’ underlying structure. Fig ⬇️ shows the joint parameters inferred from actions. While the algorithm was never supervised with joint parameters, it is able to estimate them for both revolute and prismatic joints.
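As a simplified illustration of how joint parameters can fall out of interaction data (a toy 2D estimator, not the paper's method): points on a prismatic part trace a line, while points on a revolute part trace a circular arc, so comparing fit residuals reveals the joint type.

```python
import numpy as np

def classify_joint(points):
    """Classify a tracked point's 2D trajectory as prismatic or revolute.

    Prismatic joints move points along a line; revolute joints move them
    along a circular arc. (Toy estimator for illustration only.)
    """
    # Line fit: residual of points against their principal direction.
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered)
    proj = centered @ vt[0:1].T @ vt[0:1]
    line_res = np.linalg.norm(centered - proj, axis=1)

    # Circle fit (algebraic least squares): solve for center and radius.
    A = np.c_[2 * points, np.ones(len(points))]
    b = (points ** 2).sum(axis=1)
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    r = np.sqrt(c + cx ** 2 + cy ** 2)
    circle_res = np.abs(np.linalg.norm(points - [cx, cy], axis=1) - r)

    return "prismatic" if line_res.mean() < circle_res.mean() else "revolute"
```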
2/n Manipulation strategies are surprisingly generalizable across object categories IF the policy learns the right thing. Oftentimes it is not necessary to perform explicit pose estimation or part segmentation to achieve effective manipulation.
BusyBoard is procedurally generated using diverse objects with inter-object functional relation pairs. The skills learned from BusyBoard can be applied to real-world objects (2/n)
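A toy sketch of what procedural generation over functional relation pairs could look like; the element pools and pairing scheme are invented for illustration, not the actual BusyBoard asset list.

```python
import random

# Hypothetical element/effect pools -- illustrative only.
TRIGGERS = ["button", "switch", "knob", "slider"]
RESPONDERS = ["light", "door", "fan", "buzzer"]

def generate_board(num_pairs=4, seed=0):
    """Sample a board as a list of (trigger -> responder) functional relations."""
    rng = random.Random(seed)
    return [(rng.choice(TRIGGERS), rng.choice(RESPONDERS)) for _ in range(num_pairs)]

print(generate_board())  # e.g. [('switch', 'light'), ('button', 'door'), ...]
```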
As a volunteer, you can help either (1) guide local high school students on a tour of our expo and demo exhibition or (2) mentor college students from local HBCU/MI institutions during their participation at CVPR.