abhishekunique7 Profile Banner
Abhishek Gupta Profile
Abhishek Gupta

@abhishekunique7

Followers
7K
Following
721
Media
135
Statuses
417

Assistant Professor at University of Washington. I like robots, and reinforcement learning. Previously: post-doc at MIT, PhD at Berkeley

Seattle, WA
Joined February 2012
@abhishekunique7
Abhishek Gupta
4 years
Thrilled to share that I will be starting as an assistant professor at the University of Washington @uwcse in Fall 2022! Grateful for wonderful mentors and collaborators at @berkeley_ai, especially @svlevine and @pabbeel. Looking forward to joining the wonderful folks @uwcse!
33
25
463
@abhishekunique7
Abhishek Gupta
1 year
Anyone who knows me knows I love real world RL :) But anyone who works on real-world RL knows it’s quite a pain to get going. We tried to make everyone’s life easier by writing a software suite to get you going with real world RL out of the box, without all the pain! A 🧵(1/5)
3
32
253
@abhishekunique7
Abhishek Gupta
8 months
So I hear that behavior cloning is all the rage now. What if we could do better, but with the same data? :) In CCIL, we show that imitation via BC is improved by synthesizing corrective labels to account for compounding error, without interactive oracles. Lets you do 👇! 🧵(1/9)
5
39
255
@abhishekunique7
Abhishek Gupta
2 years
I am recruiting PhD students to join us in the Washington Embodied Intelligence and Robotic Development Lab (WEIRD) at @uwcse. We work on robot learning, especially RL in the real world! Check out for details (1/3).
7
44
218
@abhishekunique7
Abhishek Gupta
1 year
Imagine this: you drop your robot in an environment, connect it to the internet and come back 10 hours later, and it has learned to solve tasks in the real world, autonomously, with no effort from you! We enable this in our work -Guided Exploration for Autonomous RL (GEAR)🧵(1/5)
3
23
180
@abhishekunique7
Abhishek Gupta
2 years
Excited to share our work on uncertainty estimation using diffusion/score matching! The idea is simple: offline optimization (e.g., model-based RL, imitation) requires us to estimate uncertainty. Estimating uncertainty is hard; score matching provides a scalable solution. A 🧵 (1/5)
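A minimal 1-D sketch of the intuition, not the paper's implementation: the magnitude of the score ∇ₓ log p(x) grows as a query leaves the data, so a learned score model can flag out-of-distribution inputs during offline optimization. The Gaussian here stands in for a learned score network; all names are illustrative.

```python
import numpy as np

def gaussian_score(x, mu=0.0, sigma=1.0):
    # Score of a Gaussian fit to the data: d/dx log p(x) = -(x - mu) / sigma^2.
    # In the paper's setting, a score network trained by score matching
    # plays this role.
    return -(x - mu) / sigma**2

def uncertainty(x, mu=0.0, sigma=1.0):
    # Heuristic uncertainty: the score's magnitude is small near the data
    # mode and grows for out-of-distribution queries.
    return np.abs(gaussian_score(x, mu, sigma))
```

Under this toy model, a query at the data mode reports zero uncertainty, and uncertainty increases monotonically with distance from the data.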
2
17
172
@abhishekunique7
Abhishek Gupta
2 years
Excited to share our work on self-supervised RL by modeling random features. The key premise behind RaMP is to learn about environment dynamics, without learning a dynamics model! This allows for transfer, without accruing compounding error. A 🧵 (1/6)
1
30
160
@abhishekunique7
Abhishek Gupta
1 year
Intrigued by decision transformers, we investigated why and when we should use return-conditioned RL as an alternative to dynamic prog (DP). Our findings are neat! With data coverage, RCSL can outperform DP, but fail to "stitch" trajectories. We analyze and propose a fix. 🧵(1/N)
4
21
164
@abhishekunique7
Abhishek Gupta
9 months
Excited about @ZoeyC17's new work on real2sim for robotics! We present URDFormer, a technique to learn models that go from RGB images to full articulated scene URDFs in sim by "inverting" pre-trained generative models. These can be used to train robots for the real-world! 🧵(1/8)
2
23
144
@abhishekunique7
Abhishek Gupta
10 months
So you want to do robotics tasks requiring dynamics information in the real world, but you don’t want the pain of real-world RL? In our work to be presented as an oral at ICLR 2024, @memmelma showed how we can do this via a real-to-sim-to-real policy learning approach. A 🧵 (1/7)
1
25
137
@abhishekunique7
Abhishek Gupta
2 months
Haven't been to a conference in a while, really excited to be at #NeurIPS2024! I'll be helping present 4 of our group's recent papers: 1. Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL, 2. Distributional
2
16
124
@abhishekunique7
Abhishek Gupta
3 years
Excited to share our work on reset-free fine-tuning bootstrapped by offline data. We show results in a real-world kitchen, with a robot practicing autonomously to improve for over a day with minimal intervention! Paper: Website:
5
21
123
@abhishekunique7
Abhishek Gupta
5 years
Reinforcement learning can be significantly accelerated by using offline datasets with a simple, but carefully designed actor critic algorithm! Solves dexterous manipulation tasks in <1 hour. @ashvinair M Dalal @svlevine.
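As a rough illustration of the advantage-weighted idea behind this line of actor-critic work (a sketch, not the released algorithm): the policy update is a behavior-cloning loss reweighted by exponentiated advantages, so high-advantage actions in the offline dataset dominate the update. Function and variable names are illustrative.

```python
import numpy as np

def advantage_weights(advantages, lam=1.0):
    # Exponentiated-advantage weights, exp(A / lambda), normalized so they
    # can reweight a behavior-cloning loss toward high-advantage actions.
    w = np.exp(np.asarray(advantages, dtype=float) / lam)
    return w / w.sum()

# Toy check: higher-advantage transitions get more weight in the update.
w = advantage_weights([-1.0, 0.0, 2.0])
```

The temperature `lam` trades off between plain behavior cloning (large `lam`) and greedily imitating only the best actions (small `lam`).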
0
22
118
@abhishekunique7
Abhishek Gupta
2 years
“How can you enable your parents to train your robot?” We propose a system for enabling robot learning by hooking up a robot to the web, using noisy, occasional feedback from non-experts to guide exploration. Enables robot learning in sim and real w/out reward engineering! 🧵 (1/8)
2
21
115
@abhishekunique7
Abhishek Gupta
2 months
In my experience, robot 'generalists' are often jacks of all trades but masters of none. In training across multiple tasks and environments, robot policies fail to generalize robustly and effectively to each particular test setting. What if at test time, we non-parametrically
2
22
117
@abhishekunique7
Abhishek Gupta
11 months
Robot learning in the real world can be expensive and unsafe in human-centric environments. Solution: construct simulation on the fly and train in it! Excited to share RialTo, led by @marceltornev, on learning resilient policies via real-to-sim-to-real policy learning! A 🧵 (1/12)
2
22
115
@abhishekunique7
Abhishek Gupta
3 years
Excited to be working with all these amazing people very soon! Exciting times ahead😀 On that note I'm also hoping to recruit students this cycle to start in Fall 22. If you like ML and robotics and want to get things to work in the real world, definitely apply to UW! (1/3)
@uwcse
Allen School
3 years
Not even a pandemic could slow down #UWAllen faculty hiring. Over the past 2 cycles, we welcomed 15 (yes—15!) outstanding researchers and educators who have joined/will soon join us at @uwengineering @UW Seattle. Meet these new members of our community:
1
9
111
@abhishekunique7
Abhishek Gupta
9 months
Who doesn’t love good methods for reward inference? What if I told you that you could extract dense rewards from video, by ranking frames temporally using the BT model from RLHF (aka just doing temporal classification with cross-entropy). Let's see how, in rank2reward. A 🧵 (1/10)
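The temporal-ranking idea in the tweet can be sketched in a few lines (illustrative, not the paper's code): for a pair of frames from the same video, a Bradley-Terry model says the later frame should score higher, and the cross-entropy of that comparison trains a per-frame score that can serve as a dense reward.

```python
import numpy as np

def bt_temporal_loss(r_early, r_late):
    # Bradley-Terry cross-entropy for one frame pair:
    # P(late ranks above early) = sigmoid(r_late - r_early).
    # Minimizing this pushes the learned per-frame score (the dense
    # reward) to increase over time within a video.
    p = 1.0 / (1.0 + np.exp(-(r_late - r_early)))
    return -np.log(p)
```

With equal scores the loss is ln 2 (a coin flip); it shrinks as the later frame's score pulls ahead of the earlier one's.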
2
19
104
@abhishekunique7
Abhishek Gupta
2 years
New work from my time at MIT! We introduce Distributionally Adaptive Meta-Reinforcement Learning (DiAMetR). Meta-RL struggles when test tasks are OOD, which arguably is most of the time! We propose an algorithm resilient to distribution shift. 🧵 (1/N)
1
13
100
@abhishekunique7
Abhishek Gupta
1 year
Want to get model-based RL to work in diverse, dynamic scenes? Check out @chuning_zhu's latest work (RePo) on model-based reinforcement learning without reconstruction, where we show how to learn world models that scale to dynamic, multi-task environments. A 🧵(1/6)
5
17
94
@abhishekunique7
Abhishek Gupta
2 months
So I heard we need more data for robot learning :) Purely real world teleop is expensive and slow, making large scale data collection challenging. I’ve been excited about getting more data into robot learning, going beyond just real-world teleop data. To this end, we’ve been
1
22
86
@abhishekunique7
Abhishek Gupta
2 years
I'm truly so tired of reading reviews about "novelty". What does that even mean? #ICML2023
3
3
81
@abhishekunique7
Abhishek Gupta
1 year
Most offline RL methods try to constrain policies from deviating far from the offline data distribution. In cases where the data distribution is imbalanced or suboptimal, this makes it hard to actually learn good behavior! In new work, @ZhangWeiHong9 proposes a solution 🧵 (1/5)
4
9
80
@abhishekunique7
Abhishek Gupta
2 months
Over the last year, we’ve been investigating how simulation can be a useful tool for real-world reinforcement learning on a robot. While simulation captures inherently incorrect dynamics, it can still be useful for real-world learning! In our #NeurIPS2024 work, Andrew W.
3
18
76
@abhishekunique7
Abhishek Gupta
4 years
We've been working on getting robots to learn in the real world with many hours of autonomous reset-free RL! Key idea is to leverage multi-task RL to enable scalable learning with no human intervention. Allows learning of cool dexterous manipulation tasks in the real world!
@svlevine
Sergey Levine
4 years
After over a year of development, we're finally releasing our work on real-world dexterous manipulation: MTRF. MTRF learns complex dexterous manipulation skills *directly in the real world* via continuous and fully autonomous trial-and-error learning. Thread below ->
1
11
68
@abhishekunique7
Abhishek Gupta
5 years
Sharing two recent talks from my advisor @svlevine covering much of my recent work, as well as work from many of my colleagues. I really enjoyed watching these, they give a really cool perspective on frontiers of RL.
3
12
61
@abhishekunique7
Abhishek Gupta
4 years
New work on learning how to grasp and navigate with mobile robots using RL. What I find very exciting is the ability of the system to be trained for >60 hrs with minimal intervention, learning in diverse scenarios. Paper: Website:
4
8
58
@abhishekunique7
Abhishek Gupta
4 years
I did a podcast thing! Here's a recent interview on Applying RL to Real-World Robotics with @samcharrington for the @twimlai podcast. Check it out! via @twimlai.
0
9
59
@abhishekunique7
Abhishek Gupta
2 years
Excited to share the first of several papers toward leveraging generative models as data sources for RL! RL sees minimal data, gen models see lots of data. We show that gen models (here LLMs) can provide background info for RL common sense (here exploration)! Thread by @d_yuqing!
@d_yuqing
Yuqing Du
2 years
How can we encourage RL agents to explore human-meaningful behaviors *without* a human in the loop? @OliviaGWatkins2 and I are excited to share “Guiding Pretraining in Reinforcement Learning with LLMs”! 📜🧵 1/
1
5
56
@abhishekunique7
Abhishek Gupta
5 years
Fun blog post on our work on unsupervised meta-reinforcement learning, for doing meta-reinforcement learning without explicit human-provided task distributions! And the associated paper
0
19
47
@abhishekunique7
Abhishek Gupta
3 months
How can we enable transferable decision-making for *any* reward zero-shot? MBRL is task-agnostic but suffers from compounding error, while MFRL is task-specific. We propose a new class of world models that transfers across tasks zero-shot and avoids compounding error! A 🧵 (1/9)
1
11
52
@abhishekunique7
Abhishek Gupta
2 years
Excited about our work on understanding the benefits of reward shaping! Reward shaping is critical in a large portion of practical RL problems and this paper tries to understand when and why it helps. Terrific collaboration with @aldopacchiano Simon Zhai @svlevine @ShamKakade6!
@svlevine
Sergey Levine
2 years
In theory RL is intractable w/o exploration bonuses. In practice, we rarely use them. What's up with that? Critical to practical RL is reward shaping, but there is little theory about it. Our new paper analyzes sample complexity w/ shaped rewards: Thread:
0
5
51
@abhishekunique7
Abhishek Gupta
6 months
While investigating RLHF methods last year, @sriyash__ and @yanming_wan noted that human annotators in a population often display diverse and conflicting preferences. While typical RLHF methods struggle with this diversity, we developed new techniques for pluralistic RLHF! 🧵 (1/7)
1
10
46
@abhishekunique7
Abhishek Gupta
3 years
Tried to share some tips on faculty applications, do take a listen if you're thinking of applying. Hope it can be helpful! Thanks for having me @talkingrobotics!
@talkingrobotics
Talking Robotics
3 years
"Start writing your research statement in the summer." Abhishek Gupta provided the BEST ADVICE if you are preparing for the #academic #job #market. This talk has TONS OF TIPS from his own experience in the job market last year. Listen now (links below). @uw @uw_robotics @uwcse
2
4
45
@abhishekunique7
Abhishek Gupta
5 years
Presenting "Ingredients of Real World Robotic RL" at ICLR 2020, 4/26 10pm-12am PST & 4/27 5am-7am PST. Blog: Paper: Descriptive Video: Poster livestream:
1
12
37
@abhishekunique7
Abhishek Gupta
11 months
Excited to share a new large-scale dataset for in-the-wild robotic learning! It was an honestly eye-opening experience for our whole group to be a part of this. Thanks to @SashaKhazatsky, @KarlPertsch and the rest of the team for putting together an amazing dataset! 🤖
@SashaKhazatsky
Alexander Khazatsky
11 months
After two years, it is my pleasure to introduce “DROID: A Large-Scale In-the-Wild Robot Manipulation Dataset”. DROID is the most diverse robotic interaction dataset ever released, including 385 hours of data collected across 564 diverse scenes in real-world households and offices
1
1
39
@abhishekunique7
Abhishek Gupta
2 years
I’m very excited about work led by @avivnet at #ICLR2023 on learning deep control policies that can extrapolate using a transductive approach. We show how we can get neural network policies to extrapolate without significant domain-specific assumptions. A 🧵 to explain how: (1/6)
1
3
37
@abhishekunique7
Abhishek Gupta
4 years
Some cool new updated results for offline pre-training followed by online fine-tuning with AWAC (advantage-weighted actor-critic). Offline RL does cool things on robots!
@svlevine
Sergey Levine
4 years
How can we get robots to solve complex tasks with RL? Pretrain with *offline* RL using prior data, and then finetune with *online* RL! In our updated paper on AWAC (advantage-weighted actor-critic), we describe a new set of robot experiments: thread ->
0
6
36
@abhishekunique7
Abhishek Gupta
3 years
Excited to share our work on benchmarking reset-free RL. We hope this presents a way to go beyond the standard episodic assumptions made in robotic RL, making it practical for the real world!
@archit_sharma97
Archit Sharma
3 years
Embodied agents such as humans and robots live in a continual non-episodic world. Why do we continue to develop RL algorithms in episodic settings? This discrepancy also presents a practical challenge -- algorithms rely on extrinsic interventions (often humans) to learn.
1
1
35
@abhishekunique7
Abhishek Gupta
3 years
Excited to share a new blog post on our work on learning informative rewards for RL! By considering a more tractable class of outcome driven RL problems and a particular choice of uncertainty aware classifier, we learn more informative reward functions
3
7
31
@abhishekunique7
Abhishek Gupta
11 months
Exciting to see what @pabbeel, Anusha Nagabandi, @clavera_i, @CarlosFlorensa, Nikhil Mishra and other friends at Covariant have been up to!
@CovariantAI
Covariant
11 months
Today, we are introducing RFM-1, our Robotics Foundation Model giving robots human-like reasoning capabilities.
0
5
33
@abhishekunique7
Abhishek Gupta
2 years
I remember when I was first starting to work on dexterous hands, we were thinking about how to find and grasp objects in the dark with touch sensing. Here are our initial attempts at this problem.
1
2
32
@abhishekunique7
Abhishek Gupta
8 days
@harshit_sikchi When one starts to feel the AGI😁.
0
0
32
@abhishekunique7
Abhishek Gupta
6 months
MIT covering some of our work! Led by @marceltornev along with @pulkitology @anthonysimeono_ @taochenshh and others. Give it a read :).
@MIT_CSAIL
MIT CSAIL
6 months
To automate time-consuming tasks like household chores, robots must be precise & robust for very specific environments. With MIT’s “RialTo” method, users can scan their surroundings w/their phone so a robot can practice in a digital twin environment. This novel
0
2
24
@abhishekunique7
Abhishek Gupta
3 years
Yay! Very well deserved @pabbeel!
@IEEEAwards
IEEE Awards
3 years
Congratulations to @UCBerkeley’s Pieter Abbeel (@pabbeel) on receiving the 2022 @IEEEorg Kiyo Tomiyasu Award, sponsored by the late Dr. Kiyo Tomiyasu, @IEEE_GRSS, and @IEEEMTT, for contributions to #DeepLearning for #Robotics: #IEEEAwards2022 #IEEETFAs
1
0
24
@abhishekunique7
Abhishek Gupta
9 months
I'm unfortunately not at @iclr_conf, but our group and collaborators are presenting 4 papers this year! Come meet the awesome students presenting this work :) A 🧵 (1/5).
1
1
23
@abhishekunique7
Abhishek Gupta
3 months
Some of our most exciting work on new ways to do world modeling and zero-shot transfer! This work is important in reimagining what a generalizable world model looks like beyond autoregressive prediction. Check out @chuning_zhu's thread for details.
@chuning_zhu
Chuning Zhu
3 months
How can we train RL agents that transfer to any reward? In our @NeurIPSConf paper DiSPO, we propose to learn the distribution of successor features of a stationary dataset, which enables zero-shot transfer to arbitrary rewards without additional training! A thread 🧵(1/9)
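The zero-shot transfer mechanic rests on successor features: if rewards are linear in state features, r(s) = φ(s)·w, then a policy's return is just its successor features ψ dotted with w, so a new reward needs no retraining. A toy numpy sketch with illustrative values (not the paper's method details):

```python
import numpy as np

def zero_shot_value(psi, w):
    # psi: per-policy successor features (expected discounted feature
    # occupancy); w: weights of a new reward r(s) = phi(s) . w.
    # The dot product evaluates each policy on the new reward zero-shot.
    return psi @ w

# Two candidate policies; pick the better one for an unseen reward.
psi = np.array([[1.0, 0.0],   # policy A mostly collects feature 0
                [0.2, 0.9]])  # policy B mostly collects feature 1
best = int(np.argmax(zero_shot_value(psi, np.array([0.1, 1.0]))))
```

A reward that weights feature 1 heavily selects policy B; swapping the reward weights flips the choice, all without further learning.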
0
2
23
@abhishekunique7
Abhishek Gupta
1 year
Check out RoboHive - our new unified robot learning framework, tons of cool new environments, tasks, platforms. We hope this can be a helpful tool for folks in robot learning and beyond!
@Vikashplus
Vikash Kumar
1 year
📢 #𝗥𝗼𝗯𝗼𝗛𝗶𝘃𝗲 - a unified robot learning framework. ✅ Designed for the generalization-first robot-learning era ✅ Diverse (500 envs, 8 domains) ✅ Single flag for Sim<>Real ✅ TeleOp support ✅ Multi-(Skill x Task) real-world dataset ✅ pip install robohive 🧵👇
0
2
21
@abhishekunique7
Abhishek Gupta
2 years
Gave a talk on dirty laundry in RL, à la advice from @Ken_Goldberg, situated in some dexterous manipulation work. Recordings should be up soon, y’all might enjoy it :) Thanks @notmahi and the other organizers!
@notmahi
Mahi Shafiullah 🏠🤖
2 years
The first workshop on Learning Dexterous Manipulation at @RoboticsSciSys is starting now! Check out our speaker lineup at or tune in via zoom at if you are not in person.
0
1
22
@abhishekunique7
Abhishek Gupta
1 year
We hope this can be a useful tool to help use RL on your robots! Happy RL-ing. Website: Code: w/ @jianlanluo, @real_ZheyuanHu, Charles Xu, @youliangtan, @archit_sharma97, Stefan Schaal, @chelseabfinn, @svlevine (5/5)
1
1
21
@abhishekunique7
Abhishek Gupta
2 years
Excited to share work led by Max Simchowitz on principled ways to approach combinatorial generalization using bilinear embeddings. Useful under “combinatorial” distribution shift - e.g., you’ve seen blue mugs, red mugs and blue cups; what happens when you see red cups? A 🧵 (1/3)
1
2
21
@abhishekunique7
Abhishek Gupta
3 months
Max is 100% one of the smartest people I know and a fantastic mentor, go work with him!.
@max_simchowitz
Max Simchowitz
3 months
A very exciting personal update: In January, I’ll be joining @CMUMLD as tenure-track assistant professor! My lab will focus on the mathematical foundations of, and new algorithms, for decision making. This includes everything from reinforcement learning in the physical world
0
0
20
@abhishekunique7
Abhishek Gupta
3 months
Check out our new work on learning human-AI cooperation agents using generative models. Led by @liangyanchenggg and @Daphne__Chen, to be presented at #NeurIPS2024. The Overcooked game in the browser is fun to play :)
@liangyanchenggg
Yancheng Liang
3 months
🎉 Excited to release our #NeurIPS2024 paper on zero-shot human-AI cooperation. For the first time, we use generative models to sample infinite human-like training partners to train a Cooperator agent. 🔥Experience it! 🚀Check out our 𝐥𝐢𝐯𝐞 𝐝𝐞𝐦𝐨 👉
1
5
20
@abhishekunique7
Abhishek Gupta
1 year
If you're at #NeurIPS2023, check out @badsethcohen's work on generative BC! A cool look into how to realize stability guarantees for imitation learning, in theory and practice. Poster: Thu 14 Dec, 10:45 a.m.–12:45 p.m. CST, #1427. Paper:
0
2
19
@abhishekunique7
Abhishek Gupta
9 months
Real2Sim is great, exciting to see this 👏.
@XuanlinLi2
Xuanlin Li (Simon)
9 months
Scalable, reproducible, and reliable robotic evaluation remains an open challenge, especially in the age of generalist robot foundation models. Can *simulation* effectively predict *real-world* robot policy performance & behavior? Presenting SIMPLER! 👇
0
4
19
@abhishekunique7
Abhishek Gupta
2 years
These videos are incredible, congrats to @hausman_k, @TianheYu, and the team! Really exciting to see generative models provide big improvements in real micro kitchen environments. Looking forward to what's next!
@hausman_k
Karol Hausman
2 years
Our most recent work showing bitter lesson 2.0 in action: using diffusion models to augment robot data. Introducing ROSIE: Our robots can imagine new environments, objects and backgrounds! 🧵
0
1
18
@abhishekunique7
Abhishek Gupta
1 year
I'm unfortunately not at @NeurIPSConf #NeurIPS2023 this year, but luckily my excellent students and collaborators who actually did the work are! Please do visit their posters and talks and ask them very hard questions 😀 A 🧵 (1/9).
1
0
19
@abhishekunique7
Abhishek Gupta
2 years
Excited about work led by @xkelym @Zyc199539Chu @ab_deshpande! I was skeptical that we could solve these problems with RL, but they totally proved me wrong! 😄 Super interesting both from the perspective of system design and algorithmic choices! See @xkelym's 🧵 with details.
@xkelym
Kay - Liyiming Ke
2 years
Let’s do 🍒 cherry picking with reinforcement learning: 🥢 dynamic fine manipulation with chopsticks, 🤖 only 30 minutes of real-world interactions, ⛔️ too lazy for parameter tuning = off-the-shelf RL algo + default params + 3 seeds in the real world
1
0
18
@abhishekunique7
Abhishek Gupta
4 years
Some new insights on the problem of offline pretraining with online finetuning. Seems to work pretty well! Code is out too. @ashvinair @svlevine @mihdalal
0
2
19
@abhishekunique7
Abhishek Gupta
6 months
An excellent piece by @stefan_milne covering some of our recent work pushing the paradigm of real-to-sim-to-real for scalable robot training, led by @ZoeyC17 @marceltornev and many others across @uwcse and @MITCSAIL! Give it a read :).
0
4
19
@abhishekunique7
Abhishek Gupta
4 years
A little video I made explaining our ICRA 2021 work on reset-free reinforcement learning for dexterous manipulation. Paper at
@svlevine
Sergey Levine
4 years
Want to know how robots can learn to give you a hand with your NeurIPS submissions? So do I. In the meantime, you can check out @abhishekunique7's ICRA 2021 talk, how to train robotic hands to do lots of other stuff🙂from scratch, in the real world.
0
5
17
@abhishekunique7
Abhishek Gupta
1 year
Our work on continual reinforcement learning that gets more and more efficient as it encounters more tasks is at CoRL 2023 this year. Come check out our poster on Nov 9, from 2:45-3:30 pm!
@abhishekunique7
Abhishek Gupta
1 year
Check out work led by Zheyuan Hu and Aaron Rovinsky on how robot learning can get *more* efficient as it encounters more tasks! This was a pretty awesome exercise in system building and we learned a lot about making continual learning systems for real world dexterous robots.
0
1
17
@abhishekunique7
Abhishek Gupta
2 years
Excited to share our work on leveraging text2image generative models for data augmentation for robot learning! We leverage these models to generate a huge diversity of realistic scenes from very minimal on-robot data, which enables pretty cool generalization! Thread by @ZoeyC17.
@ZoeyC17
Zoey Chen
2 years
Need more data to train your robot in the real world? Introducing GenAug, a semantic data augmentation framework to enable broad robot generalization by leveraging pre-trained text-to-image generative models. 🧵 (1/N). Paper: Website:
1
2
18
@abhishekunique7
Abhishek Gupta
2 years
Don’t miss a chance to work with @aviral_kumar2 :) he’s an incredible advisor already and I’m looking forward to his upcoming lab!
@aviral_kumar2
Aviral Kumar
2 years
Thrilled to share that I will be joining Carnegie Mellon @SCSatCMU as an Assistant Professor of CS and ML @CSDatCMU @mldcmu in Fall 2024. Extremely thankful to my mentors & collaborators, especially @svlevine! Looking forward to working with amazing students & colleagues at CMU!.
0
1
18
@abhishekunique7
Abhishek Gupta
1 year
People perform tasks at varying levels of suboptimality, typically because of constrained computational budgets. Most modeling frameworks don't account for this. We model agents with varying levels of rationality using latent inference budgets! See @apjacob03's 🧵 for more!
@apjacob03
Athul Paul Jacob
1 year
⭐️ New Paper ⭐️ We introduce latent inference budget models (L-IBMs), a family of approaches for modeling how agents plan subject to computational constraints. Paper: 🧵👇 (1/11)
1
2
15
@abhishekunique7
Abhishek Gupta
1 year
Check out work led by Zheyuan Hu and Aaron Rovinsky on how robot learning can get *more* efficient as it encounters more tasks! This was a pretty awesome exercise in system building and we learned a lot about making continual learning systems for real world dexterous robots.
@svlevine
Sergey Levine
1 year
Can we get dexterous hands to learn efficiently from images entirely in the real world? With a combo of learned rewards, sample-efficient RL, and initialization from data of other tasks, robots can learn skills autonomously in a matter of hours: A 🧵👇
0
2
17
@abhishekunique7
Abhishek Gupta
4 years
@uwcse @berkeley_ai @svlevine @pabbeel In the meanwhile, I will be spending a year at @MIT_CSAIL as a post-doc working with Russ Tedrake and @pulkitology. Looking forward to a fun collaboration!.
0
0
16
@abhishekunique7
Abhishek Gupta
3 years
Yay reset-free RL :) love this task setup!
@svlevine
Sergey Levine
3 years
Don't Start From Scratch: good advice for ML with big models! Also good advice for robots with reset-free training: ARIEL allows robots to learn a new task with offline RL pretraining + online RL w/ forward and backward policy to automate resets. Thread:
0
0
16
@abhishekunique7
Abhishek Gupta
4 years
Exciting news, cannot think of anyone more deserving! Congratulations :).
@hausman_k
Karol Hausman
4 years
Super excited to announce that I've started as an Adjunct Professor @Stanford! I'll continue to work @GoogleAI but I'll also be spending some time at Stanford, where I'll be co-advising a few students and continue co-teaching CS 330 🧑‍🏫
1
1
16
@abhishekunique7
Abhishek Gupta
1 year
Pre-trained visual representations are effective features, but @ZCCZHANG shows that they can also be used for identification of subgoals directly from long-horizon video behavior. Allows for improvements in both imitation and RL in sim and on robots. 🧵 by @ZCCZHANG for more!
@ZCCZHANG
Zichen "Charles" Zhang
1 year
How can pre-trained visual representations help solve long-horizon manipulation? 🤔. Introducing Universal Visual Decomposer (UVD), an off-the-shelf method for identifying subgoals from videos - NO extra data, training, cost, or task knowledge required. (🧵1/n)
1
4
15
@abhishekunique7
Abhishek Gupta
1 year
@breadli428 @zipengfu @tonyzzhao @chelseabfinn How are we defining robust? And I’m curious to understand what about behavior cloning is giving us robustness. If anything I would expect BC to absolutely *not* be robust, so I’m surprised. In a weird way, I actually expect “low” quality data to provide more robustness.
1
0
13
@abhishekunique7
Abhishek Gupta
1 year
@natolambert Although in my experience things that are high visibility on Twitter have a somewhat loose correlation to high quality research :) and so yes you get signal, but it is often misleading. Just my 2 cents.
1
0
14
@abhishekunique7
Abhishek Gupta
8 months
Read the paper to see what makes it tick; lots of little details in there. Fun work led by @xkelym, @yunchuzh, @ab_deshpande, Quinn Pfeifer, with @siddhss5! Paper: (robotics), (algorithmic). Website: (9/9)
1
0
14
@abhishekunique7
Abhishek Gupta
3 years
We are presenting some fun new work at NeurIPS tomorrow (8:30-10am PT Dec 7): Adaptive Risk Minimization by Marvin Zhang, Henrik Marklund; Autonomous RL via Subgoal Curricula by @archit_sharma97 (1/2)
1
0
14
@abhishekunique7
Abhishek Gupta
1 year
Pretty clever and simple, great results! :).
@xiao_ted
Ted Xiao
1 year
Instead of just telling robots “what to do”, can we also guide robots by telling them “how to do” tasks? Unveiling RT-Trajectory, our new work which introduces trajectory conditioned robot policies. These coarse trajectory sketches help robots generalize to novel tasks! 🧵⬇️
0
3
14
@abhishekunique7
Abhishek Gupta
11 months
Incredible projects from @sanjibac and the whole team! Massive respect for pulling this off :).
@sanjibac
Sanjiban Choudhury
11 months
Cooking in kitchens is fun. BUT doing it collaboratively with two robots is even more satisfying! We introduce MOSAIC, a modular framework that coordinates multiple robots to closely collaborate and cook with humans via natural language interaction and a repository of skills.
1
0
14
@abhishekunique7
Abhishek Gupta
2 years
Yay @chuning_zhu! Well deserved :).
@uwcse
Allen School
2 years
#UWAllen Ph.D. student @chuning_zhu was named an Amazon Fellow, while prof. @SimonShaoleiDu earned a faculty research award through the @UW + @AmazonScience Hub. Both work on reinforcement learning to enable robots to perform various tasks in dynamic environments.🤖 #ThisIsUW 2/2
0
0
13
@abhishekunique7
Abhishek Gupta
9 months
@ZoeyC17 Really a huge undertaking by @ZoeyC17, kudos to her and the rest of the team! We learned a lot from this project, hope it's a useful tool. Catch this work at #RSS2024! (8/8). Paper: Website: Code:
0
3
13
@abhishekunique7
Abhishek Gupta
4 years
@uwcse @berkeley_ai @svlevine @pabbeel I am looking forward to pushing the limits of real world robotic learning with great students and postdocs, please do consider applying to UW to work with us!
2
0
12
@abhishekunique7
Abhishek Gupta
2 years
This was fun work led by @hjterrysuh in collaboration with Glen Chou, Hongkai Dai, Lujie Yang, Russ Tedrake. Check out the paper here (5/5). I personally learned a lot from Terry in this project and I think there's a lot to explore here!.
0
0
11
@abhishekunique7
Abhishek Gupta
8 months
We propose CCIL to train local Lipschitz continuous dynamics models on expert data, generate corrective labels and augment the dataset to be corrective. This lets us do better than BC because the behavior is corrective on OOD drift, enabling robust, generalizable policies! (6/9)
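A deliberately tiny illustration of the corrective-label construction (assuming trivially invertible 1-D dynamics s' = s + a; CCIL instead solves this locally with the learned Lipschitz-continuous model): perturb an expert state, then label it with the action the model predicts will land back on the expert's next state.

```python
def corrective_label(s_expert, s_next_expert, eps):
    # Perturbed state the policy might drift to at test time.
    s_pert = s_expert + eps
    # Action that, under the toy dynamics s' = s + a, lands back on the
    # expert's next state -- the synthesized corrective label.
    a_corr = s_next_expert - s_pert
    return s_pert, a_corr

# Augmenting the dataset with (s_pert, a_corr) pairs teaches the policy
# to recover from off-distribution drift without an interactive oracle.
s_pert, a_corr = corrective_label(1.0, 2.0, eps=0.3)
```

Applying the labeled action from the perturbed state returns exactly to the expert's next state in this toy setting, which is the property the augmented data is meant to instill.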
1
1
11
@abhishekunique7
Abhishek Gupta
1 year
Our framework, SERL, aims to fulfill two key needs: (1) an out-of-the-box software stack for getting RL running on robots, (2) putting together the best pieces for algorithms, rewards and resets to enable sample-efficient RL, minimizing environment instrumentation. (2/5)
1
0
11
@abhishekunique7
Abhishek Gupta
2 years
@GlenBerseth Hmm, I think that exploration in RL is still a problem. Depends on the domain. Just because something collects data for you does not mean that 1) data is free or 2) the process of collecting data is convergent in any reasonable time. Thoughts?.
2
0
11
@abhishekunique7
Abhishek Gupta
3 years
Very cool work from @mendonca_rl @_oleh @pathak2206 and co. Very impressed with the capabilities in the kitchen domain especially! :).
@mendonca_rl
Russell Mendonca
3 years
Really excited to share our #NeurIPS2021 paper LEXA, that can reach diverse goals in challenging image-based environments. It explores beyond the frontier of known states by imagining in the latent space of a learned world model to discover interesting goals.
0
2
11
@abhishekunique7
Abhishek Gupta
3 years
In particular, I’m super interested in both practical and algorithmic aspects of RL, ranging from things like human supervision in RL to model-based RL to fast adaptation and continually improving agents deployed in the real world plus a lot more that won’t fit in a tweet! 2/3.
@abhishekunique7
Abhishek Gupta
3 years
Congrats @pabbeel! Been an amazing journey and a privilege to work with you 😊.
@TheOfficialACM
Association for Computing Machinery
3 years
Our warmest congratulations to Pieter Abbeel @pabbeel, recipient of the 2021 #ACMPrize for contributions to robot learning, including learning from demonstrations and deep reinforcement learning for robotic control. Learn more about Abbeel’s work here:
@abhishekunique7
Abhishek Gupta
2 years
@uwcse We are excited about getting robots to learn in the real world, and building all the tools that are needed to do so - RL, imitation learning, meta and multi-task learning, offline learning and finetuning, reward specification, continual learning and many more. (2/3).
@abhishekunique7
Abhishek Gupta
1 year
This was a fun collaboration with Max Balsells, @marceltornev, Zihan Wang, Samedh Desai, @pulkitology across UW and MIT. We will be at CoRL next week presenting this work! (5/5). Paper: Project website:
@abhishekunique7
Abhishek Gupta
3 years
@GlenBerseth Maybe the question is what you mean by "real-world". Arguably many of the deep RL applications, particularly in robotics, are limited to lab settings, and not what one would characterize as "real-world", right?
@abhishekunique7
Abhishek Gupta
3 years
If any of these sound cool to you, please do apply to work with me at UW here. Deadline is Dec 15th; very excited about working with awesome students. We have an awesome group of faculty and students at UW, it’s going to be a blast! 3/3 @uwcse @uw_robotics
@abhishekunique7
Abhishek Gupta
8 months
We can learn dynamics functions with local continuity enforced through spectral norm regularization. This yields models that can generate OOD corrective labels despite only seeing expert data. This allows policies to correct compounding error, with no extra expert data (5/9)
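The spectral norm of a layer's weight matrix bounds how much that layer can stretch its input, so penalizing it softly enforces local Lipschitz continuity. A minimal numpy sketch, using power iteration to estimate the spectral norm and a hypothetical `regularized_loss` to show where the penalty would enter training (not the paper's implementation):

```python
import numpy as np

def spectral_norm(W, n_iters=50):
    """Estimate the largest singular value of W via power iteration."""
    u = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    # u and v are (approximate) top singular vectors, so u^T W v ≈ sigma_max.
    return float(u @ W @ v)

def regularized_loss(pred, target, W, lam=0.1):
    """Dynamics-model MSE plus a spectral-norm penalty on the weights,
    softly encouraging local Lipschitz continuity (illustrative names)."""
    mse = float(np.mean((pred - target) ** 2))
    return mse + lam * spectral_norm(W)
```

In practice one would apply this per layer of the dynamics network (frameworks like PyTorch ship a built-in `spectral_norm` parametrization for exactly this).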
@abhishekunique7
Abhishek Gupta
11 months
@marceltornev I’m really proud of @marceltornev and team - @anthonysimeono_, Zechu Li, April Chan, @taochenshh, @pulkitology - for putting together a really cool project! Paper: Website: Code: Coming Soon 🙂 (12/12). Hope you enjoy reading!
@abhishekunique7
Abhishek Gupta
2 years
@uwcse We are part of an extremely collaborative and diverse community @uwcse, and it’s a lot of fun collaborating with researchers across robotics, theory, vision, NLP and many others to push the frontier of robot learning. Apply here by Dec 15th (3/3).
@abhishekunique7
Abhishek Gupta
5 years
New work led by JD and Suvansh studying what properties of an environment can make sparse-reward, non-episodic learning easier. We find that highly dynamic environments, or environments with natural “environment shaping”, can help! @svlevine @GlenBerseth
@abhishekunique7
Abhishek Gupta
2 months
As with most faculty, I do very little besides vspace edits and my excellent students do all the work, so please ask them all your difficult questions! I'll be at #NeurIPS2024 from Tue - Sat, email/DM me if you want to chat about robotics, RL, UW or anything in between! (6/6).
@abhishekunique7
Abhishek Gupta
4 years
Sharing some exciting new work on inferring rewards for reinforcement learning from examples of successful outcomes using uncertainty-aware classifiers. Some cool new ideas from the normalized maximum likelihood literature along with meta-learning!
@svlevine
Sergey Levine
4 years
Can we devise a more tractable RL problem if we give the agent examples of successful outcomes (states, not demos)? In MURAL, we show that uncertainty-aware classifiers trained with (meta) NML make RL much easier. At #ICML2021 . A (short) thread:
@abhishekunique7
Abhishek Gupta
1 year
@breadli428 @zipengfu @tonyzzhao @chelseabfinn Right, but in some ways every method besides behavior cloning gives you this more naturally, e.g. IRL/RL/LQR :) In a combinatorially large world, BC will require a combinatorially large amount of “corrective” coverage data from humans to correct for this deviation.
@abhishekunique7
Abhishek Gupta
1 year
@ZhangWeiHong9 This simple drop-in replacement lets you do better with various offline RL methods, simply by reweighting the objective. A fun project led by @ZhangWeiHong9 with @aviral_kumar2, @pulkitology, and many others (5/5). Paper:
@abhishekunique7
Abhishek Gupta
4 years
We are presenting this at #ICML2021 as a spotlight talk from 7:35-7:40 PST and during the poster session from 8-11pm PST. Come chat with us to learn more!
@svlevine
Sergey Levine
4 years
Can we devise a more tractable RL problem if we give the agent examples of successful outcomes (states, not demos)? In MURAL, we show that uncertainty-aware classifiers trained with (meta) NML make RL much easier. At #ICML2021 . A (short) thread: