abhishekunique7 Profile Banner
Abhishek Gupta Profile
Abhishek Gupta

@abhishekunique7

Followers
7K
Following
721
Media
135
Statuses
417

Assistant Professor at University of Washington. I like robots, and reinforcement learning. Previously: post-doc at MIT, PhD at Berkeley

Seattle, WA
Joined February 2012
@abhishekunique7
Abhishek Gupta
4 years
Thrilled to share that I will be starting as an assistant professor at the University of Washington @uwcse in Fall 2022! Grateful for wonderful mentors and collaborators at @berkeley_ai, especially @svlevine and @pabbeel. Looking forward to joining the wonderful folks @uwcse!
33
25
463
@abhishekunique7
Abhishek Gupta
1 year
Anyone who knows me knows I love real world RL :) But anyone who works on real-world RL knows it’s quite a pain to get going. We tried to make everyone’s life easier by writing a software suite to get you going with real world RL out of the box, without all the pain! A 🧵(1/5)
3
32
253
@abhishekunique7
Abhishek Gupta
8 months
So I hear that behavior cloning is all the rage now. What if we could do better, but with the same data? :) In CCIL, we show that imitation via BC is improved by synthesizing corrective labels to account for compounding error, without interactive oracles. Lets you do 👇! 🧵(1/9)
5
39
255
@abhishekunique7
Abhishek Gupta
2 years
I am recruiting PhD students to join us in the Washington Embodied Intelligence and Robotic Development Lab (WEIRD) at @uwcse. We work on robot learning, especially RL in the real world! Check out for details (1/3).
7
44
218
@abhishekunique7
Abhishek Gupta
1 year
Imagine this: you drop your robot in an environment, connect it to the internet and come back 10 hours later, and it has learned to solve tasks in the real world, autonomously, with no effort from you! We enable this in our work -Guided Exploration for Autonomous RL (GEAR)🧵(1/5)
3
23
180
@abhishekunique7
Abhishek Gupta
2 years
Excited to share our work on uncertainty estimation using diffusion/score matching! The idea is simple: offline optimization (e.g., model-based RL, imitation) requires us to estimate uncertainty. Estimating uncertainty is hard; score matching provides a scalable solution. A 🧵 (1/5)
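A minimal 1-D sketch of the intuition, not the paper's implementation: the magnitude of the score ∇ₓ log p(x) grows as a query leaves the data, so a learned score model can flag out-of-distribution inputs during offline optimization. The Gaussian here stands in for a learned score network; all names are illustrative.

```python
import numpy as np

def gaussian_score(x, mu=0.0, sigma=1.0):
    # Score of a Gaussian fit to the data: d/dx log p(x) = -(x - mu) / sigma^2.
    # In the paper's setting, a score network trained by score matching
    # plays this role.
    return -(x - mu) / sigma**2

def uncertainty(x, mu=0.0, sigma=1.0):
    # Heuristic uncertainty: the score's magnitude is small near the data
    # mode and grows for out-of-distribution queries.
    return np.abs(gaussian_score(x, mu, sigma))
```

Under this toy model, a query at the data mode reports zero uncertainty, and uncertainty increases monotonically with distance from the data.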
2
17
172
@abhishekunique7
Abhishek Gupta
2 years
Excited to share our work on self-supervised RL by modeling random features. The key premise behind RaMP is to learn about environment dynamics, without learning a dynamics model! This allows for transfer, without accruing compounding error. A 🧵 (1/6)
1
30
160
@abhishekunique7
Abhishek Gupta
1 year
Intrigued by decision transformers, we investigated why and when we should use return-conditioned RL as an alternative to dynamic prog (DP). Our findings are neat! With data coverage, RCSL can outperform DP, but fail to "stitch" trajectories. We analyze and propose a fix. 🧵(1/N)
4
21
164
@abhishekunique7
Abhishek Gupta
9 months
Excited about @ZoeyC17's new work on real2sim for robotics! We present URDFormer, a technique to learn models that go from RGB images to full articulated scene URDFs in sim by "inverting" pre-trained generative models. These can be used to train robots for the real-world! 🧵(1/8)
2
23
144
@abhishekunique7
Abhishek Gupta
10 months
So you want to do robotics tasks requiring dynamics information in the real world, but you don’t want the pain of real-world RL? In our work to be presented as an oral at ICLR 2024, @memmelma showed how we can do this via a real-to-sim-to-real policy learning approach. A 🧵 (1/7)
1
25
137
@abhishekunique7
Abhishek Gupta
2 months
Haven't been to a conference in a while, really excited to be at #NeurIPS2024! I'll be helping present 4 of our group's recent papers: 1. Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL, 2. Distributional
2
16
124
@abhishekunique7
Abhishek Gupta
3 years
Excited to share our work on reset-free fine-tuning bootstrapped by offline data. We show results in a real-world kitchen, with a robot practicing autonomously to improve for over a day with minimal intervention! Paper: Website:
5
21
123
@abhishekunique7
Abhishek Gupta
5 years
Reinforcement learning can be significantly accelerated by using offline datasets with a simple, but carefully designed actor critic algorithm! Solves dexterous manipulation tasks in <1 hour. @ashvinair M Dalal @svlevine.
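As a rough illustration of the advantage-weighted idea behind this line of actor-critic work (a sketch, not the released algorithm): the policy update is a behavior-cloning loss reweighted by exponentiated advantages, so high-advantage actions in the offline dataset dominate the update. Function and variable names are illustrative.

```python
import numpy as np

def advantage_weights(advantages, lam=1.0):
    # Exponentiated-advantage weights, exp(A / lambda), normalized so they
    # can reweight a behavior-cloning loss toward high-advantage actions.
    w = np.exp(np.asarray(advantages, dtype=float) / lam)
    return w / w.sum()

# Toy check: higher-advantage transitions get more weight in the update.
w = advantage_weights([-1.0, 0.0, 2.0])
```

The temperature `lam` trades off between plain behavior cloning (large `lam`) and greedily imitating only the best actions (small `lam`).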
0
22
118
@abhishekunique7
Abhishek Gupta
2 years
“How can you enable your parents to train your robot?” We propose a system for enabling robot learning by hooking up a robot to the web, using noisy, occasional feedback from non-experts to guide exploration. Enables robot learning in sim and real w/out reward engineering! 🧵 (1/8)
2
21
115
@abhishekunique7
Abhishek Gupta
2 months
In my experience, robot 'generalists' are often jacks of all trades but masters of none. In training across multiple tasks and environments, robot policies fail to generalize robustly and effectively to each particular test setting. What if at test time, we non-parametrically
2
22
117
@abhishekunique7
Abhishek Gupta
11 months
Robot learning in the real world can be expensive and unsafe in human-centric environments. Solution: construct simulation on the fly and train in it! Excited to share RialTo, led by @marceltornev, on learning resilient policies via real-to-sim-to-real policy learning! A 🧵 (1/12)
2
22
115
@abhishekunique7
Abhishek Gupta
3 years
Excited to be working with all these amazing people very soon! Exciting times ahead😀 On that note I'm also hoping to recruit students this cycle to start in Fall 22. If you like ML and robotics and want to get things to work in the real world, definitely apply to UW! (1/3)
@uwcse
Allen School
3 years
Not even a pandemic could slow down #UWAllen faculty hiring. Over the past 2 cycles, we welcomed 15 (yes—15!) outstanding researchers and educators who have joined/will soon join us at @uwengineering @UW Seattle. Meet these new members of our community:
1
9
111
@abhishekunique7
Abhishek Gupta
9 months
Who doesn’t love good methods for reward inference? What if I told you that you could extract dense rewards from video, by ranking frames temporally using the BT model from RLHF (aka just doing temporal classification with cross-entropy). Let's see how, in rank2reward. A 🧵 (1/10)
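The temporal-ranking idea in the tweet can be sketched in a few lines (illustrative, not the paper's code): for a pair of frames from the same video, a Bradley-Terry model says the later frame should score higher, and the cross-entropy of that comparison trains a per-frame score that can serve as a dense reward.

```python
import numpy as np

def bt_temporal_loss(r_early, r_late):
    # Bradley-Terry cross-entropy for one frame pair:
    # P(late ranks above early) = sigmoid(r_late - r_early).
    # Minimizing this pushes the learned per-frame score (the dense
    # reward) to increase over time within a video.
    p = 1.0 / (1.0 + np.exp(-(r_late - r_early)))
    return -np.log(p)
```

With equal scores the loss is ln 2 (a coin flip); it shrinks as the later frame's score pulls ahead of the earlier one's.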
2
19
104
@abhishekunique7
Abhishek Gupta
2 years
New work from my time at MIT! We introduce Distributionally Adaptive Meta-Reinforcement Learning (DiAMetR). Meta-RL struggles when test tasks are OOD, which arguably is most of the time! We propose an algorithm resilient to distribution shift. 🧵 (1/N)
1
13
100
@abhishekunique7
Abhishek Gupta
1 year
Want to get model-based RL to work in diverse, dynamic scenes? Check out @chuning_zhu's latest work (RePo) on model-based reinforcement learning without reconstruction, where we show how to learn world models that scale to dynamic, multi-task environments. A 🧵(1/6)
5
17
94
@abhishekunique7
Abhishek Gupta
2 months
So I heard we need more data for robot learning :) Purely real world teleop is expensive and slow, making large scale data collection challenging. I’ve been excited about getting more data into robot learning, going beyond just real-world teleop data. To this end, we’ve been
1
22
86
@abhishekunique7
Abhishek Gupta
2 years
I'm truly so tired of reading reviews about "novelty". What does that even mean? #ICML2023
3
3
81
@abhishekunique7
Abhishek Gupta
1 year
Most offline RL methods try to constrain policies from deviating far from the offline data distribution. In cases where the data distribution is imbalanced or suboptimal, this makes it hard to actually learn good behavior! In new work, @ZhangWeiHong9 proposes a solution 🧵 (1/5)
4
9
80
@abhishekunique7
Abhishek Gupta
2 months
Over the last year, we’ve been investigating how simulation can be a useful tool for real-world reinforcement learning on a robot. While simulation captures inherently incorrect dynamics, it can still be useful for real-world learning! In our #NeurIPS2024 work, Andrew W.
3
18
76
@abhishekunique7
Abhishek Gupta
4 years
We've been working on getting robots to learn in the real world with many hours of autonomous reset-free RL! Key idea is to leverage multi-task RL to enable scalable learning with no human intervention. Allows learning of cool dexterous manipulation tasks in the real world!
@svlevine
Sergey Levine
4 years
After over a year of development, we're finally releasing our work on real-world dexterous manipulation: MTRF. MTRF learns complex dexterous manipulation skills *directly in the real world* via continuous and fully autonomous trial-and-error learning. Thread below ->
1
11
68
@abhishekunique7
Abhishek Gupta
5 years
Sharing two recent talks from my advisor @svlevine covering much of my recent work, as well as work from many of my colleagues. I really enjoyed watching these, they give a really cool perspective on frontiers of RL.
3
12
61
@abhishekunique7
Abhishek Gupta
4 years
New work on learning how to grasp and navigate with mobile robots using RL. What I find very exciting is the ability of the system to be trained for >60 hrs with minimal intervention, learning in diverse scenarios. Paper: Website:
4
8
58
@abhishekunique7
Abhishek Gupta
4 years
I did a podcast thing! Here's a recent interview on Applying RL to Real-World Robotics with @samcharrington for the @twimlai podcast. Check it out! via @twimlai.
0
9
59
@abhishekunique7
Abhishek Gupta
2 years
Excited to share the first of several papers toward leveraging generative models as data sources for RL! RL sees minimal data, gen models see lots of data. We show that gen models (here LLMs) can provide background info for RL common sense (here exploration)! Thread by @d_yuqing!
@d_yuqing
Yuqing Du
2 years
How can we encourage RL agents to explore human-meaningful behaviors *without* a human in the loop? @OliviaGWatkins2 and I are excited to share “Guiding Pretraining in Reinforcement Learning with LLMs”! 📜🧵 1/
1
5
56
@abhishekunique7
Abhishek Gupta
5 years
Fun blog post on our work on unsupervised meta-reinforcement learning, for doing meta-reinforcement learning without explicit human-provided task distributions! And the associated paper
0
19
47
@abhishekunique7
Abhishek Gupta
3 months
How can we enable transferable decision-making for *any* reward zero-shot? MBRL is task-agnostic but suffers from compounding error, while MFRL is task-specific. We propose a new class of world models that transfers across tasks zero-shot and avoids compounding error! A 🧵 (1/9)
1
11
52
@abhishekunique7
Abhishek Gupta
2 years
Excited about our work on understanding the benefits of reward shaping! Reward shaping is critical in a large portion of practical RL problems and this paper tries to understand when and why it helps. Terrific collaboration with @aldopacchiano Simon Zhai @svlevine @ShamKakade6!
@svlevine
Sergey Levine
2 years
In theory RL is intractable w/o exploration bonuses. In practice, we rarely use them. What's up with that? Critical to practical RL is reward shaping, but there is little theory about it. Our new paper analyzes sample complexity w/ shaped rewards: Thread:
0
5
51
@abhishekunique7
Abhishek Gupta
6 months
While investigating RLHF methods last year, @sriyash__ and @yanming_wan noted that human annotators in a population often display diverse and conflicting preferences. While typical RLHF methods struggle with this diversity, we developed new techniques for pluralistic RLHF! 🧵 (1/7)
1
10
46
@abhishekunique7
Abhishek Gupta
3 years
Tried to share some tips on faculty applications, do take a listen if you're thinking of applying. Hope it can be helpful! Thanks for having me @talkingrobotics!
@talkingrobotics
Talking Robotics
3 years
"Start writing your research statement in the summer." Abhishek Gupta provided the BEST ADVICE if you are preparing for the #academic #job #market. This talk has TONS OF TIPS from his own experience in the job market last year. Listen now (links below). @uw @uw_robotics @uwcse
2
4
45
@abhishekunique7
Abhishek Gupta
5 years
Presenting "Ingredients of Real World Robotic RL" at ICLR 2020, 4/26 10pm-12am PST & 4/27 5am-7am PST. Blog: Paper: Descriptive Video: Poster livestream:
1
12
37
@abhishekunique7
Abhishek Gupta
11 months
Excited to share a new large-scale dataset for in-the-wild robotic learning! It was an honestly eye-opening experience for our whole group to be a part of this. Thanks to @SashaKhazatsky, @KarlPertsch and the rest of the team for putting together an amazing dataset! 🤖
@SashaKhazatsky
Alexander Khazatsky
11 months
After two years, it is my pleasure to introduce “DROID: A Large-Scale In-the-Wild Robot Manipulation Dataset”. DROID is the most diverse robotic interaction dataset ever released, including 385 hours of data collected across 564 diverse scenes in real-world households and offices
1
1
39
@abhishekunique7
Abhishek Gupta
2 years
I’m very excited about work led by @avivnet at #ICLR2023 on learning deep control policies that can extrapolate using a transductive approach. We show how we can get neural network policies to extrapolate without significant domain-specific assumptions. A 🧵 to explain how: (1/6)
1
3
37
@abhishekunique7
Abhishek Gupta
4 years
Some cool new updated results for offline pre-training followed by online fine-tuning with AWAC (advantage-weighted actor-critic). Offline RL does cool things on robots!
@svlevine
Sergey Levine
4 years
How can we get robots to solve complex tasks with RL? Pretrain with *offline* RL using prior data, and then finetune with *online* RL! In our updated paper on AWAC (advantage-weighted actor-critic), we describe a new set of robot experiments: thread ->
0
6
36
@abhishekunique7
Abhishek Gupta
3 years
Excited to share our work on benchmarking reset-free RL. We hope this presents a way to go beyond the standard episodic assumptions made in robotic RL, making it practical for the real world!
@archit_sharma97
Archit Sharma
3 years
Embodied agents such as humans and robots live in a continual non-episodic world. Why do we continue to develop RL algorithms in episodic settings? This discrepancy also presents a practical challenge -- algorithms rely on extrinsic interventions (often humans) to learn.
1
1
35
@abhishekunique7
Abhishek Gupta
3 years
Excited to share a new blog post on our work on learning informative rewards for RL! By considering a more tractable class of outcome driven RL problems and a particular choice of uncertainty aware classifier, we learn more informative reward functions
3
7
31
@abhishekunique7
Abhishek Gupta
11 months
Exciting to see what @pabbeel, Anusha Nagabandi, @clavera_i, @CarlosFlorensa, Nikhil Mishra and other friends at Covariant have been up to!
@CovariantAI
Covariant
11 months
Today, we are introducing RFM-1, our Robotics Foundation Model giving robots human-like reasoning capabilities.
0
5
33
@abhishekunique7
Abhishek Gupta
2 years
I remember when I was first starting to work on dexterous hands, we were thinking about how to find and grasp objects in the dark with touch sensing. Here are our initial attempts at this problem.
1
2
32
@abhishekunique7
Abhishek Gupta
8 days
@harshit_sikchi When one starts to feel the AGI😁.
0
0
32
@abhishekunique7
Abhishek Gupta
6 months
MIT covering some of our work! Led by @marceltornev along with @pulkitology @anthonysimeono_ @taochenshh and others. Give it a read :).
@MIT_CSAIL
MIT CSAIL
6 months
To automate time-consuming tasks like household chores, robots must be precise & robust for very specific environments. With MIT’s “RialTo” method, users can scan their surroundings w/their phone so a robot can practice in a digital twin environment. This novel
0
2
24
@abhishekunique7
Abhishek Gupta
3 years
Yay! Very well deserved @pabbeel!
@IEEEAwards
IEEE Awards
3 years
Congratulations to @UCBerkeley’s Pieter Abbeel (@pabbeel) on receiving the 2022 @IEEEorg Kiyo Tomiyasu Award, sponsored by the late Dr. Kiyo Tomiyasu, @IEEE_GRSS, and @IEEEMTT, for contributions to #DeepLearning for #Robotics: #IEEEAwards2022 #IEEETFAs
1
0
24
@abhishekunique7
Abhishek Gupta
9 months
I'm unfortunately not at @iclr_conf, but our group and collaborators are presenting 4 papers this year! Come meet the awesome students presenting this work :) A 🧵 (1/5).
1
1
23
@abhishekunique7
Abhishek Gupta
3 months
Some of our most exciting work on new ways to do world modeling and zero-shot transfer! This work is important in reimagining what a generalizable world model looks like beyond autoregressive prediction. Check out @chuning_zhu's thread for details.
@chuning_zhu
Chuning Zhu
3 months
How can we train RL agents that transfer to any reward? In our @NeurIPSConf paper DiSPO, we propose to learn the distribution of successor features of a stationary dataset, which enables zero-shot transfer to arbitrary rewards without additional training! A thread 🧵(1/9)
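The zero-shot transfer mechanic rests on successor features: if rewards are linear in state features, r(s) = φ(s)·w, then a policy's return is just its successor features ψ dotted with w, so a new reward needs no retraining. A toy numpy sketch with illustrative values (not the paper's method details):

```python
import numpy as np

def zero_shot_value(psi, w):
    # psi: per-policy successor features (expected discounted feature
    # occupancy); w: weights of a new reward r(s) = phi(s) . w.
    # The dot product evaluates each policy on the new reward zero-shot.
    return psi @ w

# Two candidate policies; pick the better one for an unseen reward.
psi = np.array([[1.0, 0.0],   # policy A mostly collects feature 0
                [0.2, 0.9]])  # policy B mostly collects feature 1
best = int(np.argmax(zero_shot_value(psi, np.array([0.1, 1.0]))))
```

A reward that weights feature 1 heavily selects policy B; swapping the reward weights flips the choice, all without further learning.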
0
2
23
@abhishekunique7
Abhishek Gupta
1 year
Check out RoboHive - our new unified robot learning framework, tons of cool new environments, tasks, platforms. We hope this can be a helpful tool for folks in robot learning and beyond!
@Vikashplus
Vikash Kumar
1 year
📢 #𝗥𝗼𝗯𝗼𝗛𝗶𝘃𝗲 - a unified robot learning framework. ✅ Designed for the generalization-first robot-learning era ✅ Diverse (500 envs, 8 domains) ✅ Single flag for Sim<>Real ✅ TeleOp support ✅ Multi-(Skill x Task) real-world dataset ✅ pip install robohive 🧵👇
0
2
21
@abhishekunique7
Abhishek Gupta
2 years
Gave a talk on dirty laundry in RL, à la advice from @Ken_Goldberg, situated in some dexterous manipulation work. Recordings should be up soon, y’all might enjoy it :) Thanks @notmahi and the other organizers!
@notmahi
Mahi Shafiullah 🏠🤖
2 years
The first workshop on Learning Dexterous Manipulation at @RoboticsSciSys is starting now! Check out our speaker lineup at or tune in via zoom at if you are not in person.
0
1
22
@abhishekunique7
Abhishek Gupta
1 year
We hope this can be a useful tool to help use RL on your robots! Happy RL-ing. Website: Code: w/ @jianlanluo, @real_ZheyuanHu, Charles Xu, @youliangtan, @archit_sharma97, Stefan Schaal, @chelseabfinn, @svlevine (5/5)
1
1
21
@abhishekunique7
Abhishek Gupta
2 years
Excited to share work led by Max Simchowitz on principled ways to approach combinatorial generalization using bilinear embeddings. Useful under “combinatorial” distribution shift - e.g., you’ve seen blue mugs, red mugs and blue cups; what happens when you see red cups? A 🧵 (1/3)
1
2
21
@abhishekunique7
Abhishek Gupta
3 months
Max is 100% one of the smartest people I know and a fantastic mentor, go work with him!.
@max_simchowitz
Max Simchowitz
3 months
A very exciting personal update: In January, I’ll be joining @CMUMLD as tenure-track assistant professor! My lab will focus on the mathematical foundations of, and new algorithms, for decision making. This includes everything from reinforcement learning in the physical world
0
0
20
@abhishekunique7
Abhishek Gupta
3 months
Check out our new work on learning human-AI cooperation agents using generative models. Led by @liangyanchenggg and @Daphne__Chen, to be presented at #NeurIPS2024. The Overcooked game in the browser is fun to play :)
@liangyanchenggg
Yancheng Liang
3 months
🎉 Excited to release our #NeurIPS2024 paper on zero-shot human-AI cooperation. For the first time, we use generative models to sample infinite human-like training partners to train a Cooperator agent. 🔥Experience it! 🚀Check out our 𝐥𝐢𝐯𝐞 𝐝𝐞𝐦𝐨 👉
1
5
20
@abhishekunique7
Abhishek Gupta
1 year
If you're at #NeurIPS2023, check out @badsethcohen's work on generative BC! A cool look into how to realize stability guarantees for imitation learning, in theory and practice. Poster: Thu 14 Dec, 10:45 a.m.–12:45 p.m. CST, #1427. Paper:
0
2
19
@abhishekunique7
Abhishek Gupta
9 months
Real2Sim is great, exciting to see this 👏.
@XuanlinLi2
Xuanlin Li (Simon)
9 months
Scalable, reproducible, and reliable robotic evaluation remains an open challenge, especially in the age of generalist robot foundation models. Can *simulation* effectively predict *real-world* robot policy performance & behavior? Presenting SIMPLER! 👇
0
4
19
@abhishekunique7
Abhishek Gupta
2 years
These videos are incredible, congrats to @hausman_k, @TianheYu, and the team! Really exciting to see generative models provide big improvements in real micro kitchen environments. Looking forward to what's next!
@hausman_k
Karol Hausman
2 years
Our most recent work showing bitter lesson 2.0 in action: using diffusion models to augment robot data. Introducing ROSIE: Our robots can imagine new environments, objects and backgrounds! 🧵
0
1
18
@abhishekunique7
Abhishek Gupta
1 year
I'm unfortunately not at @NeurIPSConf #NeurIPS2023 this year, but luckily my excellent students and collaborators who actually did the work are! Please do visit their posters and talks and ask them very hard questions 😀 A 🧵 (1/9).
1
0
19
@abhishekunique7
Abhishek Gupta
2 years
Excited about work led by @xkelym @Zyc199539Chu @ab_deshpande! I was skeptical that we could solve these problems with RL, but they totally proved me wrong! 😄 Super interesting both from the perspective of system design and algorithmic choices! See @xkelym's 🧵 with details.
@xkelym
Kay - Liyiming Ke
2 years
Let’s do 🍒 cherry picking with reinforcement learning: 🥢 dynamic fine manipulation with chopsticks, 🤖 only 30 minutes of real-world interactions, ⛔️ too lazy for parameter tuning = off-the-shelf RL algo + default params + 3 seeds in the real world
1
0
18
@abhishekunique7
Abhishek Gupta
4 years
Some new insights on the problem of offline pretraining with online finetuning. Seems to work pretty well! Code is out too. @ashvinair @svlevine @mihdalal
0
2
19
@abhishekunique7
Abhishek Gupta
6 months
An excellent piece by @stefan_milne covering some of our recent work pushing the paradigm of real-to-sim-to-real for scalable robot training, led by @ZoeyC17 @marceltornev and many others across @uwcse and @MITCSAIL! Give it a read :).
0
4
19
@abhishekunique7
Abhishek Gupta
4 years
A little video I made explaining our ICRA 2021 work on reset-free reinforcement learning for dexterous manipulation. Paper at
@svlevine
Sergey Levine
4 years
Want to know how robots can learn to give you a hand with your NeurIPS submissions? So do I. In the meantime, you can check out @abhishekunique7's ICRA 2021 talk, how to train robotic hands to do lots of other stuff🙂from scratch, in the real world.
0
5
17
@abhishekunique7
Abhishek Gupta
1 year
Our work on continual reinforcement learning that gets more and more efficient as it encounters more tasks is at CoRL 2023 this year. Come check out our poster on Nov 9, from 2:45-3:30 pm!
@abhishekunique7
Abhishek Gupta
1 year
Check out work led by Zheyuan Hu and Aaron Rovinsky on how robot learning can get *more* efficient as it encounters more tasks! This was a pretty awesome exercise in system building and we learned a lot about making continual learning systems for real world dexterous robots.
0
1
17
@abhishekunique7
Abhishek Gupta
2 years
Excited to share our work on leveraging text2image generative models for data augmentation for robot learning! We leverage these models to generate a huge diversity of realistic scenes from very minimal on-robot data, which enables pretty cool generalization! Thread by @ZoeyC17.
@ZoeyC17
Zoey Chen
2 years
Need more data to train your robot in the real world? Introducing GenAug, a semantic data augmentation framework to enable broad robot generalization by leveraging pre-trained text-to-image generative models. 🧵 (1/N). Paper: Website:
1
2
18
@abhishekunique7
Abhishek Gupta
2 years
Don’t miss a chance to work with @aviral_kumar2 :) he’s an incredible advisor already and I’m looking forward to his upcoming lab!
@aviral_kumar2
Aviral Kumar
2 years
Thrilled to share that I will be joining Carnegie Mellon @SCSatCMU as an Assistant Professor of CS and ML @CSDatCMU @mldcmu in Fall 2024. Extremely thankful to my mentors & collaborators, especially @svlevine! Looking forward to working with amazing students & colleagues at CMU!.
0
1
18
@abhishekunique7
Abhishek Gupta
1 year
People perform tasks at varying levels of suboptimality, typically because of constrained computational budgets. Most modeling frameworks don't account for this. We model agents with varying levels of rationality using latent inference budgets! See @apjacob03's 🧵 for more!
@apjacob03
Athul Paul Jacob
1 year
⭐️ New Paper ⭐️ We introduce latent inference budget models (L-IBMs), a family of approaches for modeling how agents plan subject to computational constraints. Paper: 🧵👇 (1/11)
1
2
15
@abhishekunique7
Abhishek Gupta
1 year
Check out work led by Zheyuan Hu and Aaron Rovinsky on how robot learning can get *more* efficient as it encounters more tasks! This was a pretty awesome exercise in system building and we learned a lot about making continual learning systems for real world dexterous robots.
@svlevine
Sergey Levine
1 year
Can we get dexterous hands to learn efficiently from images entirely in the real world? With a combo of learned rewards, sample-efficient RL, and initialization from data of other tasks, robots can learn skills autonomously in a matter of hours: A 🧵👇
0
2
17
@abhishekunique7
Abhishek Gupta
4 years
@uwcse @berkeley_ai @svlevine @pabbeel In the meanwhile, I will be spending a year at @MIT_CSAIL as a post-doc working with Russ Tedrake and @pulkitology. Looking forward to a fun collaboration!.
0
0
16
@abhishekunique7
Abhishek Gupta
3 years
Yay reset-free RL :) love this task setup!
@svlevine
Sergey Levine
3 years
Don't Start From Scratch: good advice for ML with big models! Also good advice for robots with reset-free training: ARIEL allows robots to learn a new task with offline RL pretraining + online RL w/ forward and backward policy to automate resets. Thread:
0
0
16
@abhishekunique7
Abhishek Gupta
4 years
Exciting news, cannot think of anyone more deserving! Congratulations :).
@hausman_k
Karol Hausman
4 years
Super excited to announce that I've started as an Adjunct Professor @Stanford! I'll continue to work @GoogleAI but I'll also be spending some time at Stanford, where I'll be co-advising a few students and continue co-teaching CS 330 🧑‍🏫
1
1
16
@abhishekunique7
Abhishek Gupta
1 year
Pre-trained visual representations are effective features, but @ZCCZHANG shows that they can also be used for identification of subgoals directly from long-horizon video behavior. Allows for improvements in both imitation and RL in sim and on robots. 🧵 by @ZCCZHANG for more!
@ZCCZHANG
Zichen "Charles" Zhang
1 year
How can pre-trained visual representations help solve long-horizon manipulation? 🤔. Introducing Universal Visual Decomposer (UVD), an off-the-shelf method for identifying subgoals from videos - NO extra data, training, cost, or task knowledge required. (🧵1/n)
1
4
15
@abhishekunique7
Abhishek Gupta
1 year
@breadli428 @zipengfu @tonyzzhao @chelseabfinn How are we defining robust? And I’m curious to understand what about behavior cloning is giving us robustness. If anything I would expect BC to absolutely *not* be robust, so I’m surprised. In a weird way, I actually expect “low” quality data to provide more robustness.
1
0
13
@abhishekunique7
Abhishek Gupta
1 year
@natolambert Although in my experience things that are high visibility on Twitter have a somewhat loose correlation to high quality research :) and so yes you get signal, but it is often misleading. Just my 2 cents.
1
0
14
@abhishekunique7
Abhishek Gupta
8 months
Read the paper to see what makes it tick; lots of little details in there. Fun work led by @xkelym, @yunchuzh, @ab_deshpande, Quinn Pfeifer, with @siddhss5! Paper: (robotics), (algorithmic). Website: (9/9)
1
0
14
@abhishekunique7
Abhishek Gupta
3 years
We are presenting some fun new work at NeurIPS tomorrow (8:30-10am PT Dec 7): Adaptive Risk Minimization by Marvin Zhang, Henrik Marklund; Autonomous RL via Subgoal Curricula by @archit_sharma97 (1/2)
1
0
14
@abhishekunique7
Abhishek Gupta
1 year
Pretty clever and simple, great results! :).
@xiao_ted
Ted Xiao
1 year
Instead of just telling robots “what to do”, can we also guide robots by telling them “how to do” tasks? Unveiling RT-Trajectory, our new work which introduces trajectory conditioned robot policies. These coarse trajectory sketches help robots generalize to novel tasks! 🧵⬇️
0
3
14
@abhishekunique7
Abhishek Gupta
11 months
Incredible projects from @sanjibac and the whole team! Massive respect for pulling this off :).
@sanjibac
Sanjiban Choudhury
11 months
Cooking in kitchens is fun. BUT doing it collaboratively with two robots is even more satisfying! We introduce MOSAIC, a modular framework that coordinates multiple robots to closely collaborate and cook with humans via natural language interaction and a repository of skills.
1
0
14
@abhishekunique7
Abhishek Gupta
2 years
Yay @chuning_zhu! Well deserved :).
@uwcse
Allen School
2 years
#UWAllen Ph.D. student @chuning_zhu was named an Amazon Fellow, while prof. @SimonShaoleiDu earned a faculty research award through the @UW + @AmazonScience Hub. Both work on reinforcement learning to enable robots to perform various tasks in dynamic environments.🤖 #ThisIsUW 2/2
0
0
13
@abhishekunique7
Abhishek Gupta
9 months
@ZoeyC17 Really a huge undertaking by @ZoeyC17, kudos to her and the rest of the team! We learned a lot from this project, hope it's a useful tool. Catch this work at #RSS2024! (8/8). Paper: Website: Code:
0
3
13
@abhishekunique7
Abhishek Gupta
4 years
@uwcse @berkeley_ai @svlevine @pabbeel I am looking forward to pushing the limits of real world robotic learning with great students and postdocs, please do consider applying to UW to work with us!
2
0
12
@abhishekunique7
Abhishek Gupta
2 years
This was fun work led by @hjterrysuh in collaboration with Glen Chou, Hongkai Dai, Lujie Yang, Russ Tedrake. Check out the paper here (5/5). I personally learned a lot from Terry in this project and I think there's a lot to explore here!.
0
0
11
@abhishekunique7
Abhishek Gupta
8 months
We propose CCIL to train local Lipschitz continuous dynamics models on expert data, generate corrective labels and augment the dataset to be corrective. This lets us do better than BC because the behavior is corrective on OOD drift, enabling robust, generalizable policies! (6/9)
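A deliberately tiny illustration of the corrective-label construction (assuming trivially invertible 1-D dynamics s' = s + a; CCIL instead solves this locally with the learned Lipschitz-continuous model): perturb an expert state, then label it with the action the model predicts will land back on the expert's next state.

```python
def corrective_label(s_expert, s_next_expert, eps):
    # Perturbed state the policy might drift to at test time.
    s_pert = s_expert + eps
    # Action that, under the toy dynamics s' = s + a, lands back on the
    # expert's next state -- the synthesized corrective label.
    a_corr = s_next_expert - s_pert
    return s_pert, a_corr

# Augmenting the dataset with (s_pert, a_corr) pairs teaches the policy
# to recover from off-distribution drift without an interactive oracle.
s_pert, a_corr = corrective_label(1.0, 2.0, eps=0.3)
```

Applying the labeled action from the perturbed state returns exactly to the expert's next state in this toy setting, which is the property the augmented data is meant to instill.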
1
1
11
@abhishekunique7
Abhishek Gupta
1 year
Our framework, SERL, aims to fulfill two key needs: (1) an out-of-the-box software stack for getting RL running on robots, (2) putting together the best pieces for algorithms, rewards and resets to enable sample-efficient RL, minimizing environment instrumentation. (2/5)
1
0
11
@abhishekunique7
Abhishek Gupta
2 years
@GlenBerseth Hmm, I think that exploration in RL is still a problem. Depends on the domain. Just because something collects data for you does not mean that 1) data is free or 2) the process of collecting data is convergent in any reasonable time. Thoughts?.
2
0
11
@abhishekunique7
Abhishek Gupta
3 years
Very cool work from @mendonca_rl @_oleh @pathak2206 and co. Very impressed with the capabilities in the kitchen domain especially! :).
@mendonca_rl
Russell Mendonca
3 years
Really excited to share our #NeurIPS2021 paper LEXA, that can reach diverse goals in challenging image-based environments. It explores beyond the frontier of known states by imagining in the latent space of a learned world model to discover interesting goals.
0
2
11
@abhishekunique7
Abhishek Gupta
3 years
In particular, I’m super interested in both practical and algorithmic aspects of RL, ranging from things like human supervision in RL to model-based RL to fast adaptation and continually improving agents deployed in the real world plus a lot more that won’t fit in a tweet! 2/3.
@abhishekunique7
Abhishek Gupta
3 years
Congrats @pabbeel! Been an amazing journey and a privilege to work with you 😊.
@TheOfficialACM
Association for Computing Machinery
3 years
Our warmest congratulations to Pieter Abbeel @pabbeel, recipient of the 2021 #ACMPrize for contributions to robot learning, including learning from demonstrations and deep reinforcement learning for robotic control. Learn more about Abbeel’s work here:
@abhishekunique7
Abhishek Gupta
2 years
@uwcse We are excited about getting robots to learn in the real world, and building all the tools that are needed to do so - RL, imitation learning, meta and multi-task learning, offline learning and finetuning, reward specification, continual learning and many more. (2/3).
@abhishekunique7
Abhishek Gupta
1 year
This was a fun collaboration with Max Balsells, @marceltornev, Zihan Wang, Samedh Desai, @pulkitology across UW and MIT. We will be at CoRL next week presenting this work! (5/5). Paper: Project website:
@abhishekunique7
Abhishek Gupta
3 years
@GlenBerseth Maybe the question is what you mean by "real-world". Arguably many of the deep RL applications, particularly in robotics, are limited to lab settings, and not what one would characterize as "real-world", right?
@abhishekunique7
Abhishek Gupta
3 years
If any of these sound cool to you, please do apply to work with me at UW here. Deadline is Dec 15th; very excited about working with awesome students. We have an awesome group of faculty and students at UW, it’s going to be a blast! 3/3 @uwcse @uw_robotics
@abhishekunique7
Abhishek Gupta
8 months
We can learn dynamics functions with local continuity enforced through spectral norm regularization. This yields models that can generate OOD corrective labels despite only seeing expert data. This allows policies to correct compounding error, with no extra expert data (5/9)
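The spectral norm of a layer's weight matrix bounds how much that layer can stretch its input, so penalizing it softly enforces local Lipschitz continuity. A minimal numpy sketch, using power iteration to estimate the spectral norm and a hypothetical `regularized_loss` to show where the penalty would enter training (not the paper's implementation):

```python
import numpy as np

def spectral_norm(W, n_iters=50):
    """Estimate the largest singular value of W via power iteration."""
    u = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    # u and v are (approximate) top singular vectors, so u^T W v ≈ sigma_max.
    return float(u @ W @ v)

def regularized_loss(pred, target, W, lam=0.1):
    """Dynamics-model MSE plus a spectral-norm penalty on the weights,
    softly encouraging local Lipschitz continuity (illustrative names)."""
    mse = float(np.mean((pred - target) ** 2))
    return mse + lam * spectral_norm(W)
```

In practice one would apply this per layer of the dynamics network (frameworks like PyTorch ship a built-in `spectral_norm` parametrization for exactly this).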
@abhishekunique7
Abhishek Gupta
11 months
@marceltornev I’m really proud of @marceltornev and team - @anthonysimeono_, Zechu Li, April Chan, @taochenshh, @pulkitology - for putting together a really cool project! Paper: Website: Code: Coming Soon 🙂 (12/12). Hope you enjoy reading!
@abhishekunique7
Abhishek Gupta
2 years
@uwcse We are part of an extremely collaborative and diverse community @uwcse, and it’s a lot of fun collaborating with researchers across robotics, theory, vision, NLP and many others to push the frontier of robot learning. Apply here by Dec 15th (3/3).
@abhishekunique7
Abhishek Gupta
5 years
New work led by JD and Suvansh studying what properties of an environment can make sparse-reward, non-episodic learning easier. We find that highly dynamic environments, or environments with natural “environment shaping”, can help! @svlevine @GlenBerseth
@abhishekunique7
Abhishek Gupta
2 months
As with most faculty, I do very little besides vspace edits and my excellent students do all the work, so please ask them all your difficult questions! I'll be at #NeurIPS2024 from Tue - Sat, email/DM me if you want to chat about robotics, RL, UW or anything in between! (6/6).
@abhishekunique7
Abhishek Gupta
4 years
Sharing some exciting new work on inferring rewards for reinforcement learning from examples of successful outcomes using uncertainty-aware classifiers. Some cool new ideas from the normalized maximum likelihood literature along with meta-learning!
@svlevine
Sergey Levine
4 years
Can we devise a more tractable RL problem if we give the agent examples of successful outcomes (states, not demos)? In MURAL, we show that uncertainty-aware classifiers trained with (meta) NML make RL much easier. At #ICML2021 . A (short) thread:
@abhishekunique7
Abhishek Gupta
1 year
@breadli428 @zipengfu @tonyzzhao @chelseabfinn Right, but in some ways every method besides behavior cloning gives you this more naturally, e.g. IRL/RL/LQR :) In a combinatorially large world, BC will require a combinatorially large amount of “corrective” coverage data from humans to correct for this deviation.
@abhishekunique7
Abhishek Gupta
1 year
@ZhangWeiHong9 This simple drop-in replacement lets you do better with various offline RL methods, simply by reweighting the objective. A fun project led by @ZhangWeiHong9 with @aviral_kumar2, @pulkitology, and many others (5/5). Paper:
@abhishekunique7
Abhishek Gupta
4 years
We are presenting this at #ICML2021 as a spotlight talk from 7:35-7:40 PST and during the poster session from 8-11pm PST. Come chat with us to learn more!
@svlevine
Sergey Levine
4 years
Can we devise a more tractable RL problem if we give the agent examples of successful outcomes (states, not demos)? In MURAL, we show that uncertainty-aware classifiers trained with (meta) NML make RL much easier. At #ICML2021 . A (short) thread: