Introducing Yell At Your Robot (YAY Robot!) 🗣️ - a fun collaboration b/w @Stanford and @UCBerkeley 🤖
We enable robots to improve on-the-fly from language corrections: robots rapidly adapt in real-time and continuously improve from human verbal feedback.
Transformers excel at identifying patterns, but they falter with limited data - a common setback in robotics.🤔
Introducing Cross-Episodic Curriculum (CEC), boosting learning efficiency & generalization of Transformer agents across RL & IL settings! 🧵
To appear at #NeurIPS2023
Can robots be farsighted? We introduce SkiMo (Skill + Model-based RL), which allows more accurate and efficient long-horizon planning through temporal abstraction. SkiMo learns temporally-extended, sparse-reward tasks with 5x fewer samples!
🧵👇
Learning long-horizon tasks is hard, but it can be easier when learning in a better action space.
Our new waypoint method boosts imitation learning performance and data efficiency, proving effective across 8 robotic tasks & 10 datasets. Check out Chelsea’s 🧵 for more details!
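The waypoint idea can be illustrated with a toy sketch (not the paper's actual selection algorithm — the function name and the interpolation-error criterion here are assumptions for illustration): keep only the poses needed so that straight-line interpolation reproduces the dense trajectory within a tolerance.

```python
import numpy as np

def extract_waypoints(trajectory, tol=0.05):
    """Greedy sketch of a waypoint action space: keep only the poses needed so
    that linear interpolation between kept waypoints stays within `tol` of the
    original dense trajectory (a stand-in for the paper's criterion)."""
    trajectory = np.asarray(trajectory, dtype=float)
    waypoints = [0]
    start = 0
    for end in range(2, len(trajectory)):
        # Interpolate from the current waypoint to the candidate endpoint.
        seg = np.linspace(trajectory[start], trajectory[end], end - start + 1)
        if np.max(np.abs(seg - trajectory[start:end + 1])) > tol:
            waypoints.append(end - 1)   # previous point was the last safe waypoint
            start = end - 1
    waypoints.append(len(trajectory) - 1)
    return waypoints

# A dense 1-D trajectory: straight segment out, then back.
traj = [[0.0], [0.1], [0.2], [0.3], [0.2], [0.1], [0.0]]
wps = extract_waypoints(traj)  # the turn at index 3 must be kept
```

The imitation policy then only has to predict the sparse waypoints, which shortens the effective horizon.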
As impressive as always, great work @tonyzzhao !!
Seeing Tony’s dexterous manipulation policies over the past year has changed my mind about what data can solve.
Now a question that keeps me up at night is what data cannot or should not solve.
Introducing 𝐀𝐋𝐎𝐇𝐀 𝐔𝐧𝐥𝐞𝐚𝐬𝐡𝐞𝐝 🌋 - Pushing the boundaries of dexterity with low-cost robots and AI.
@GoogleDeepMind
Finally got to share some videos after a few months. The robots are fully autonomous, filmed in one continuous shot. Enjoy!
Introducing HumanPlus - the shadowing part
Humanoids are born to use human data. We build a real-time shadowing system using a single RGB camera and a whole-body policy for cloning human motion. Examples:
- boxing🥊
- playing the piano🎹/ping pong
- tossing
- typing
Open-sourced!
How does it work? A high-level policy (akin to a VLM) generates language instructions. Then, a low-level policy (end-to-end language-conditioned BC) executes the skill. This enables robots to understand language instructions and act on them.
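A minimal sketch of this hierarchy, with placeholder policies standing in for the learned VLM-style high level and the language-conditioned BC low level (all names, instructions, and the re-planning interval are illustrative assumptions):

```python
import random

def high_level_policy(observation):
    """Hypothetical high-level policy: maps an observation to a language
    instruction for the next skill (stands in for a learned VLM-like model)."""
    return random.choice(["pick up the bag", "open the ziploc", "insert the item"])

def low_level_policy(observation, instruction):
    """Hypothetical language-conditioned BC policy: maps (observation,
    instruction) to a motor command."""
    return {"instruction": instruction, "action": [0.0] * 7}  # e.g. 7-DoF arm command

def run_episode(env_steps=30, skill_horizon=10):
    """High level re-plans every `skill_horizon` steps; low level acts every step."""
    trace = []
    instruction = None
    for t in range(env_steps):
        obs = {"t": t}  # placeholder observation
        if t % skill_horizon == 0:
            instruction = high_level_policy(obs)  # choose the next skill in language
        trace.append(low_level_policy(obs, instruction))
    return trace

trace = run_episode()
```

Language is the interface between the two levels, which is what makes verbal corrections a natural supervision signal.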
In this work we integrate language corrections to supervise language-conditioned skills in real-time, and use this feedback to iteratively improve the policy.
Long-horizon tasks are hard - the longer the task, the more likely that some stage will fail. Can humans help robots continuously improve through intuitive and natural feedback?
Check out our #CoRL2022 paper on learning skill dynamics for model-based RL! We present a sample-efficient RL algorithm based on temporal abstraction (skills).
w/ @YoungwoonLee and @lucy_x_shi (an undergrad who will graduate soon!)
During deployment, people can intervene through corrective language commands, overriding the high-level policy for the robot’s on-the-fly adaptation. These interventions are then used to post-train and improve the high-level policy.
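The intervention logic can be sketched as follows; the data format and helper names are hypothetical, but the idea matches the text: a verbal correction overrides the proposed instruction, and the relabeled pairs become post-training data for the high-level policy.

```python
def select_instruction(high_level_instruction, human_correction):
    """A verbal correction, when present, overrides the high-level policy."""
    return human_correction if human_correction is not None else high_level_instruction

def collect_post_training_data(episode):
    """Keep (observation, instruction) pairs; corrected steps relabel the
    high-level target so post-training pushes the policy toward the fix."""
    dataset = []
    for obs, proposed, correction in episode:
        dataset.append((obs, select_instruction(proposed, correction)))
    return dataset

# Hypothetical episode: (observation, proposed instruction, optional correction).
episode = [
    ({"t": 0}, "grasp the bag", None),
    ({"t": 1}, "grasp the bag", "pinch more of the bag"),  # human yells a fix
    ({"t": 2}, "insert the chips", None),
]
data = collect_post_training_data(episode)
```

Iterating this collect-then-post-train loop is what drives the continuous improvement described above.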
For more details:
📄Paper:
🌐Website + Code:
Endless thanks to my incredible team! Co-lead: @YunfanJiang, w/ @__jakegrigsby__ @DrJimFan @yukez
This journey of exploration and development has been immensely rewarding because of you!
Joint work w/ @YoungwoonLee and @JosephLim_AI
For more details and videos, check out the paper and website. We'll also make the code available soon.
Paper:
Project website:
Happy to answer any questions! ✨
We evaluate our method on four long-horizon, sparse-reward tasks that cover challenges in exploration, skill composition, generalization, and extremely task-agnostic datasets. Compared to prior methods, SkiMo achieves better performance and requires far fewer samples!
We find that robots continuously learn from interactions - language corrections improve the autonomous policy's performance by 20% through iterative post-training.
Then does it predict accurately over a long horizon? We compare predictions over 500 timesteps using a flat model and the skill dynamics model. The flat model’s prediction quickly deviates from the ground truth, while the skill dynamics model’s prediction shows little error.
“Foundation models” have catalyzed progress in large-scale research and applications. My hope is to see the emergence of "foundation hardware" in the near future. ALOHA exhibits immense potential in this regard. Check it out if you’re interested in fine manipulation!
For more results on hierarchical vs. flat BC, GPT-4V as high-level policy, impact of data quality, etc., check out our paper & website:
We also open-source the code for YAY Robot, and some automated tools for collecting language-annotated robotic data.
To investigate exploration & exploitation behaviors, we visualize trajectories in the replay buffer (light blue for early trajectories and dark blue for recent trajectories). SkiMo shows wide coverage of the maze early in the training, and fast convergence to the solution.
In pretraining, SkiMo leverages offline task-agnostic data to extract skill dynamics and a skill repertoire. Unlike prior works that train the model and the skill policy separately, we propose to _jointly_ train them to extract a skill space that is conducive to planning.
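A toy numpy sketch of the joint objective, with random linear maps standing in for SkiMo's networks (all shapes, names, and the equal loss weighting are assumptions): the same skill latent must both decode into actions and support skill-level state prediction, so minimizing one summed loss couples the two.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim, skill_dim, horizon = 4, 2, 3, 10

# Hypothetical parameters (in SkiMo these are neural networks).
enc = rng.normal(size=(state_dim + horizon * action_dim, skill_dim))   # skill encoder
dec = rng.normal(size=(state_dim + skill_dim, action_dim))             # skill policy (decoder)
dyn = rng.normal(size=(state_dim + skill_dim, state_dim))              # skill dynamics

def joint_loss(states, actions):
    """One trajectory segment -> combined loss. Jointly minimizing both terms
    shapes a skill space that is both decodable and predictable."""
    z = np.concatenate([states[0], actions.ravel()]) @ enc        # encode segment to skill z
    pred_actions = np.stack(
        [np.concatenate([s, z]) @ dec for s in states[:-1]]
    )
    bc_loss = np.mean((pred_actions - actions) ** 2)              # reconstruct actions from z
    pred_next = np.concatenate([states[0], z]) @ dyn              # one skill-level jump
    dyn_loss = np.mean((pred_next - states[-1]) ** 2)             # match state after H steps
    return float(bc_loss + dyn_loss)                              # trained jointly, not separately

states = rng.normal(size=(horizon + 1, state_dim))
actions = rng.normal(size=(horizon, action_dim))
loss = joint_loss(states, actions)
```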
Results speak! 🚀 CEC outperforms offline RL techniques (e.g., DT) and BC baselines trained on expert data, even exceeding RL oracles by up to 50% *zero-shot* - all with the same parameter count and data size!
Phase 2️⃣: Causally distilling policy refinement into Transformer agent model weights via *cross-episodic attention*, allowing the policy to trace & internalize improved behaviors from curricular data.
Introducing LocoProp, a new framework that reconceives a neural network as a modular composition of layers—each of which is trained with its own weight regularizer, target output and loss function—yielding both high performance and efficiency. Read more →
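A rough sketch of the layer-local idea, under the assumption that each layer's target output comes from a small gradient step on its activations (the actual LocoProp target construction and per-layer loss functions differ in detail — this is illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=(8, 4))                # batch of inputs
W1 = rng.normal(size=(4, 5)) * 0.1         # layer 1 weights (linear for simplicity)
W2 = rng.normal(size=(5, 1)) * 0.1         # layer 2 weights
y = rng.normal(size=(8, 1))                # regression targets

def local_layer_update(W0, inputs, target_out, lr=0.05, reg=0.1, inner_steps=5):
    """LocoProp-style local solve: minimize a squared loss to the layer's target
    output plus a proximity regularizer keeping W near its current value W0."""
    W = W0.copy()
    for _ in range(inner_steps):
        pred = inputs @ W
        grad = inputs.T @ (pred - target_out) / len(inputs) + reg * (W - W0)
        W = W - lr * grad
    return W

# Forward pass, then form per-layer targets via a small step on activations.
h = x @ W1
out = h @ W2
g_out = out - y                                  # gradient of squared loss wrt output
target2 = out - 0.5 * g_out                      # target for layer 2's output
g_h = g_out @ W2.T                               # backpropagated activation gradient
target1 = h - 0.5 * g_h                          # target for layer 1's output

W2_new = local_layer_update(W2, h, target2)      # layers update independently,
W1_new = local_layer_update(W1, x, target1)      # so they can run in parallel
```

Because each layer solves its own small problem, the updates decouple, which is part of where the claimed efficiency comes from.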
In downstream RL, we learn a high-level task policy in the skill space (skill-based RL) and leverage the skill dynamics model to generate imaginary rollouts for policy optimization and planning (model-based RL).
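In sketch form (numpy, with random linear maps standing in for the learned skill dynamics and a reward head — all names and shapes are assumptions): each imagined step jumps a whole skill, so a few model steps cover many environment steps without touching the real robot.

```python
import numpy as np

rng = np.random.default_rng(1)
state_dim, skill_dim = 4, 3
dyn = rng.normal(size=(state_dim + skill_dim, state_dim)) * 0.1   # skill dynamics (stand-in)
rew = rng.normal(size=(state_dim + skill_dim,))                   # reward head (stand-in)

def imagine_rollout(state, task_policy, num_skills=5):
    """Roll the skill dynamics forward in imagination: with a skill horizon of
    10, 5 model steps cover ~50 environment steps."""
    total_reward = 0.0
    for _ in range(num_skills):
        z = task_policy(state)                  # high-level task policy picks a skill
        sz = np.concatenate([state, z])
        total_reward += float(sz @ rew)         # predicted skill-level reward
        state = sz @ dyn                        # predicted state after the whole skill
    return state, total_reward

policy = lambda s: np.tanh(s[:skill_dim])       # stand-in task policy in skill space
final_state, ret = imagine_rollout(rng.normal(size=state_dim), policy)
```

The imagined returns are what the task policy is optimized against, which is the model-based half of "skill-based + model-based RL".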
🤖👥 In IL settings, human demos vary in quality, but still showcase improvement patterns & generally effective manipulation skills across different operators 🎥:
We leverage Transformers to extract & *extrapolate* these patterns for faster, further improvement in embodied tasks
How to maximize learning from scarce data?
Key insight: looking at data _across_ episodes reveals useful improvement patterns. E.g., an RL agent acquires progressively better navigation skills 🎥:
🦾 Robust Policies: In novel test scenarios (e.g., unseen maze mechanisms, OOD difficulties, varying environment dynamics), CEC improves policy performance by up to 1.6x over RL oracles!
Method: Cross-Episodic Curriculum (CEC)
Phase 1️⃣: Formulating curricular sequences, capturing:
a) policy improvement in single environments,
b) learning progress in increasingly harder environments, or
c) demonstrators' rising proficiency
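The curricular-sequence construction above can be sketched generically: order episodes by whichever curriculum signal applies (policy return, environment difficulty, or operator proficiency) and concatenate them into one long context for cross-episodic attention. The field names here are illustrative assumptions.

```python
def make_curricular_sequence(episodes):
    """Order episodes by the curriculum signal (here: episode return, standing
    in for improvement / difficulty / proficiency) and flatten them into one
    long token sequence so the Transformer can attend across episodes."""
    ordered = sorted(episodes, key=lambda ep: ep["return"])   # worst -> best
    sequence = []
    for ep in ordered:
        sequence.extend(ep["transitions"])                    # concatenate across episodes
    return ordered, sequence

# Hypothetical episodes with increasing quality.
episodes = [
    {"return": 0.9, "transitions": [("s5", "a5"), ("s6", "a6")]},
    {"return": 0.1, "transitions": [("s1", "a1")]},
    {"return": 0.5, "transitions": [("s3", "a3"), ("s4", "a4")]},
]
ordered, seq = make_curricular_sequence(episodes)
```

Training on such sequences is what lets the model extract the improvement pattern rather than just the average behavior.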
Humans efficiently plan with high-level skills to solve complex tasks, like washing and cutting for cooking. But MBRL today typically plans with single-step models, akin to a human planning out every muscle movement. This does not scale to long-horizon tasks!
SkiMo learns a model that predicts the effects of whole _skills_. This allows it to skip the low-level details of skill execution when reasoning over long time horizons --> faster planning & less error accumulation!
@natolambert
Altogether, the agent then plans directly over time in the skill space (choose skill -> predict outcome -> repeat) & it can predict more accurately over the long term (temporally-extended reasoning + fewer planning steps required). 4/4
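This choose-predict-repeat loop reads as a shooting method; here is a minimal random-shooting sketch in skill space (the real planner, objective, and skill prior are more sophisticated — all names here are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
state_dim, skill_dim = 4, 3
dyn = rng.normal(size=(state_dim + skill_dim, state_dim)) * 0.1   # skill dynamics (stand-in)
goal = np.ones(state_dim)                                         # toy goal state

def predict(state, z):
    """Skill dynamics: state + chosen skill -> predicted state after the skill."""
    return np.concatenate([state, z]) @ dyn

def plan(state, horizon=4, num_candidates=64):
    """Random-shooting planner over skills: sample candidate skill sequences,
    roll each through the skill dynamics, keep the first skill of the best."""
    best_z, best_dist = None, np.inf
    for _ in range(num_candidates):
        zs = rng.normal(size=(horizon, skill_dim))   # candidate skill sequence
        s = state
        for z in zs:                                 # choose skill -> predict -> repeat
            s = predict(s, z)
        dist = np.linalg.norm(s - goal)              # score by predicted goal distance
        if dist < best_dist:
            best_z, best_dist = zs[0], dist
    return best_z

z0 = plan(np.zeros(state_dim))                       # first skill to execute
```

Planning over 4 skills here stands in for ~40 low-level actions, which is exactly the temporal abstraction the thread describes.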
to clarify “dancing to the beat” - even though I'd be thrilled if our robot could learn to dance 🕺, the jerkiness at the end was simply because it’s unsure of its next move after task completion. The sync with background music is a happy coincidence 😂
@ChongZitaZhang
@Stanford
@UCBerkeley
ah that’s interesting! we don’t have this kind of high-level semantics in the data so don’t really know. I’d be curious! we’ve only tried motion/skill generalization before - e.g. “wiggle” seems to generalize well to different objects
@natolambert
The policy is abstracted through skills (≈options) as continuous variables that encode action sequences (currently with a fixed length of 10 for stability; variable-length skills will be an interesting future direction). 2/4
@DavidChen930109
Hey, I don’t think there’s documentation for Franka Kitchen in particular, but maybe you want to check out the D4RL website & repo for more info
Initially, I thought learning long-horizon bimanual fine manipulation tasks on real robots would be a nightmare. Surprisingly, it was an absolute joy. All credit goes to Tony's low-cost open-source hardware system, ALOHA 👏
Introducing ALOHA 🏖: 𝐀 𝐋ow-cost 𝐎pen-source 𝐇𝐀rdware System for Bimanual Teleoperation
After 8 months of iterating at @stanford and 2 months of working with beta users, we are finally ready to release it!
Here is what ALOHA is capable of: