Lucy Shi

@lucy_x_shi

Followers 1,491 · Following 551 · Media 22 · Statuses 193

CS PhD student @Stanford, interning @physical_int. Working on robot learning and multimodal learning. Interested in robots, rockets, and humans.

San Francisco, CA
Joined October 2021
Pinned Tweet
@lucy_x_shi
Lucy Shi
7 months
Introducing Yell At Your Robot (YAY Robot!) 🗣️- a fun collaboration b/w @Stanford and @UCBerkeley 🤖 We enable robots to improve on-the-fly from language corrections: robots rapidly adapt in real-time and continuously improve from human verbal feedback. YAY Robot enables
19
80
470
@lucy_x_shi
Lucy Shi
1 year
Transformers excel at identifying patterns, but they falter with limited data - a common setback in robotics.🤔 Introducing Cross-Episodic Curriculum (CEC), boosting learning efficiency & generalization of Transformer agents across RL & IL settings! 🧵 To appear at #NeurIPS2023
4
30
168
@lucy_x_shi
Lucy Shi
2 years
Can robots be farsighted? We introduce SkiMo (Skill + Model-based RL), which allows more accurate and efficient long-horizon planning through temporal abstraction. SkiMo learns temporally-extended, sparse-reward tasks with 5x fewer samples! 🧵👇
3
26
127
@lucy_x_shi
Lucy Shi
1 year
Learning long-horizon tasks is hard, but it can be easier when learning in a better action space. Our new waypoint method boosts imitation learning performance and data efficiency, proving effective across 8 robotic tasks & 10 datasets. Check out Chelsea’s 🧵 for more details!
@chelseabfinn
Chelsea Finn
1 year
Our robot can now make you coffee 🤖☕ A short 🧵 on how it works ⬇️
31
131
906
5
12
64
@lucy_x_shi
Lucy Shi
6 months
As impressive as always, great work @tonyzzhao !! Seeing Tony’s dexterous manipulation policies has changed my mind about what data can solve in the past year. Now a question that keeps me up at night is what data cannot or should not solve.
@tonyzzhao
Tony Z. Zhao
6 months
Introducing 𝐀𝐋𝐎𝐇𝐀 𝐔𝐧𝐥𝐞𝐚𝐬𝐡𝐞𝐝 🌋 - Pushing the boundaries of dexterity with low-cost robots and AI. @GoogleDeepMind Finally got to share some videos after a few months. Robots are fully autonomous, filmed in one continuous shot. Enjoy!
55
342
2K
1
2
39
@lucy_x_shi
Lucy Shi
4 months
Scalable data collection requires intuitive interfaces. Great work! @zipengfu
@zipengfu
Zipeng Fu
4 months
Introducing HumanPlus - Shadowing part. Humanoids are born for using human data. We build a real-time shadowing system using a single RGB camera and a whole-body policy for cloning human motion. Examples: - boxing🥊 - playing the piano🎹/ping pong - tossing - typing. Open-sourced!
17
166
770
1
1
19
@lucy_x_shi
Lucy Shi
7 months
A huge thanks to my amazing collaborators @real_ZheyuanHu @tonyzzhao @archit_sharma97 @KarlPertsch @jianlanluo and advisors @chelseabfinn @svlevine ! Bonus video (1x): after post-training, the robot is able to self-correct :)
1
2
18
@lucy_x_shi
Lucy Shi
7 months
Plus, some unexpected failure cases - the real world is complicated 😂
[Tweet includes four images]
0
1
13
@lucy_x_shi
Lucy Shi
7 months
How does it work? A high-level policy (akin to a VLM) generates language instructions. Then, a low-level policy (end-to-end language-conditioned BC) executes the skill. This enables robots to understand language instructions and act on them.
Tweet media one
1
0
14
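The hierarchy described in this tweet can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the real YAY Robot policies are learned neural networks, and the function names, observation strings, and instruction lookup below are hypothetical stand-ins, not the project's API.

```python
# Toy sketch of the two-level setup: a high-level policy proposes a language
# instruction, and a low-level language-conditioned policy executes it.

def high_level_policy(observation):
    """Stand-in for the learned high-level model (akin to a VLM):
    maps an observation summary to a language instruction."""
    skill_for_stage = {
        "bag_open": "pick up the chip bag",
        "holding_bag": "pour the chips into the bowl",
    }
    return skill_for_stage.get(observation, "wait")

def low_level_policy(instruction, observation):
    """Stand-in for the language-conditioned BC policy:
    maps (instruction, observation) to a low-level action."""
    return {"instruction": instruction, "action": f"execute[{instruction}]"}

def step(observation):
    instruction = high_level_policy(observation)
    return low_level_policy(instruction, observation)

result = step("bag_open")
```

The key design point the tweet describes is the interface: the two policies communicate only through natural language, which is what lets a human later inject corrections at the same interface.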
@lucy_x_shi
Lucy Shi
7 months
In this work we integrate language corrections to supervise language-conditioned skills in real-time, and use this feedback to iteratively improve the policy.
1
2
13
@lucy_x_shi
Lucy Shi
7 months
Long-horizon tasks are hard - the longer a task is, the more likely that some stage will fail. Can humans help robots continuously improve through intuitive and natural feedback?
1
0
11
@lucy_x_shi
Lucy Shi
2 years
#CoRL2022 If you are interested in long-horizon learning or sample efficiency, come by our poster today @ 5:10pm in OGG room 040!
@JosephLim_AI
Joseph Lim
2 years
Check out our #CoRL2022 paper on learning skill dynamics for model-based RL! We present a sample-efficient RL algorithm based on temporal abstraction (skills). @YoungwoonLee @lucy_x_shi (an undergrad who will graduate soon!)
0
2
10
0
0
9
@lucy_x_shi
Lucy Shi
7 months
During deployment, people can intervene through corrective language commands, overriding the high-level policy for robot’s on-the-fly adaptation. These interventions are then used to post-train and improve the high-level policy.
Tweet media one
2
1
8
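The intervene-then-post-train loop this tweet describes can be sketched in a few lines. This is purely illustrative: the real system fine-tunes a neural high-level policy on logged language corrections, whereas the dictionary "policy", the function names, and the toy update rule below are hypothetical.

```python
# Sketch: human corrections override the high-level policy at deployment and
# are logged as labels, then used to "post-train" (here, directly patch) it.

def run_episode(policy, stages, corrections):
    """Roll out over task stages; a human correction for a stage overrides
    the policy's proposal and is logged as a training example."""
    logged = []
    for obs in stages:
        proposed = policy.get(obs, "wait")
        instruction = corrections.get(obs, proposed)
        if obs in corrections:
            logged.append((obs, instruction))  # (observation, corrected label)
    return logged

def post_train(policy, logged):
    """Toy fine-tuning step: move the policy toward the corrected labels."""
    updated = dict(policy)
    for obs, instruction in logged:
        updated[obs] = instruction
    return updated

policy = {"bag_open": "grasp bag"}
data = run_episode(policy, ["bag_open", "bag_up"], {"bag_up": "pour slower"})
policy = post_train(policy, data)
```

After post-training, the behavior the human had to correct becomes the policy's own proposal, so the same correction is no longer needed on the next attempt.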
@lucy_x_shi
Lucy Shi
1 year
For more details:
📄 Paper:
🌐 Website + Code:
Endless thanks to my incredible team! Co-lead: @YunfanJiang, w/ @__jakegrigsby__ @DrJimFan @yukez This journey of exploration and development has been immensely rewarding because of you!
1
0
6
@lucy_x_shi
Lucy Shi
2 years
We evaluate our method on four long-horizon, sparse-reward tasks that cover challenges in exploration, skill composition, generalization, and extremely task-agnostic datasets. Compared to prior methods, SkiMo achieves better performance and requires far fewer samples!
1
0
5
@lucy_x_shi
Lucy Shi
7 months
We find that robots continuously learn from interactions - language corrections improve the autonomous policy's performance by 20% through iterative post-training.
[Tweet includes three images]
1
0
6
@lucy_x_shi
Lucy Shi
2 years
Does it then predict accurately over a long horizon? We compare predictions over 500 timesteps using a flat model and the skill dynamics model. The flat model's prediction deviates from the ground truth quickly, while the skill dynamics model's prediction has little error.
Tweet media one
1
0
4
@lucy_x_shi
Lucy Shi
1 year
“Foundation models” have catalyzed progress in large-scale research and applications. My hope is to see the emergence of "foundation hardware" in the near future. ALOHA exhibits immense potential in this regard. Check it out if you’re interested in fine manipulation!
1
0
4
@lucy_x_shi
Lucy Shi
7 months
For more results on hierarchical vs. flat BC, GPT-4V as high-level policy, impact of data quality, etc., check out our paper & website: We also open-source the code for YAY Robot, and some automated tools for collecting language-annotated robotic data.
1
0
4
@lucy_x_shi
Lucy Shi
11 months
@kylehkhsu gee, Kyle your poster looks too cool 👀
0
0
3
@lucy_x_shi
Lucy Shi
1 year
CEC stands on the shoulders of previous groundbreaking work 🙌: - Algorithm Distillation ( @MishaLaskin ) - Adaptive Agent (Adaptive Agent Team) Check out these cool concurrent works: - Agentic Transformer ( @haoliuhl ) - Decision-Pretrained Transformer (Jonathan Lee & Annie Xie)
0
0
4
@lucy_x_shi
Lucy Shi
10 months
@archit_sharma97 excuse me, sir, could I get 50 copies of your autograph? they might come in handy someday.
1
0
4
@lucy_x_shi
Lucy Shi
2 years
To investigate exploration & exploitation behaviors, we visualize trajectories in the replay buffer (light blue for early trajectories and dark blue for recent trajectories). SkiMo shows wide coverage of the maze early in the training, and fast convergence to the solution.
Tweet media one
1
0
4
@lucy_x_shi
Lucy Shi
11 months
@philduan happy thanksgiving! was getting dinner with my housemates (mostly doctors). thankfully everyone cared more about the pie than open ai 🤦‍♀️
0
0
4
@lucy_x_shi
Lucy Shi
7 months
@DrJimFan That’s exciting! Congrats Jim!!
0
0
2
@lucy_x_shi
Lucy Shi
2 years
In pretraining, SkiMo leverages offline task-agnostic data to extract skill dynamics and a skill repertoire. Unlike prior works that keep model and skill-policy training separate, we propose to _jointly_ train them to extract a skill space that is conducive to planning.
Tweet media one
1
0
3
@lucy_x_shi
Lucy Shi
1 year
Results speak! 🚀 CEC outperforms offline RL techniques, e.g., DT, and BC baselines trained on expert data, even exceeding RL oracles by up to 50% in *zero-shot* - all under the same parameters and data size!
Tweet media one
1
0
3
@lucy_x_shi
Lucy Shi
1 year
Phase 2️⃣: Causally distilling policy refinement into Transformer agent model weights via *cross-episodic attention*, allowing the policy to trace & internalize improved behaviors from curricular data.
1
0
3
@lucy_x_shi
Lucy Shi
2 years
Deep learning ⊂ hierarchical representation learning? 🤔
@GoogleAI
Google AI
2 years
Introducing LocoProp, a new framework that reconceives a neural network as a modular composition of layers—each of which is trained with its own weight regularizer, target output and loss function—yielding both high performance and efficiency. Read more →
15
202
936
0
0
3
@lucy_x_shi
Lucy Shi
2 years
In downstream RL, we learn a high-level task policy in the skill space (skill-based RL) and leverage the skill dynamics model to generate imaginary rollouts for policy optimization and planning (model-based RL).
Tweet media one
1
0
3
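Planning with imaginary rollouts in a learned skill-dynamics model, as this tweet describes, can be sketched with simple random-shooting MPC. Note the hedging: the scalar "state", the linear toy dynamics, the reward, and all names below are assumptions for illustration, not SkiMo's actual components (which use learned latent models and more sophisticated optimizers).

```python
import random

def skill_dynamics(state, skill):
    """Stand-in for the learned model: predicts the state after a whole skill."""
    return state + skill

def reward(state, goal=10.0):
    return -abs(goal - state)

def plan(state, horizon=3, n_candidates=100, seed=0):
    """Pick the skill sequence whose *imagined* rollout earns the most reward.
    All rollouts happen inside the model - no real environment steps."""
    rng = random.Random(seed)
    best_seq, best_ret = None, float("-inf")
    for _ in range(n_candidates):
        seq = [rng.uniform(-5.0, 5.0) for _ in range(horizon)]
        s, ret = state, 0.0
        for skill in seq:
            s = skill_dynamics(s, skill)
            ret += reward(s)
        if ret > best_ret:
            best_seq, best_ret = seq, ret
    return best_seq

seq = plan(0.0)
first_skill = seq[0]  # MPC style: execute the first skill, then replan
```

Because each planning step covers an entire skill rather than one raw action, a short planning horizon already reaches far into the future.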
@lucy_x_shi
Lucy Shi
1 year
🤖👥 In IL settings, human demos vary in quality, but still showcase improvement patterns & generally effective manipulation skills across different operators 🎥: We leverage Transformers to extract & *extrapolate* these patterns for faster, further improvement in embodied tasks
1
0
3
@lucy_x_shi
Lucy Shi
1 year
How to maximize learning from scarce data? Key insight: looking at data _across_ episodes reveals useful improvement patterns. E.g., an RL agent acquires progressively better navigation skills 🎥:
1
0
3
@lucy_x_shi
Lucy Shi
1 year
@chenwang_j super cool work, @chenwang_j !
1
0
3
@lucy_x_shi
Lucy Shi
9 months
0
0
2
@lucy_x_shi
Lucy Shi
1 year
@SerenaLBooth Congrats Serena! They are lucky to have you!
0
0
2
@lucy_x_shi
Lucy Shi
1 year
🦾 Robust Policies: In novel test scenarios (e.g., unseen maze mechanisms, OOD difficulties, varying environment dynamics), CEC improves policy performance by up to 1.6x over RL oracles!
Tweet media one
1
0
2
@lucy_x_shi
Lucy Shi
1 year
@tonyzzhao @archit_sharma97 thanks so much Tony!
0
0
2
@lucy_x_shi
Lucy Shi
1 year
I’m grateful to @chelseabfinn @archit_sharma97 for the amazing mentorship. Also, a special shout-out to @tonyzzhao , without whom this project wouldn't have been possible:
1
0
2
@lucy_x_shi
Lucy Shi
7 months
@KarlPertsch haha looking forward to that future
0
0
2
@lucy_x_shi
Lucy Shi
1 year
Method: Cross-Episodic Curriculum (CEC) Phase 1️⃣: Formulating curricular sequences, capturing: a) policy improvement in single environments, b) learning progress in increasingly harder environments, or c) demonstrators' rising proficiency
1
0
2
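Phase 1 above amounts to ordering experience by an improvement signal and flattening it into one cross-episodic context. A minimal sketch, with the caveat that the episode format, the use of return as the curriculum key, and all names here are illustrative assumptions rather than the paper's data pipeline:

```python
# Toy Phase 1: arrange episodes so later positions in the training context
# correspond to better behavior, then flatten into one Transformer context.

episodes = [
    {"return": 3.0, "steps": ["a1", "a2"]},  # early, weak attempt
    {"return": 9.0, "steps": ["a5", "a6"]},  # late, strong attempt
    {"return": 6.0, "steps": ["a3", "a4"]},
]

# Curricular sequence: sort by the improvement signal (here, episode return).
curriculum = sorted(episodes, key=lambda e: e["return"])

# Cross-episodic context: one long token stream spanning multiple episodes,
# so attention can trace how behavior improves across them (Phase 2).
context = [step for ep in curriculum for step in ep["steps"]]
```

The same recipe covers the tweet's cases (a)-(c) by swapping the sort key: episode return, environment difficulty, or demonstrator proficiency.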
@lucy_x_shi
Lucy Shi
2 years
Humans efficiently plan with high-level skills to solve complex tasks, like washing and cutting for cooking. But MBRL today typically plans with single-step models, akin to a human planning out every muscle movement. This does not scale to long-horizon tasks!
Tweet media one
2
0
2
@lucy_x_shi
Lucy Shi
2 years
SkiMo learns a model that predicts the effects of whole _skills_. This allows it to skip the low-level details of skill execution when reasoning over long time horizons → faster planning & less error accumulation!
Tweet media one
1
0
2
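Why the skill-level model accumulates less error over a long horizon can be shown with a toy calculation: a model covering H steps per call needs H-fold fewer calls, so a fixed per-call bias compounds H-fold less. The linear dynamics and the per-call error term `eps` below are assumptions chosen purely to make the effect visible, not anything learned by SkiMo.

```python
# Toy comparison: flat single-step model vs. skill-level model over 500 steps.

def rollout(predict, n_steps, state=0.0):
    for _ in range(n_steps):
        state = predict(state)
    return state

H = 10  # skill length (fixed-length skills, as in SkiMo)
eps = 0.01  # assumed per-call prediction bias of a learned model

true_step = lambda s: s + 1.0
flat_model = lambda s: s + 1.0 + eps       # predicts one timestep per call
skill_model = lambda s: s + H * 1.0 + eps  # predicts a whole skill per call

horizon = 500
truth = rollout(true_step, horizon)
flat_pred = rollout(flat_model, horizon)         # 500 model calls
skill_pred = rollout(skill_model, horizon // H)  # only 50 model calls

flat_err = abs(flat_pred - truth)    # bias compounds 500 times
skill_err = abs(skill_pred - truth)  # bias compounds only 50 times
```

Under these assumptions the skill-level rollout ends up an order of magnitude closer to the ground truth, mirroring the 500-timestep comparison in the tweet above.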
@lucy_x_shi
Lucy Shi
2 years
@natolambert Altogether, the agent would then plan directly over time in the skill space (choose skill -> predict outcome -> repeat) & it can predict more accurately over the long term (temporally-extended reasoning + fewer planning steps). 4/4
1
0
2
@lucy_x_shi
Lucy Shi
8 months
@DrJimFan @yukez Congrats Jim and Yuke!!
0
0
1
@lucy_x_shi
Lucy Shi
6 months
@lmathur_ @lpmorency @pliang279 congrats on the release!!
0
0
2
@lucy_x_shi
Lucy Shi
1 year
to clarify “dancing to the beat” - even though I'd be thrilled if our robot could learn to dance 🕺, the jerkiness at the end was simply because it’s unsure of its next move after task completion. The sync with background music is a happy coincidence 😂
0
0
2
@lucy_x_shi
Lucy Shi
7 months
0
0
1
@lucy_x_shi
Lucy Shi
7 months
@marceltornev Thanks Marcel! If I didn’t mention it last week - I like your recent work a lot!!
0
0
1
@lucy_x_shi
Lucy Shi
7 months
@ChongZitaZhang @Stanford @UCBerkeley ah that’s interesting! we don’t have this kind of high-level semantics in the data so don’t really know. I’d be curious! we’ve only tried motion/skill generalization before - e.g. “wiggle” seems to generalize well to different objects
0
0
1
@lucy_x_shi
Lucy Shi
7 months
@tonyzzhao lol wouldn’t have been possible without you :)
0
0
1
@lucy_x_shi
Lucy Shi
4 months
0
0
1
@lucy_x_shi
Lucy Shi
2 years
Maybe RL today also has too much coupling to remain stable… Jokes aside, this is really an insightful & eloquent piece!
@stevenstrogatz
Steven Strogatz
2 years
In 2013, I was asked to contribute to a collection of essays on the theme, "What should we be worried about?" This was my answer at the time.
Tweet media one
29
149
794
2
0
1
@lucy_x_shi
Lucy Shi
4 months
0
0
1
@lucy_x_shi
Lucy Shi
7 months
@real_ZheyuanHu kinda wish the robot kids grow up quicker
0
0
1
@lucy_x_shi
Lucy Shi
2 years
@natolambert The policy is abstracted through skills (≈options) as continuous variables that encode action sequences (currently w/ fixed-length 10 for stability, variable-length skills will be an interesting future direction). 2/4
0
0
1
@lucy_x_shi
Lucy Shi
2 years
0
0
1
@lucy_x_shi
Lucy Shi
1 year
🤖 Continuous Robotic Control: On two simulated robotic manipulation tasks, CEC matches or even surpasses established baselines!
Tweet media one
1
0
1
@lucy_x_shi
Lucy Shi
2 years
@DavidChen930109 Hey, I don’t think there’s documentation for Franka Kitchen in particular, but maybe you want to check out the D4RL website & repo for more info
2
0
1
@lucy_x_shi
Lucy Shi
1 year
@archit_sharma97 Thanks so much Archit! 🙂
0
0
1
@lucy_x_shi
Lucy Shi
1 year
@YoungwoonLee Congrats!!
0
0
1
@lucy_x_shi
Lucy Shi
1 year
Initially, I thought learning long-horizon bimanual fine manipulation tasks on real robots would be a nightmare. Surprisingly, it was an absolute joy. All credit goes to Tony's low-cost open-source hardware system, ALOHA 👏
@tonyzzhao
Tony Z. Zhao
2 years
Introducing ALOHA 🏖: 𝐀 𝐋ow-cost 𝐎pen-source 𝐇𝐀rdware System for Bimanual Teleoperation After 8 months iterating @stanford and 2 months working with beta users, we are finally ready to release it! Here is what ALOHA is capable of:
94
712
3K
1
0
1
@lucy_x_shi
Lucy Shi
1 year
@SerenaLBooth Congrats!!
0
0
1
@lucy_x_shi
Lucy Shi
7 months
@siddkaramcheti Thanks Sidd!! Your previous works are super inspiring
0
0
1
@lucy_x_shi
Lucy Shi
11 months
@CongyueD Happy birthday!!
1
0
1