Zhecheng Yuan
@fancy_yzc
Followers: 572 · Following: 702 · Media: 21 · Statuses: 148
PhD @Tsinghua University, IIIS. Interested in reinforcement learning, representation learning, robotics.
Joined July 2021
👐How can we leverage multi-source human motion data, transform it into robot-feasible behaviors, and deploy it across diverse scenarios? 👤🤖Introducing 𝐇𝐄𝐑𝐌𝐄𝐒: a versatile human-to-robot embodied learning framework tailored for mobile bimanual dexterous manipulation.
8 · 43 · 175
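The core technical question HERMES raises is how raw human motion (mocap, video, teleoperation) gets mapped onto robot-feasible joint trajectories. As a rough, hypothetical illustration of keypoint-based retargeting (not HERMES's actual pipeline), here is a minimal sketch for a planar two-link arm; the link lengths, joint limits, and smoothness weight are made-up values.

```python
# Hypothetical sketch of keypoint-based human-to-robot motion retargeting.
# Not HERMES's actual pipeline: a generic illustration with a planar 2-link arm.
import numpy as np
from scipy.optimize import minimize

LINK_LENGTHS = np.array([0.3, 0.25])             # assumed link lengths (m)
JOINT_LIMITS = [(-np.pi, np.pi), (0.0, np.pi)]   # assumed joint limits (rad)

def forward_kinematics(q):
    """End-effector position of a planar 2-link arm for joint angles q."""
    x = LINK_LENGTHS[0] * np.cos(q[0]) + LINK_LENGTHS[1] * np.cos(q[0] + q[1])
    y = LINK_LENGTHS[0] * np.sin(q[0]) + LINK_LENGTHS[1] * np.sin(q[0] + q[1])
    return np.array([x, y])

def retarget_frame(human_keypoint, q_prev):
    """Joint angles whose end-effector tracks the human wrist keypoint,
    regularized toward the previous frame for smoothness."""
    def cost(q):
        track = np.sum((forward_kinematics(q) - human_keypoint) ** 2)
        smooth = 0.01 * np.sum((q - q_prev) ** 2)
        return track + smooth
    return minimize(cost, q_prev, bounds=JOINT_LIMITS).x

# Retarget a short human wrist trajectory (e.g. from mocap or video) frame by frame.
trajectory = [np.array([0.4, 0.2]), np.array([0.38, 0.25]), np.array([0.35, 0.3])]
q, robot_motion = np.zeros(2), []
for kp in trajectory:
    q = retarget_frame(kp, q)
    robot_motion.append(q)
print(np.round(robot_motion, 3))
```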
Sim-to-real learning for humanoid robots is a full-stack problem. Today, Amazon FAR is releasing a full-stack solution: Holosoma. To accelerate research, we are open-sourcing a complete codebase covering multiple simulation backends, training, retargeting, and real-world
19 · 131 · 571
This is exactly the kind of advancement that shows how robotics is moving from isolated capabilities to truly embodied intelligence. Collecting manipulation data alone isn’t enough; robots need to navigate, plan, and act in complex spaces, and combining navigation with dexterous
Just collecting manipulation data isn’t enough for robots - they need to be able to move around in the world, which has a whole different set of challenges from pure manipulation. And bringing navigation and manipulation together in a single framework is even more challenging.
0 · 5 · 6
This was a really fun discussion, and the results are very cool to see. Sim-to-real learning that allows you to have robots perform manipulation tasks in all kinds of environments!
Thrilled to chat with @chris_j_paxton and @micoolcho on @robopapers about our recent work HERMES. 🤖✨ Vision-based sim2real is getting a lot more attention lately — hope our paper can offer some fresh insights to the community.
3 · 3 · 34
Using multiple sources (mocap, human video, teleop in sim) for dexterous manipulation PLUS navigation. @fancy_yzc and @still_wtm packed a lot in this one paper! Thanks for sharing, guys!
0 · 8 · 18
Very cool work using human video to teach robots mobile manipulation behaviors!
3 · 9 · 65
Thrilled to chat with @chris_j_paxton and @micoolcho on @robopapers about our recent work HERMES. 🤖✨ Vision-based sim2real is getting a lot more attention lately — hope our paper can offer some fresh insights to the community.
Full episode dropping soon! Geeking out with @fancy_yzc @still_wtm on HERMES: Human-to-Robot Embodied Learning From Multi-Source Motion Data for Mobile Dexterous Manipulation https://t.co/0uKgxiJYSb Co-hosted by @micoolcho @chris_j_paxton
0 · 3 · 19
Three demos showing SharpaWave’s fine manipulation under high-fidelity teleoperation. Task No.1: putting on a trash bag. With teleoperation and tactile feedback, it finds the rim, opens it, and wraps it neatly — even on a slippery, deformable object. #Sharpa #SharpaWave
16 · 35 · 189
Today, we present a step-change in robotic AI @sundayrobotics. Introducing ACT-1: A frontier robot foundation model trained on zero robot data. - Ultra long-horizon tasks - Zero-shot generalization - Advanced dexterity 🧵->
431 · 661 · 5K
Introducing Gallant: Voxel Grid-based Humanoid Locomotion and Local-navigation across 3D Constrained Terrains 🤖 Project page: https://t.co/eC1ftH5ozx Arxiv: https://t.co/5K9sXDNQWv Gallant is, to our knowledge, the first system to run a single policy that handles full-space
1 · 34 · 187
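A generic way to build the voxel observation such a policy might consume is to rasterize nearby geometry (for example, a depth-derived point cloud) into a binary occupancy grid. The sketch below only illustrates that idea with assumed grid extents and resolution; it is not Gallant's actual parameterization.

```python
# Illustrative sketch: voxelize nearby 3D points into an occupancy grid observation.
# Grid extents, resolution, and the fake point cloud are assumptions for demonstration.
import numpy as np

def points_to_voxel_grid(points, lo=(-1, -1, 0), hi=(1, 1, 2), shape=(16, 16, 16)):
    """Binary occupancy grid over a box around the robot, built from 3D points."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    grid = np.zeros(shape, dtype=np.float32)
    # Map each point to a voxel index; drop points outside the box.
    idx = ((points - lo) / (hi - lo) * np.asarray(shape)).astype(int)
    valid = np.all((idx >= 0) & (idx < np.asarray(shape)), axis=1)
    grid[tuple(idx[valid].T)] = 1.0
    return grid

# Fake points on a low ceiling and a raised step, as might come from depth sensing.
points = np.concatenate([
    np.column_stack([np.random.uniform(-1, 1, 200), np.random.uniform(-1, 1, 200), np.full(200, 1.6)]),
    np.column_stack([np.random.uniform(0, 1, 200), np.random.uniform(-1, 1, 200), np.full(200, 0.2)]),
])
voxels = points_to_voxel_grid(points)
print(voxels.shape, voxels.sum())   # (16, 16, 16) occupancy fed to the policy
```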
After a year of team work, we're thrilled to introduce Depth Anything 3 (DA3)! 🚀 Aiming for human-like spatial perception, DA3 extends monocular depth estimation to any-view scenarios, including single images, multi-view images, and video. In pursuit of minimal modeling, DA3
80 · 505 · 4K
Unified multimodal models can generate text and images, but can they truly reason across modalities? 🎨 Introducing ROVER, the first benchmark that evaluates reciprocal cross-modal reasoning in unified models, the next frontier of omnimodal intelligence. 🌐 Project:
5 · 31 · 176
Ever want to enjoy all the privileged information in sim while seamlessly transferring to the real world? How can we correct policy mistakes after deployment? 👉Introducing GSWorld, a real2sim2real photo-realistic simulator with interaction physics and fully open-sourced code.
6 · 65 · 279
Excited to share SoftMimic -- a new approach for learning compliant humanoid policies that interact gently with the world.
14 · 114 · 628
Introducing @HarryXu12 HERMES, a unified human-to-robot learning framework built on the Galaxea mobile base and A1 dual-arm platform. With high-fidelity simulation and dexterous hands, HERMES enables robust sim2real transfer for complex mobile manipulation tasks. #Robotics
0 · 2 · 4
Imitation learning provides a solid prior, and online learning further refines the policy for better performance. In the lab, I’ve been watching Kun’s policy grow stronger over time.😆
Introducing RL-100: Performant Robotic Manipulation with Real-World Reinforcement Learning. https://t.co/tZ0gz6OTdb 7 real robot tasks, 900/900 successes. Up to 250 consecutive trials in one task, running 2 hours nonstop without failure. High success rate against physical
1 · 1 · 6
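The two-phase recipe described above (a behavior-cloning prior, then online refinement) can be sketched in a few lines. The toy 1-D reach task, network sizes, and REINFORCE-style update below are illustrative assumptions, not RL-100's actual algorithm.

```python
# Toy sketch of "imitation prior + online RL refinement" on a 1-D reach task.
# Everything here (environment, networks, update rule) is an illustrative assumption.
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
log_std = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam(list(policy.parameters()) + [log_std], lr=1e-3)

def rollout(start=0.0, goal=1.0, steps=20, expert=False):
    s = torch.tensor([start])
    states, actions, rewards = [], [], []
    for _ in range(steps):
        if expert:
            a = 0.5 * (goal - s)                              # scripted "demonstration" controller
        else:
            a = torch.normal(policy(s), log_std.exp()).detach()
        states.append(s)
        actions.append(a)
        s = s + a
        rewards.append(-torch.abs(goal - s))                  # dense distance-to-goal reward
    return torch.stack(states), torch.stack(actions), torch.stack(rewards)

# Stage 1: behavior cloning on scripted demonstrations (the imitation prior).
for _ in range(200):
    s, a, _ = rollout(expert=True)
    bc_loss = ((policy(s) - a) ** 2).mean()
    opt.zero_grad(); bc_loss.backward(); opt.step()

# Stage 2: online refinement with a simple REINFORCE-style policy-gradient update.
for _ in range(200):
    s, a, r = rollout()
    returns = torch.flip(torch.cumsum(torch.flip(r, [0]), 0), [0])   # reward-to-go
    dist = torch.distributions.Normal(policy(s), log_std.exp())
    pg_loss = -(dist.log_prob(a) * (returns - returns.mean())).mean()
    opt.zero_grad(); pg_loss.backward(); opt.step()
```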
three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. 👇(1/n)
56 · 335 · 2K
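The broad idea of replacing VAE latents with a frozen representation encoder's feature space as the denoising target can be sketched as follows; the encoder, noise schedule, and denoiser below are placeholders, not the RAE paper's architecture.

```python
# Toy sketch of denoising in a frozen encoder's representation space (the broad idea
# of retiring VAE latents); encoder, schedule, and denoiser are placeholders only.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 256), nn.GELU(), nn.Linear(256, 64))
for p in encoder.parameters():
    p.requires_grad_(False)          # pretrained and frozen in the real setting

denoiser = nn.Sequential(nn.Linear(64 + 1, 256), nn.GELU(), nn.Linear(256, 64))
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-4)

def train_step(images):
    with torch.no_grad():
        z = encoder(images)                          # clean representations
    t = torch.rand(z.shape[0], 1)                    # continuous noise level in [0, 1]
    noise = torch.randn_like(z)
    z_noisy = (1 - t) * z + t * noise                # simple linear interpolation schedule
    pred = denoiser(torch.cat([z_noisy, t], dim=1))  # predict the injected noise
    loss = ((pred - noise) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# One step on random "images" as a smoke test.
print(train_step(torch.randn(16, 784)))
```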
We open-sourced the full pipeline! Data conversion from MimicKit, training recipe, pretrained checkpoint, and deployment instructions. Train your own spin kick with mjlab: https://t.co/KvNQn0Edzr
github.com · Train a Unitree G1 humanoid to perform a double spin kick using mjlab - mujocolab/g1_spinkick_example
7 · 77 · 387
🚀 Introducing SARM: Stage-Aware Reward Modeling for Long-Horizon Robot Manipulation Robots struggle with tasks like folding a crumpled T-shirt—long, contact-rich, and hard to label. We propose a scalable reward modeling framework to fix that. 1/n
4 · 28 · 165
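A minimal sketch of what a stage-aware reward could look like, assuming the model predicts a discrete stage plus within-stage progress and combines them into one dense scalar; the architecture and reward formula are illustrative, not necessarily SARM's.

```python
# Hypothetical stage-aware reward model: classify which stage of a long-horizon task
# an observation belongs to, estimate progress within it, and combine both into a
# dense scalar reward. Sizes and the reward formula are illustrative assumptions.
import torch
import torch.nn as nn

NUM_STAGES = 4          # e.g. flatten -> fold sleeves -> fold in half -> stack
OBS_DIM = 128           # assumed observation embedding size

class StageAwareReward(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(OBS_DIM, 256), nn.ReLU())
        self.stage_head = nn.Linear(256, NUM_STAGES)   # which stage are we in?
        self.progress_head = nn.Linear(256, 1)         # how far through that stage?

    def forward(self, obs):
        h = self.trunk(obs)
        stage_probs = self.stage_head(h).softmax(dim=-1)
        progress = self.progress_head(h).sigmoid().squeeze(-1)
        # Expected stage index plus within-stage progress gives a monotone measure of
        # task completion, normalized to [0, 1] and used as a dense reward signal.
        expected_stage = (stage_probs * torch.arange(NUM_STAGES)).sum(dim=-1)
        return (expected_stage + progress) / NUM_STAGES

reward_model = StageAwareReward()
print(reward_model(torch.randn(8, OBS_DIM)).shape)   # one reward per observation
```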
Our grand finale: A complex, long-horizon dynamic sequence, all driven by a proprioceptive-only policy (no vision/LIDAR)! In this task, the robot carries a chair to a platform, uses it as a step to climb up, then leaps off and performs a parkour-style roll to absorb the landing.
5 · 28 · 156