Toru Profile
Toru

@ToruO_O

Followers
857
Following
184
Media
15
Statuses
86

Science I @UTokyo → Course 6 @MIT → PhD student @berkeley_ai 🌈 she/her/hers I like capybaras :D

Joined October 2020
Pinned Tweet
@ToruO_O
Toru
8 months
Achieving bimanual dexterity with RL + Sim2Real! TLDR - We train two robot hands to twist bottle lids using deep RL followed by sim-to-real. A single policy trained with simple simulated bottles can generalize to drastically different real-world objects.
5
59
218
@ToruO_O
Toru
6 months
Imitation learning works™ – but you need good data 🥹 How to get high-quality visuotactile demos from a bimanual robot with multifingered hands, and learn smooth policies? Check our new work “Learning Visuotactile Skills with Two Multifingered Hands”! 🙌
8
75
280
@ToruO_O
Toru
2 years
It’s year 2023 and PPO is still going strong 🙂 open-sourcing a simple & performant PPO that trains Cartpole in <1m (with IsaacGym) same power as IsaacGymEnvs' default but everything in 4 files / under 1k lines of code
3
21
180
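[Editor's note] The core of any minimal PPO like the one tweeted above is the clipped surrogate objective. The sketch below is a generic, dependency-free version of that loss for illustration only, not the released code; function name and calibration of `clip_eps` are assumptions.

```python
import math

def ppo_surrogate_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate PPO objective (negated, so lower is better).

    log_probs / old_log_probs: per-sample action log-probs under the
    new and old policies; advantages: per-sample advantage estimates.
    """
    total = 0.0
    for lp, old_lp, adv in zip(log_probs, old_log_probs, advantages):
        ratio = math.exp(lp - old_lp)                      # pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        total += min(ratio * adv, clipped * adv)           # pessimistic bound
    return -total / len(advantages)
```

The pessimistic `min` is what keeps a single large policy update from exploiting a stale advantage estimate.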
@ToruO_O
Toru
5 months
A common question we get for HATO () is: can it be more dexterous? Yes! The first iteration of our system actually achieves this -- by capturing finger poses with mocap gloves and remapping them to robot hands. [video taken in late 2023 (with @yuzhang )]
6
34
142
@ToruO_O
Toru
1 year
A fun side project from a while back : D Idea: teleop a dexterous hand with $5 flex sensors could be improved via better sensors, exoskeleton structures, etc... not following up but if anyone's interested, happy to share details & help! (with @brenthyi & Ting-Hao Wang)
2
2
79
@ToruO_O
Toru
1 year
Update -- to appear in NeurIPS 2023 : D TL;DR - a simple, general, flexible framework for novelty-based RL exploration "MIMEx: Masked Input Modeling for Exploration" arxiv: code: w/ @ajabri [a rare two-author & student-only work]
@ToruO_O
Toru
1 year
New work! 🧵=> In RL (and perhaps in life), exploration is about maximizing one's knowledge of the environment so as to act more optimally in the long run. It is the *only* way through which an agent can learn to improve upon itself from scratch, and thus of central importance.
2
2
46
3
4
66
@ToruO_O
Toru
1 year
Weekly research progress: didn't train any neural network but I proudly present this beautiful camera mount
[image]
4
0
40
@ToruO_O
Toru
11 months
Unfortunately neither my co-author nor I will be at NeurIPS 2023 but I still put in some effort to make this poster 🥹
[image]
3
0
33
@ToruO_O
Toru
6 months
To address the first challenge, we develop HATO ("dove" 🕊️ in Japanese), a low-cost Hands-Arms TeleOperation system using Meta Quest 2. Our system demonstrates precise motion control capabilities and human-like dexterity -- we even controlled our robot to play Hollow Knight!
1
6
29
@ToruO_O
Toru
5 months
@YuZhang The system can be used to control a flexible number of DoFs with joint retargeting & fingertip IK. The video below shows how multiple hands-arms setups (Allegro, @PSYONICinc ) can be teleoperated to solve a Rubik's cube, open a Nutella bottle, disassemble a chair, and tear paper. [4x speed]
1
2
11
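[Editor's note] Fingertip retargeting of the kind mentioned here is often done by scaling human fingertip positions about the wrist into the robot hand's workspace before solving per-finger IK. A toy version of that scaling step follows; the scale factor and frames are assumptions, not the HATO implementation.

```python
def retarget_fingertips(human_tips, wrist, scale=1.2):
    """Scale human fingertip positions about the wrist into robot-hand space.

    human_tips: list of (x, y, z) fingertip positions; wrist: (x, y, z).
    The scaled targets would then be handed to a per-finger IK solver.
    """
    return [tuple(w + scale * (t - w) for t, w in zip(tip, wrist))
            for tip in human_tips]
```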
@ToruO_O
Toru
5 months
@YuZhang @PSYONICinc @hellorokoko Personally, I still think a setup like HATO+gloves or ALOHA/GELLO+gloves will be great for teleoperating bimanual hands. Progress is happening -- look forward to more exciting development on this front in the next few months! ( @kenny__shaw ;)))
1
0
10
@ToruO_O
Toru
6 months
Thank you Nathan! We believe in a future where our robots can learn to play video games with visuotactile+proprioceptive data :) hopefully they will then have some work-life balance too
@robot_trainer
Nathan Ratliff
6 months
This is cool. Leverage the engineering ingenuity of prosthetic hands for tactile + dexterity with low-dof controls. Also, the teleoperated video game play (control the hand to operate a game controller) is pretty fun :)
0
0
5
0
0
8
@ToruO_O
Toru
1 year
found this gem as I went through phone album...
[image]
0
0
7
@ToruO_O
Toru
1 year
But exploration is hard, especially in challenging environments with sparse rewards! To encourage exploration, we derive an intrinsic reward based on the masked prediction loss on input trajectory sequences: the higher the loss, the more "novel" the trajectory, and the higher the reward.
1
0
6
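[Editor's note] The scheme described in this tweet — mask parts of a trajectory, try to reconstruct them, and pay a bonus proportional to reconstruction error — can be sketched generically. The `predict` callable below is a stand-in for MIMEx's learned masked-prediction model, which is not reproduced here.

```python
import random

def mimex_style_bonus(trajectory, predict, mask_ratio=0.5):
    """Intrinsic reward = error of reconstructing masked trajectory steps.

    trajectory: list of feature vectors (lists of floats).
    predict: callable (visible_steps, masked_index) -> guessed vector;
             stands in for a learned masked-prediction model.
    """
    n = len(trajectory)
    masked = random.sample(range(n), max(1, int(mask_ratio * n)))
    visible = {i: trajectory[i] for i in range(n) if i not in masked}
    loss = 0.0
    for i in masked:
        guess = predict(visible, i)
        loss += sum((g - t) ** 2 for g, t in zip(guess, trajectory[i]))
    return loss / len(masked)  # higher loss -> more "novel" -> higher reward
```

A trajectory the model already predicts well earns little bonus; a surprising one earns a lot.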
@ToruO_O
Toru
5 months
@YuZhang @PSYONICinc So why didn’t we go with it? Gloves are promising, but suffer from issues such as overheating, drift and recalibration ( @hellorokoko ). As mentioned in an earlier tweet (), HATO enables *much* faster data collection on many tasks that require hand dexterity.
1
0
6
@ToruO_O
Toru
5 months
@YuZhang @PSYONICinc @hellorokoko @kenny__shaw And shout-out to Franklin @YuZhang , an amazing undergraduate student who has contributed a lot to setting up these teleoperation systems despite having no prior experience at all! Franklin is applying to PhD programs this year, and I would not hesitate to recommend :)
0
0
6
@ToruO_O
Toru
6 months
Indeed, we find that hands, even when used with limited DoFs, can perform a large set of tasks & outperform grippers because of the additional stability! This also allows for smoother and more intuitive teleop (e.g. no retargeting issue), leading to much faster data collection :D
@wenlong_huang
Wenlong Huang
6 months
Very incredible to see how capable VR-controlled robot hands can be. While there is a lot of debate on grippers vs hands, why not think of hands as just grippers that offer more redundancy and stability? Congrats on the great work!
1
2
23
0
0
6
@ToruO_O
Toru
2 years
To say that this is depressing would be an understatement. The implication goes far beyond abortion rights. Clarence Thomas wrote in his concurring opinion that contraception, same-sex relationships, and same-sex marriage should also be reconsidered. A big step backwards.
@MichelleObama
Michelle Obama
2 years
My thoughts on the Supreme Court's decision to overturn Roe v. Wade.
[image]
18K
172K
706K
0
0
5
@ToruO_O
Toru
1 year
~ introducing ~ "MIMEx: Masked Input Modeling for Exploration" arxiv: code: a simple, general and flexible framework for intrinsic reward methods; superior results when compared to common exploration baselines like RND and ICM
[image]
1
0
5
@ToruO_O
Toru
6 months
To tackle the latter challenge, we adapt two @PSYONICinc prosthetic hands with touch sensors for research, enabling visuotactile data collection with HATO. We learn skills that can complete long-horizon, high-precision tasks and generalize to different environment settings.
1
0
4
@ToruO_O
Toru
8 months
@QinYuzhe @zhaohengyin @HaozhiQ Thank you Yuzhe, our sim2real buddy 🤜🏻🤛🏻 Really learned a lot from you too!!!
0
0
3
@ToruO_O
Toru
8 months
We use 3D keypoints extracted from RGBD images to represent an object. Specifically, we segment and track object parts from the RGB frames (left), take mask centers as object part centers (middle), and estimate 3D object keypoints using noisy depth information (right).
1
0
2
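[Editor's note] The last step described above — lifting a 2D mask center to a 3D keypoint using depth — is standard pinhole back-projection. A minimal sketch, with intrinsics as hypothetical arguments (the paper's exact pipeline and camera parameters are not shown here):

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth z to a 3D camera-frame point.

    (fx, fy): focal lengths in pixels; (cx, cy): principal point.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

With noisy depth, one would typically average depth over the mask rather than read a single pixel.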
@ToruO_O
Toru
8 months
Link to arxiv: Shout out to amazing collaborators that made this happen!!! @zhaohengyin @HaozhiQ @pabbeel @JitendraMalikCV
1
0
2
@ToruO_O
Toru
1 year
Intuitively, using a trajectory-level bonus encourages more novel action sequences rather than one-step transitions, giving rise to more complex behaviors and allowing more efficient exploration; it also allows for more flexible control over the difficulty of the prediction problem.
1
0
3
@ToruO_O
Toru
8 months
@_akhaliq Thanks for featuring our work AK!! Project page:
0
1
2
@ToruO_O
Toru
1 year
By setting up prediction problems on trajectory sequences, MIMEx obtains intrinsic rewards that consider transition dynamics across longer time horizons and extract richer exploration signals.
1
0
3
@ToruO_O
Toru
6 months
@PSYONICinc Project page: Huge thanks to amazing collaborators -- @YZ_Franklin , @qiyang_li , @HaozhiQ , @brenthyi , @svlevine , and @JitendraMalikCV !!! from @berkeley_ai 🤖
0
0
3
@ToruO_O
Toru
6 months
We tackle two key challenges: 1. Lack of affordable and accessible teleoperation systems for bimanual multifingered hands 2. Lack of good hand hardware equipped with touch sensing
1
0
3
@ToruO_O
Toru
7 months
@Bubble_eio @ajabri Thank you for your interest! The "noise" baseline is simply adding random action noise as in the original PPO implementation, so you can reproduce it by running without exploration (e.g. as in the no_expl config file).
1
0
2
@ToruO_O
Toru
11 months
@_oleh Got the title wrong bruh 🥲
0
0
1
@ToruO_O
Toru
6 months
@chichengcc Thank you Cheng! Indeed, look forward to seeing your next steps too 😄
0
0
2
@ToruO_O
Toru
4 months
@QinYuzhe @xiaolonw Congrats Dr. Qin!
0
0
2
@ToruO_O
Toru
8 months
@HarryXu12 Thank you Harry!! Really excited to contribute to the comeback of RL & Sim2Real :))))))
0
0
2
@ToruO_O
Toru
1 year
@andrea_bajcsy @CMU_Robotics All the best for your new journey Andrea! We'll miss you dearly
0
0
2
@ToruO_O
Toru
6 months
@PSYONICinc We have also released a comprehensive software suite that supports efficient data collection, multimodal data processing, scalable policy learning, and smooth policy deployment. See our Github repo for more details:
1
0
2
@ToruO_O
Toru
2 years
[motivation: the default learning code that came with IsaacGymEnvs was such a pain to work with] hope this could help someone else!
1
0
2
@ToruO_O
Toru
6 months
@xuxin_cheng Congrats & excited to see your next steps!
1
0
2
@ToruO_O
Toru
1 year
Check out our paper and code release to learn more about how it works, why it works, and its limitations! [w/ @ajabri ]
0
0
2
@ToruO_O
Toru
8 months
Our policy is robust against random perturbation, and adapts quickly to sustain continuous manipulation. In the video below, we randomly poke / push objects during policy deployment; our policy could reorient and translate the perturbed objects back to stable in-hand poses.
1
0
1
@ToruO_O
Toru
7 months
@Bubble_eio @ajabri Both should be possible -- you can set the isaacgym option "headless=True" to enable headless mode
1
0
0
@ToruO_O
Toru
3 years
@EugeneVinitsky Wow congrats!
1
0
1
@ToruO_O
Toru
5 months
@BoyuanChen0 Thank you Boyuan!!! Totally agree and glad to see that awesome people like @BoyuanChen0 are working on this problem too 🤜🏻🤛🏻
0
0
1
@ToruO_O
Toru
4 months
@GCerono @YuZhang Thanks! There are force sensors on each of the fingertips; we use the readings for training behaviors but not for teleoperation :)
0
0
1
@ToruO_O
Toru
6 months
@TairanHe99 Thank you Tairan! Maybe our robots can learn to play video games on their own in the near future 😆
0
0
1
@ToruO_O
Toru
1 year
@Cbasito1 @brenthyi Recipe: ZD10-100 flex sensors, breadboards, jumper wires, your favorite microcontroller; hand hardware and workstation; using workstation as an interchange station, take in sensor readings with Arduino and send them to the robot hand
0
0
1
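[Editor's note] The relay in the recipe above — Arduino streams flex-sensor readings, the workstation maps them to hand joint angles — needs a calibration step on the workstation side. A sketch of that mapping; the raw ADC bounds and joint range are hypothetical and would be recorded per sensor on a real rig:

```python
def flex_to_joint_angle(reading, raw_min=200, raw_max=900,
                        joint_min=0.0, joint_max=1.6):
    """Map a raw flex-sensor ADC reading to a joint angle in radians.

    Readings would arrive over serial from the Arduino (e.g. one
    comma-separated line per frame); clamping guards against noise
    outside the calibrated range.
    """
    reading = max(raw_min, min(reading, raw_max))
    frac = (reading - raw_min) / (raw_max - raw_min)
    return joint_min + frac * (joint_max - joint_min)
```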
@ToruO_O
Toru
6 months
@chenwang_j Thank you Chen!!!
0
0
1
@ToruO_O
Toru
3 years
@c20 I want to go see your cat!! ( ´ ▽ ` )
1
0
1
@ToruO_O
Toru
2 years
And there’re a bunch of other smaller details, e.g. orthogonal init for networks with different gains for hidden layers/policy output/value output, using eps 1e-5 instead of the default in the Adam optimizer, not clipping the value loss, using ELU over other activations, etc.
2
0
1
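[Editor's note] The "orthogonal init with different gains" trick above can be illustrated from scratch with Gram–Schmidt; in practice one would use a framework's built-in initializer. Everything here is a generic sketch (function name and `seed` parameter are assumptions), with gains chosen as the tweet suggests: e.g. sqrt(2) for hidden layers, 1.0 for the value head, a small gain for the policy head.

```python
import math
import random

def orthogonal_rows(n_rows, n_cols, gain=1.0, seed=0):
    """Gram-Schmidt orthogonal weight init, scaled by `gain`.

    Assumes n_rows <= n_cols for simplicity.  Rows are pairwise
    orthogonal with norm `gain`.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
        for u in rows:  # remove components along earlier (unit) rows
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        rows.append([a / norm for a in v])
    return [[gain * a for a in row] for row in rows]
```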
@ToruO_O
Toru
8 months
More videos of interesting emergent behaviors from our policy -- - skillfully adjusting the finger gaits and grasps of both hands to recover objects from unstable states back to stable poses - adapting finger movements to objects of different shapes and sizes
1
0
0
@ToruO_O
Toru
6 months
@wenlong_huang Thank you Wenlong!
0
0
1
@ToruO_O
Toru
5 months
@TairanHe99 @chatgpt4o Great work! Really impressed by the rapid progress 😄
1
0
1
@ToruO_O
Toru
6 months
@Vikashplus Thank you Vikash! Glad you find our work exciting!
0
0
1
@ToruO_O
Toru
2 years
@HedayatianSaeed I don't really have a rigorous explanation (it's DRL after all!), but Engstrom et al. (2020) and Andrychowicz et al. (2021) also show empirical results confirming that clipping the VF loss does not help and can even hurt performance :)
0
0
1
@ToruO_O
Toru
8 months
@simonlc_ Thank you Simon!!
0
0
1
@ToruO_O
Toru
7 months
@Bubble_eio @ajabri Yes, that is right 😃 Please let us know if you have any further question!
1
0
1
@ToruO_O
Toru
6 months
@PSYONICinc We empirically investigate the effects of dataset size, sensing modality, and visual input preprocessing on policy learning. Results and analysis can be found in our paper:
1
0
1
@ToruO_O
Toru
2 years
Overall it’s a surprising number of details, and I tried to bring the best of everything together, but I probably definitely also overfitted to IsaacGymEnvs tasks :) thankfully, it's very easy to tune and extend everything with this code!
1
0
1
@ToruO_O
Toru
1 year
@RemiCadene @brenthyi Recipe: ZD10-100 flex sensors, breadboards, jumper wires, your favorite microcontroller; hand hardware and workstation; using workstation as an interchange station, take in sensor readings with Arduino and send them to the robot hand!
1
0
1
@ToruO_O
Toru
5 months
@kenny__shaw Really looking forward!!
0
0
1