Fan-Yun Sun Profile
Fan-Yun Sun

@sunfanyun

Followers
824
Following
125
Statuses
107

CS PhD candidate @StanfordAILab @stanfordsvl @NVIDIAAI. (3D) vision/graphics, embodied AI

Stanford, CA
Joined October 2018
@sunfanyun
Fan-Yun Sun
5 months
Training RL/robot policies requires extensive experience in the target environment, which is often difficult to obtain. How can we “distill” embodied policies from foundational models? Introducing FactorSim! #NeurIPS2024

We show that by generating prompt-aligned simulations and training a policy on them without collecting any experience in the target environment, we can achieve zero-shot performance close to policies trained on millions of target environment experiences in many classic RL environments.

You can generate RL simulations on our project website: More in 🧵 1/7
2
44
212
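A minimal sketch of the setup described in the thread above, not the FactorSim implementation: train a policy entirely in a simulation produced from a prompt, then evaluate it zero-shot on the target task. It assumes gymnasium and stable-baselines3 are installed; generate_sim_from_prompt is a hypothetical stand-in that here simply returns CartPole.

import gymnasium as gym
from stable_baselines3 import PPO

def generate_sim_from_prompt(prompt: str) -> gym.Env:
    # Hypothetical stand-in: FactorSim generates simulation code from the prompt;
    # here we just return a classic-control task.
    return gym.make("CartPole-v1")

# Train entirely in the generated simulation -- no target-environment experience.
train_env = generate_sim_from_prompt("a pole balancing on a moving cart")
policy = PPO("MlpPolicy", train_env, verbose=0)
policy.learn(total_timesteps=50_000)

# Zero-shot evaluation in the target environment.
target_env = gym.make("CartPole-v1")
obs, _ = target_env.reset(seed=0)
episode_return, done = 0.0, False
while not done:
    action, _ = policy.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = target_env.step(action)
    episode_return += reward
    done = terminated or truncated
print(f"zero-shot return: {episode_return}")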
@sunfanyun
Fan-Yun Sun
5 days
@aryxnsharma Been 🥘 something
0
0
1
@sunfanyun
Fan-Yun Sun
9 days
@io_nathaniel @cbames @io_nathaniel I have some ideas for a cool collab but can't DM you -- shoot me a message
0
0
1
@sunfanyun
Fan-Yun Sun
19 days
RT @heyyalexwang: did you know you've been doing test-time learning this whole time? transformers, SSMs, RNNs, are all test-time regressor…
0
109
0
@sunfanyun
Fan-Yun Sun
21 days
0
0
1
@sunfanyun
Fan-Yun Sun
2 months
I think an intuitive way to explain o1/o3 is that the models are taught to be "self-consistent" through RL. Humans are not self-consistent, often jumping to contradictory conclusions (especially on the internet). LLMs end up being suboptimal after being trained on data with these incomplete or flawed reasoning paths.

Can this sort of test-time compute scale beyond the data we have today? My best guess is that it can, especially in domains where "being a verifier is easier than being a solver/generator" (e.g., code, ARC). If a model can verify its own hypotheses, it can be trained to maintain self-consistency, enabling it to generate more accurate answers.

This reminds me of those neuroscience/biomedical studies suggesting that our brains stop developing after age 30. If that's true, our intellectual growth after 30 doesn't come from an improvement over the "base model", but from learning how to think more rigorously and coherently.
0
0
9
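A toy illustration of the "being a verifier is easier than being a solver/generator" point above, under the assumption that candidate solutions come from some LLM (the samples below are hard-coded stand-ins): generate several candidates and keep the first one that a cheap, exact verifier accepts -- here, unit tests for a code task.

from typing import Callable, Iterable, Optional

def verify(candidate_src: str, tests: Callable[[dict], bool]) -> bool:
    # Verification is cheap and exact: run the candidate and check it against the tests.
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)
        return tests(namespace)
    except Exception:
        return False

def best_of_n(candidates: Iterable[str], tests: Callable[[dict], bool]) -> Optional[str]:
    # Keep the first candidate the verifier accepts (best-of-n selection).
    for src in candidates:
        if verify(src, tests):
            return src
    return None

# Hard-coded stand-ins for LLM samples to the task "add two numbers"; only one is correct.
samples = [
    "def add(a, b):\n    return a - b",   # wrong
    "def add(a, b):\n    return a + b",   # correct
]
tests = lambda ns: ns["add"](2, 3) == 5 and ns["add"](-1, 1) == 0
print(best_of_n(samples, tests))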
@sunfanyun
Fan-Yun Sun
2 months
RT @jiaman01: 🤖 Introducing Human-Object Interaction from Human-Level Instructions! First complete system that generates physically plausib…
0
111
0
@sunfanyun
Fan-Yun Sun
2 months
Check us out at NeurIPS tomorrow! Unfortunately I can’t be there but @locross and Jonathan will present at East Exhibit Hall A-C
@sunfanyun
Fan-Yun Sun
2 months
RT @nickhaber: At #NeurIPS! Anyone who’d like to chat, please reach out! I like curiosity and exploration, reasoning and self-improvement,…
0
3
0
@sunfanyun
Fan-Yun Sun
2 months
It’s widely believed that most pixels will be generated in a few years. I think it may be more accurate to say that most pixels will be generatively rendered because high-quality content almost always requires a "graphics" representation/control layer for precision. Here are some of my favorite examples by @MartinNebelong along this line of thought:
0
0
1
@sunfanyun
Fan-Yun Sun
2 months
RT @ItzSuds: I stopped tweeting 4 years ago because I had to build a company and Twitter wasn’t the real world. Turns out Twitter is the r…
0
22
0
@sunfanyun
Fan-Yun Sun
3 months
@pretendsmarts yeah true, Edify’s model doesn’t *natively* generate quad meshes either
0
1
5