Qinqing Zheng Profile
Qinqing Zheng

@qqyuzu

Followers
472
Following
45
Statuses
47

Reinforcement Learning, Generative Modeling @ FAIR (@AIatMeta). PhD @UChicago.

New York, NY
Joined February 2022
@qqyuzu
Qinqing Zheng
4 months
Introducing Dualformer: a new model that integrates fast and slow thinking! By learning with randomized reasoning traces, Dualformer offers both quick response and enhanced performance with more succinct CoTs. w/ Andy Mike @tesatory @tydsh
5
28
174
@qqyuzu
Qinqing Zheng
7 days
Hanlin's twitter account: @zhuhl98
0
0
0
@qqyuzu
Qinqing Zheng
7 days
RT @_akhaliq: Token Assorted Mixing Latent and Text Tokens for Improved Language Model Reasoning
0
16
0
@qqyuzu
Qinqing Zheng
2 months
RT @HenaffMikael: We share our code - excited to see what people build with this! Many thanks to @qqyuzu @adityagrover_ @yayitsamyzhang @br
0
1
0
@qqyuzu
Qinqing Zheng
2 months
ONI offers concurrent policy training and reward synthesis, a good fit for long-horizon, sparse-reward problems! I also believe it has great potential to be extended to multimodal inputs and complex planning/reasoning environments!
@brandondamos
Brandon Amos
2 months
🤔 How to extract knowledge from LLMs to train better RL agents? 📚 Our new paper (with @qqyuzu @HenaffMikael @yayitsamyzhang @adityagrover_ ) studies LLM-driven rewards for NetHack! Paper: Code:
0
2
15
@qqyuzu
Qinqing Zheng
4 months
RT @sirbayes: Excited to share our new paper on "Diffusion Model Predictive Control" (D-MPC). Key idea: leverage diffusion models to learn…
0
72
0
@qqyuzu
Qinqing Zheng
4 months
RT @tydsh: 🚀🎯Dualformer, our simple yet novel training paradigm that leads to 1️⃣ Emergent behaviors of automatic switching between system…
0
41
0
@qqyuzu
Qinqing Zheng
8 months
RT @tungnd_13: Introducing LICO, a new black-box optimizer for arbitrary domains (including non-textual) based on large language models. A…
0
15
0
@qqyuzu
Qinqing Zheng
1 year
These works, including ours, are motivated by various perspectives and arrive at different algorithms. The surge of work applying diffusion models (DM) to world modeling (most of the works above appeared in the last 3 months) shows the great potential of this research direction. (8/n)
1
0
2
@qqyuzu
Qinqing Zheng
1 year
Our method is on par with SOTA methods, eliminating the performance gap between model-based and model-free methods for offline RL. [4/n]
0
0
2
@qqyuzu
Qinqing Zheng
1 year
Takeaway: Sequence modeling tools like diffusion models and Transformers are better alternatives to one-step dynamics models, and diffusion models are even better than Transformers because they eliminate the autoregressive structure. [3/n]
0
0
4
@qqyuzu
Qinqing Zheng
1 year
In particular, we propose Diffusion Model Based Value Expansion, which uses DWM-generated sequences to facilitate value estimation. [3/n]
1
0
3