![Hanze Dong Profile](https://pbs.twimg.com/profile_images/1687961386306109440/jzmqIaMe_x96.jpg)
Hanze Dong
@hendrydong
Followers
329
Following
291
Statuses
184
Research Scientist @SFResearch | Reproducibility & interpretability of LLMs | Sampling Algorithm | Core author of LMFlow, RLHFlow, Iterative SFT/DPO (RAFT/GSHF)
Joined June 2011
RT @SFResearch: ⚡ Meet BOLT: A novel approach to develop long chain-of-thought reasoning in LLMs without relying on knowledge distillation…
0
30
0
😂😂😂
NVIDIA and CMU present ASAP, which enables highly agile motions that were previously difficult to achieve! @Cristiano Siuuuuuuu!
0
0
1
RT @baohao_liao: Impressed by DeepSeek-R1 and o3? However, they are long-reasoning models, and generate >4k tokens quite often for hard que…
0
5
0
Many thanks to all the collaborators @baohao_liao @xyh6666 @LiJunnan0409 @c_monz @silviocinguetta @doyensahoo @CaimingXiong
1
0
4
RT @JacobSteinhardt: In 2021, our research group released the MATH dataset. In the paper, we attribute the data to math contests released b…
0
13
0
RT @rosstaylor90: “Wait that can’t be right” in the wild Thank you internet anons for your service to LLM reasoning. We found you through…
0
19
0
RT @Ber18791531: It's exciting to see Kimi-k1.5 uses a similar RL training objective to our (response-level) OREO! The difference is they…
0
26
0
Interesting
Tier 1:
- Sequoia
- Founders Fund
- A16Z
- YC
Tier 2:
- Benchmark
- General Catalyst
- Khosla
- Lightspeed VP
- Index
- Kleiner Perkins
- Caffeinated Capital
- SV Angel
- Tiger Global
- First Round
- Greenoaks
- Accel
- Bessemer
- Greylock
- USV
- Paradigm
- Homebrew
- Form Cap
- Menlo
- Craft
Tier 3 & beyond:
- Everyone else
Unranked:
- There are some newer unranked funds that have a lot to prove that I wouldn’t include in Tier 3.
0
0
1
RT @danielhanchen: Cool things from DeepSeek v3's paper: 1. Float8 uses E4M3 for forward & backward - no E5M2 2. Every 4th FP8 accumulate…
0
256
0
RT @jiayq: In 2019 I had a chat with the DeepSeek team, in the hope of selling them an AI cloud solution. I was trying to convince them a f…
0
123
0
RT @deepseek_ai: 🚀 Introducing DeepSeek-V3! Biggest leap forward yet: ⚡ 60 tokens/second (3x faster than V2!) 💪 Enhanced capabilities 🛠 AP…
0
2K
0
RT @gm8xx8: Offline Reinforcement Learning for LLM Multi-Step Reasoning OREO (Offline Reasoning Optimization) is introduced to enhance the…
0
6
0