![Kexun Zhang Profile](https://pbs.twimg.com/profile_images/1756072719156682753/W0p97jqJ_x96.jpg)
Kexun Zhang
@kexun_zhang
Followers: 1K
Following: 2K
Statuses: 528
PhD student at @LTIatCMU. Previously at @ucsbNLP, @ZJU_china. language lover.
Joined December 2021
RT @isdownapp: 🚨 Users are reporting problems with Docker Hub Registry. Is Docker Hub Registry down for you? RT if you are having issues. h…
0
14
0
@teortaxesTex do you think it's still gonna be a single model generation just controlled with a different decoding algorithm, or it's a meta generation strategy like best-of-n?
1
0
2
RT @xiangyue96: Demystifying Long CoT Reasoning in LLMs Reasoning models like R1 / O1 / O3 have gained massive atte…
0
189
0
@dhadfieldmenell how about formal verification as verifiable rewards? Is that considered a perfect reward?
1
0
2
@DongfuJiang off the top of my head: the fine-tuning part of Code Llama. I think there are many more.
1
1
5
RT @Yoshua_Bengio: A few reflections I had while watching this interview featuring @geoffreyhinton: It does not (or should not) really mat…
0
8
0
@anton_iades R1’s ability to write in all sorts of styles is way too overlooked. The story I heard is they hired lots of students from the top humanities programs in China to help with their data.
1
0
4
RT @lateinteraction: There are four types of research problems involving "natural language processing" that I find really fascinating. The…
0
8
0
RT @yuxiangw_cs: It's interesting to see people entering panic mode over #DeepSeekR1 and going bearish in computing. It's like going bearis…
0
5
0
RT @pthangeda_: @DanHendrycks @Miles_Brundage @dwarkesh_sp @polynoamial has been publicly saying that in his talks. I don’t know what can b…
0
1
0
this↓
@DimitrisPapail @rm_rafailov But if it's about the model, not the algorithm/process, that conversation/ablation would be very illuminating. Someone should just re-do basic STaR on a few different recent base/instruct models and show what happens.
0
0
3