Katie Kang

@katie_kang_

Followers: 2K · Following: 2K · Statuses: 90

PhD student @berkeley_ai

Joined August 2020
@katie_kang_
Katie Kang
3 months
LLMs excel at fitting finetuning data, but are they learning to reason or just parroting🦜? We found a way to probe a model's learning process to reveal *how* each example is learned. This lets us predict model generalization using only training data, amongst other insights: 🧵
19
118
755
@katie_kang_
Katie Kang
18 days
The DeepSeek R1 recipe seems so simple. I've been wondering what's changed from previous RL+reasoning efforts, and found this thread insightful
@its_dibya
Dibya Ghosh
18 days
With R1, a lot of people have been asking "how come we didn't discover this 2 years ago?" Well... 2 years ago, I spent 6 months working exactly on this (PG / PPO for math+gsm8k), but my results were nowhere as good. Here's my take on what blocked me and what's changed: 🧵
0
4
25
@katie_kang_
Katie Kang
22 days
RT @aviral_kumar2: 🚨 We are organizing an ICLR workshop on self-improving foundation models w/o human supervision at ICLR 2025 in Singapore…
0
19
0
@katie_kang_
Katie Kang
3 months
RT @sea_snell: Can we predict emergent capabilities in GPT-N+1🌌 using only GPT-N model checkpoints, which have random performance on the ta…
0
71
0
@katie_kang_
Katie Kang
3 months
RT @j_foerst: Learnability for the win! This is one of the lessons that transfers from Curriculum methods in RL directly to LLM training.
0
3
0
@katie_kang_
Katie Kang
3 months
@xordrew @setlur_amrith @its_dibya @JacobSteinhardt @svlevine @aviral_kumar2 Potentially because the model has less incentive to change if it has already achieved low loss, though I think there's some prior work that shows it can happen sometimes if you train for a really long time, e.g. in
0
0
3
@katie_kang_
Katie Kang
3 months
@chaochunh Ohh yes you're right, thanks for catching this!!
0
0
2
@katie_kang_
Katie Kang
3 months
@BlancheMinerva Omg thank you!! Will definitely reach out if we end up pursuing this direction 😊
0
0
2
@katie_kang_
Katie Kang
3 months
RT @aviral_kumar2: Check out @katie_kang_'s work on understanding memorization vs learning in reasoning! By probing LLMs in training, we…
0
8
0
@katie_kang_
Katie Kang
3 months
@chaochunh It's an indicator variable in MaskedAcc, so 1 if perp > p (not memorized) and 0 if perp < p (memorized); a rough sketch follows below
1
0
1
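A minimal sketch of the indicator described in the reply above, assuming "masking" simply zeroes out memorized examples; the function and variable names are illustrative, not taken from the paper's code:

```python
# Rough sketch of the MaskedAcc indicator described above (illustrative names,
# not the paper's code): an example counts as "not memorized" as long as its
# training perplexity stays above a threshold p.

def not_memorized(perplexity: float, p: float) -> int:
    """Return 1 if perplexity > p (not yet memorized), else 0 (memorized)."""
    return 1 if perplexity > p else 0
```

Under this reading, a memorized example contributes 0 to MaskedAcc rather than being dropped from the average; the exact aggregation in the paper may differ.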
@katie_kang_
Katie Kang
3 months
@VarunGodbole @setlur_amrith @its_dibya @JacobSteinhardt @svlevine @aviral_kumar2 With a suboptimal training setup (e.g. bad hyperparameters), models can sometimes directly memorize examples that they would otherwise learn more generalizably under a better setup. So there's a tradeoff between direct memorization and generalizable learning before memorization
0
0
0
@katie_kang_
Katie Kang
3 months
@VarunGodbole @setlur_amrith @its_dibya @JacobSteinhardt @svlevine @aviral_kumar2 Once models learn to generate diverse + correct CoTs for a training example, they tend to retain the ability to generate robust predictions, regardless of whether they memorize the example later in training
1
0
2
@katie_kang_
Katie Kang
3 months
@SadhikaMalladi Thanks for sharing! I haven't seen this before
0
0
2
@katie_kang_
Katie Kang
3 months
@ahatamiz1 Thanks! We haven't, but it would definitely be interesting to better understand the relationship between architecture and generalization
1
0
7
@katie_kang_
Katie Kang
3 months
@chaochunh 1) Yes! In calculating pre-mem acc, we mask out accuracy when perplexity is too low (see the sketch after this tweet). 2) It makes the scale of the numbers slightly easier to work with
0
0
2
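A hedged sketch of the masking described in this reply, assuming pre-memorization accuracy zeroes out an example's correctness once its perplexity drops below the threshold p; the names are illustrative and the paper's exact aggregation may differ:

```python
# Illustrative sketch (assumed, not the paper's code): average per-example
# accuracy, masking out examples whose perplexity has fallen below the
# threshold p, i.e. examples that appear to have been memorized.

def pre_memorization_accuracy(accuracies, perplexities, p):
    """Mean of accuracy * 1[perplexity > p] over training examples."""
    assert len(accuracies) == len(perplexities) and accuracies
    masked = [acc if perp > p else 0.0
              for acc, perp in zip(accuracies, perplexities)]
    return sum(masked) / len(masked)

# Example: the second example has perplexity 0.9 < p, so its accuracy is
# masked out and the result is 1/3.
print(pre_memorization_accuracy([1, 1, 0], [2.3, 0.9, 1.5], p=1.0))
```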
@katie_kang_
Katie Kang
3 months
@BlancheMinerva Thanks! It would definitely be interesting to study the learning dynamics of pretraining as well
1
0
2
@katie_kang_
Katie Kang
3 months
@oier_mees Thanks Oier!
0
0
1
@katie_kang_
Katie Kang
3 months
RT @avisingh599: Exciting couple of days for reasoning research: Procedural Knowledge in Pretraining Drives Reasoning in Large Language Mo…
0
23
0
@katie_kang_
Katie Kang
3 months
RT @svlevine: An intriguing new result from @katie_kang_: after training long enough, LLMs will reproduce training examples exactly (not su…
0
60
0
@katie_kang_
Katie Kang
3 months
@brianckwu Thank you :)
0
0
0