![Negin Raoof Profile](https://pbs.twimg.com/profile_images/1569456604591935488/LCph42fg_x96.jpg)
Negin Raoof
@NeginRaoof_
Followers: 631 · Following: 704 · Statuses: 182
Ph.D. student @UCBerkeley, advised by @AlexGDimakis. Ex-SWE @microsoft; collaborator @PyTorch.
Joined September 2022
Announcing OpenThinker-32B: the best open-data reasoning model distilled from DeepSeek-R1. Our results show that large, carefully curated datasets with verified R1 annotations produce SoTA reasoning models. Our 32B model outperforms all other 32B models, including DeepSeek-R1-Distill-Qwen-32B (a closed-data model), on MATH500 and GPQA Diamond, and shows similar performance on other benchmarks. (1/n)
13 replies · 107 retweets · 647 likes
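A minimal sketch of running the released checkpoint with Hugging Face transformers. The model ID open-thoughts/OpenThinker-32B, the chat-template call, and the generation settings are assumptions for illustration, not details from the thread.

```python
# Minimal sketch: generating with OpenThinker-32B via Hugging Face transformers.
# Assumptions: the checkpoint is published as "open-thoughts/OpenThinker-32B"
# and ships a chat template (it is post-trained from Qwen2.5-32B-Instruct).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker-32B"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

messages = [{"role": "user", "content": "Prove that sqrt(2) is irrational."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning traces are long, so leave generous room for the chain of thought.
outputs = model.generate(inputs, max_new_tokens=4096)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```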
RT @DimitrisPapail: o3 can't multiply beyond a few digits... But I think multiplication, addition, maze solving and easy-to-hard generaliz…
0 replies · 60 retweets · 0 likes
RT @shiyi_c98: Thanks for sharing our work! 🚀 We've set up a multi-LoRA chat demo: http://104.171.203.80:9090/ 🎉 We trained two LoRA adapt…
0 replies · 9 retweets · 0 likes
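For context on the multi-LoRA setup mentioned above, here is a hedged sketch of serving two LoRA adapters on one base model with the peft library. The base model ID, adapter paths, and adapter names are illustrative placeholders, not details from the linked demo.

```python
# Sketch: two LoRA adapters sharing one base model with peft.
# All names below are placeholders; the demo's actual adapters are not public here.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-7B-Instruct"  # illustrative base model
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, device_map="auto", torch_dtype="auto"
)

# Load the first adapter, then attach a second one to the same base weights.
model = PeftModel.from_pretrained(base, "path/to/adapter_math", adapter_name="math")
model.load_adapter("path/to/adapter_code", adapter_name="code")

# Switch adapters per request without reloading the base weights.
model.set_adapter("math")
# ... generate with the "math" adapter ...
model.set_adapter("code")
# ... generate with the "code" adapter ...
```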
@KevinRossi Thanks! This is a valid concern, and we are considering training from Llama 70B base models in the future.
0 replies · 0 retweets · 1 like
@raw_works Thanks a lot for pointing out aider polyglot; we'll definitely look into it! SWEBench and BFCL are our next priorities, and we're currently debugging our SWEBench integration.
0 replies · 0 retweets · 0 likes
@hokazuya Thanks! Our eval code and logs are available. What generation parameters are you using?
1 reply · 0 retweets · 3 likes
@TRRGRSA @AlexGDimakis Check out budget forcing (setting a min/max number of tokens). In our experience, you need the perfect balance of ADHD and OCD for your math problems.
1 reply · 0 retweets · 2 likes
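A minimal sketch of the min/max-token control that budget forcing refers to, using the standard transformers generate arguments. It assumes the model and tokenizer from the OpenThinker sketch above; the token budgets are illustrative, not values from any reported run.

```python
# Sketch: budget forcing as min/max token control with transformers generate.
# Assumes `model` and `tokenizer` are loaded as in the OpenThinker sketch above.
inputs = tokenizer("Solve: what is 17 * 24?", return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    min_new_tokens=1024,  # illustrative floor: EOS is suppressed until this point
    max_new_tokens=8192,  # illustrative ceiling: hard cap on the reasoning budget
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The floor keeps the model from ending its reasoning too early; the ceiling keeps a runaway chain of thought from exhausting the context.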
RT @madiator: We accidentally de-censored the model! Qwen-instruct which we use is censored and aligned. DeepSeek-R1 distilled models are…
0 replies · 22 retweets · 0 likes
@HKydlicek Thanks for letting us know! Yeah, we used "0.5.1"; we'll definitely update and compare with 0.5.2. Also, thanks for the verifier, it's amazing!
1 reply · 0 retweets · 5 likes
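Assuming the verifier referenced here is the math-verify package (the tweet itself doesn't name it), checking a model answer against a gold answer looks roughly like this:

```python
# Sketch: verifying a model's final answer with the math-verify package
# (an assumption about which verifier is meant; pip install math-verify).
from math_verify import parse, verify

gold = parse("$\\frac{1}{2}$")
answer = parse("$0.5$")

# verify() checks mathematical equivalence, not string equality,
# so 1/2 and 0.5 count as the same answer.
print(verify(gold, answer))  # True
```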
RT @AlexGDimakis: Another interesting thing we discovered: When we post-trained OpenThinker-32B, *it removed censoring that was there in Qw…
0 replies · 2 retweets · 0 likes
RT @lschmidt3: Very nice community progress on open-data reasoning models since the R1 release!
0 replies · 1 retweet · 0 likes
RT @madiator: We are making good progress here, and today, we are releasing a very competitive open 32B reasoning model called OpenThinker-…
0 replies · 4 retweets · 0 likes
@wassollichhier No, not yet! We are looking into adding function calling and integrating BFCL into our evals.
0 replies · 0 retweets · 5 likes
RT @sedrickkeh2: 📣OpenThinker-32B is the best open-data reasoning model! We post-train Qwen2.5-32B-Instruct on our OpenThoughts-114k data…
0 replies · 1 retweet · 0 likes
RT @trungthvu: Awesome work by our OpenThinker team to create the best open-data 32B reasoning model! Our model closely matches or beats…
0 replies · 2 retweets · 0 likes