![Ido Ben-Shaul Profile](https://pbs.twimg.com/profile_images/1667832184131006465/WsQhbXHw_x96.jpg)
Ido Ben-Shaul
@ml_norms
Followers: 1K · Following: 6K · Statuses: 3K
Ido, 28, everything I find interesting in Math/ML/AI. AA-I Technologies 🦾 (Hit me up!) PhD in Appl Math @TelAvivUni.
Joined September 2018
@ylecun has been a hero of mine for more than a decade. His work introduced me to so many fields: ConvNets, EBMs, SSL, World Models, and many more. It's an honor to present our paper at #NeurIPS2023, with the best researchers in the world. I'm so thankful and inspired🙏
4 · 0 · 31
RT @imtiazprio: Indeed, it is that simple! The wiggliness induced by each layer allows NNs to approximate non-linear functions. More layers…
0 · 36 · 0
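The claim in this retweet, that stacking ReLU layers gives the network more linear pieces to bend around a smooth target, is easy to check numerically. Below is a minimal sketch (my own illustration, not code from the thread): a small ReLU MLP is fit to sin(x) at several depths, and the fit typically tightens as depth grows. `make_mlp` and the hyperparameters are arbitrary choices for the demo.

```python
# Minimal sketch (my illustration, not code from the thread): a ReLU MLP is
# piecewise linear, and stacking layers multiplies the number of linear
# pieces, which is what lets it bend around a smooth target like sin(x).
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.linspace(-3.14, 3.14, 256).unsqueeze(1)  # inputs on roughly [-pi, pi]
y = torch.sin(x)                                    # smooth, non-linear target

def make_mlp(depth: int, width: int = 32) -> nn.Sequential:
    """Stack `depth` hidden ReLU layers; each layer adds new linear regions."""
    layers, d_in = [], 1
    for _ in range(depth):
        layers += [nn.Linear(d_in, width), nn.ReLU()]
        d_in = width
    layers.append(nn.Linear(d_in, 1))
    return nn.Sequential(*layers)

for depth in (1, 2, 4):
    model = make_mlp(depth)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(2000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    print(f"depth={depth}  final MSE={loss.item():.5f}")
```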
RT @DimitrisPapail: o3 can't multiply beyond a few digits... But I think multiplication, addition, maze solving and easy-to-hard generaliz…
0 · 61 · 0
RT @DimitrisPapail: o3 can't multiply 10 digit numbers, but here is the acc of a 14m transformer that teaches itself how to do it, with ite…
0 · 62 · 0
RT @natolambert: Costa's just trying to make GRPO go brrr with no bugs and we're ending up with way better performance than the Tülu models…
0 · 17 · 0
RT @JeffDean: I'm delighted to have joined my good friend and colleague @NoamShazeer for a 2+hour conversation with @dwarkesh_sp about a wi…
0 · 186 · 0
RT @randall_balestr: Given a pretrained model, spline theory tells you how to alter its curvature by changing a single interpretable parame…
0 · 42 · 0
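The thread above is truncated, so the exact mechanism isn't quoted here. One concrete way a single scalar can control a pretrained network's curvature, in the spirit of the spline view, is to swap every ReLU for a smooth surrogate such as softplus with a sharpness parameter beta (ReLU is recovered as beta grows). The sketch below is only that assumption, not the paper's method; `smooth_relus` is a hypothetical helper.

```python
# Hedged sketch (the mechanism is assumed, not quoted from the thread):
# one scalar, beta, controls how sharply every activation bends by replacing
# each nn.ReLU in a pretrained model with nn.Softplus(beta).
import torch
import torch.nn as nn

def smooth_relus(model: nn.Module, beta: float) -> None:
    """Replace every nn.ReLU in `model` (recursively) with nn.Softplus(beta)."""
    for name, child in model.named_children():
        if isinstance(child, nn.ReLU):
            setattr(model, name, nn.Softplus(beta=beta))
        else:
            smooth_relus(child, beta)

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
smooth_relus(model, beta=2.0)  # smaller beta -> softer kinks in the mapping
print(model)
```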
RT @Yuchenj_UW: This is wild - UC Berkeley shows that a tiny 1.5B model beats o1-preview on math by RL! They applied simple RL to Deepseek…
0 · 365 · 0
RT @TaubenfeldAmir: New Preprint 🎉 LLM self-assessment unlocks efficient decoding ✅ Our Confidence-Informed Self-Consistency (CISC) metho…
0 · 19 · 0
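From the (truncated) announcement, CISC reads as a weighted variant of self-consistency: sample several answers, ask the model for a confidence score on each, and let that score weight the vote, so fewer samples are needed for the same accuracy. The sketch below reflects that reading only, not the paper's implementation; `sample_answer_with_confidence` is a hypothetical stand-in for one LLM call.

```python
# Sketch of confidence-weighted self-consistency under my reading of the
# announcement; the sampler is a hypothetical stand-in for one LLM call
# returning (answer, self-reported confidence).
import random
from collections import defaultdict
from typing import Callable, Tuple

def cisc_vote(sample: Callable[[], Tuple[str, float]], n_samples: int = 8) -> str:
    """Confidence-weighted majority vote over n sampled answers."""
    scores = defaultdict(float)
    for _ in range(n_samples):
        answer, confidence = sample()
        scores[answer] += confidence  # plain self-consistency would add 1.0 here
    return max(scores, key=scores.get)

# Toy usage with a fake sampler standing in for the model.
fake_sampler = lambda: random.choice([("42", 0.9), ("42", 0.8), ("41", 0.3)])
print(cisc_vote(fake_sampler))
```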
RT @pranavn1008: Announcing Matryoshka Quantization! A single Transformer can now be served at any integer precision!! In addition, our (sl…
0 · 82 · 0
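My reading of the "Matryoshka" idea from this announcement: keep one high-precision integer copy of the weights and obtain lower precisions by keeping only the most significant bits, so a single checkpoint can serve int8, int4, and int2. The sketch below only demonstrates that nested bit structure on a random matrix; how the model is actually trained or co-optimized across precisions is in the paper, and `msb_slice` is an illustrative helper, not their API.

```python
# Sketch of the nested ("Matryoshka") bit structure only, under my reading of
# the announcement; not the paper's method or API.
import torch

def msb_slice(q8: torch.Tensor, bits: int) -> torch.Tensor:
    """Keep the top `bits` most-significant bits of unsigned 8-bit codes."""
    return (q8.to(torch.int32) >> (8 - bits)).to(torch.uint8)

w = torch.randn(4, 4)

# Plain uniform quantization of w to unsigned 8-bit codes.
lo, scale = w.min(), (w.max() - w.min()) / 255.0
q8 = torch.clamp(torch.round((w - lo) / scale), 0, 255).to(torch.uint8)

for bits in (8, 4, 2):
    q = msb_slice(q8, bits)
    # A code at `bits` precision covers 2**(8 - bits) of the original 8-bit steps.
    w_hat = q.to(torch.float32) * scale * (2 ** (8 - bits)) + lo
    print(f"{bits}-bit reconstruction error: {(w - w_hat).abs().mean().item():.4f}")
```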
RT @iScienceLuvr: Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach. We study a novel language model architect…
0 · 180 · 0
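As far as the truncated abstract goes, the recurrent-depth idea is to scale test-time compute by re-applying a weight-tied block to the latent state more times, rather than by emitting more tokens. Below is a schematic sketch under that assumption; `RecurrentDepthLM`, the block choice, and the sizes are placeholders, not the paper's architecture.

```python
# Schematic sketch of recurrent depth (names and sizes are placeholders, not
# the paper's code): the same weight-tied block is applied to the latent
# state k times, so test-time compute scales with k, not with token count.
import torch
import torch.nn as nn

class RecurrentDepthLM(nn.Module):
    def __init__(self, d_model: int = 64, vocab: int = 1000):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.core = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, tokens: torch.Tensor, num_iters: int) -> torch.Tensor:
        h = self.embed(tokens)
        for _ in range(num_iters):  # more iterations = more "latent reasoning"
            h = self.core(h)        # weight-tied: the same block every step
        return self.head(h)

model = RecurrentDepthLM()
tokens = torch.randint(0, 1000, (1, 16))
for k in (1, 4, 16):
    print(k, model(tokens, num_iters=k).shape)
```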
RT @feeelix_feng: You think on-policy sampling gives the best reward models? Think again! 🔥 Our finding: Even with on-policy data, reward m…
0 · 39 · 0
RT @SFResearch: ⚡ Meet BOLT: A novel approach to develop long chain-of-thought reasoning in LLMs without relying on knowledge distillation…
0 · 30 · 0
RT @xiangyue96: Demystifying Long CoT Reasoning in LLMs. Reasoning models like R1 / O1 / O3 have gained massive atte…
0 · 192 · 0
RT @BachFrancis: An inspirational talk by Michael Jordan: a refreshing, deep, and forward-looking vision for AI beyond LLMs.
0 · 48 · 0