Ido Ben-Shaul

@ml_norms

Followers 1K · Following 6K · Statuses 3K

Ido, 28, everything I find interesting in Math/ML/AI. AA-I Technologies 🦾 (Hit me up!) PhD in Appl Math @TelAvivUni.

Joined September 2018
@ml_norms
Ido Ben-Shaul
1 year
@ylecun has been a hero of mine for more than a decade. His work introduced me to so many fields, from ConvNets, EBMs, SSL, World Models, and many more. It's an honor to present our paper at #NeurIPS2023, with the best researchers in the world. I'm so thankful and inspired🙏
@ylecun
Yann LeCun
1 year
With Ravid and Ido in front of our NeurIPS poster "reverse engineering Self-Supervised Learning"
4 · 0 · 31
@ml_norms
Ido Ben-Shaul
3 hours
RT @imtiazprio: Indeed, it is that simple! The wiggliness induced by each layer allows NNs to approximate non-linear functions. More layers…
0 · 36 · 0
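The retweet above claims that the "wiggliness" each ReLU layer adds is what lets neural networks approximate non-linear functions. A minimal sketch of that idea (my own illustration, not from the tweet): a one-hidden-layer ReLU network is piecewise linear, each hidden unit contributes one kink, so widening the network lets it bend to fit a non-linear target like sin(x). The network, training loop, and widths below are all assumptions chosen for the demo.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200)[:, None]  # inputs in [-pi, pi]
y = np.sin(x)                                 # non-linear target

def fit_relu_net(width, steps=3000, lr=0.05):
    """Train a tiny 1-hidden-layer ReLU net with plain gradient descent;
    return the final mean-squared error on the training grid."""
    W1 = rng.normal(0.0, 1.0, (1, width)); b1 = np.zeros(width)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(width), (width, 1)); b2 = np.zeros(1)
    for _ in range(steps):
        h = relu(x @ W1 + b1)          # hidden activations: one kink per unit
        pred = h @ W2 + b2             # piecewise-linear output
        err = pred - y
        # backprop through the two layers (MSE loss, mean over the batch)
        gW2 = h.T @ err / len(x); gb2 = err.mean(0)
        dh = (err @ W2.T) * (h > 0)
        gW1 = x.T @ dh / len(x); gb1 = dh.mean(0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return float(np.mean((relu(x @ W1 + b1) @ W2 + b2 - y) ** 2))

mse_narrow = fit_relu_net(2)   # only 2 kinks: a stiff, nearly linear fit
mse_wide = fit_relu_net(32)    # 32 kinks: enough wiggles to trace the sine
print(mse_narrow, mse_wide)
```

With only two hidden units the fit stays close to the best straight line, while the wider net's extra kinks drive the error down — the same mechanism, repeated across layers, is what the quoted thread attributes deep networks' expressivity to.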
@ml_norms
Ido Ben-Shaul
7 hours
@ShirPeled A good question even for those who do 😁 [translated from Hebrew]
1 · 0 · 1
@ml_norms
Ido Ben-Shaul
8 hours
RT @bclavie: What if a [MASK] was all you needed? ModernBERT is great, but we couldn't stop wondering if it could be greater than previous…
0 · 102 · 0
@ml_norms
Ido Ben-Shaul
15 hours
RT @DimitrisPapail: o3 can't multiply beyond a few digits... But I think multiplication, addition, maze solving and easy-to-hard generaliz…
0 · 61 · 0
@ml_norms
Ido Ben-Shaul
15 hours
RT @DimitrisPapail: o3 can't multiply 10 digit numbers, but here is the acc of a 14m transformer that teaches itself how to do it, with ite…
0 · 62 · 0
@ml_norms
Ido Ben-Shaul
15 hours
RT @natolambert: Costa's just trying to make GRPO go brrr with no bugs and we're ending up with way better performance than the Tülu models…
0 · 17 · 0
@ml_norms
Ido Ben-Shaul
16 hours
RT @sama: OPENAI ROADMAP UPDATE FOR GPT-4.5 and GPT-5: We want to do a better job of sharing our intended roadmap, and a much better job s…
0 · 4K · 0
@ml_norms
Ido Ben-Shaul
16 hours
RT @JeffDean: I'm delighted to have joined my good friend and colleague @NoamShazeer for a 2+hour conversation with @dwarkesh_sp about a wi…
0 · 186 · 0
@ml_norms
Ido Ben-Shaul
22 hours
RT @randall_balestr: Given a pretrained model, spline theory tells you how to alter its curvature by changing a single interpretable parame…
0 · 42 · 0
@ml_norms
Ido Ben-Shaul
24 hours
RT @Yuchenj_UW: This is wild - UC Berkeley shows that a tiny 1.5B model beats o1-preview on math by RL! They applied simple RL to Deepseek…
0 · 365 · 0
@ml_norms
Ido Ben-Shaul
2 days
RT @TaubenfeldAmir: New Preprint 🎉 LLM self-assessment unlocks efficient decoding ✅ Our Confidence-Informed Self-Consistency (CISC) metho…
0 · 19 · 0
@ml_norms
Ido Ben-Shaul
2 days
RT @pranavn1008: Announcing Matryoshka Quantization! A single Transformer can now be served at any integer precision!! In addition, our (sl…
0 · 82 · 0
@ml_norms
Ido Ben-Shaul
2 days
RT @sama: Three Observations:
0 · 1K · 0
@ml_norms
Ido Ben-Shaul
3 days
RT @iScienceLuvr: Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach We study a novel language model architect…
0 · 180 · 0
@ml_norms
Ido Ben-Shaul
4 days
RT @hkproj: Last year I made a video on Reinforcement Learning from Human Feedback, deriving the PPO loss from first principles. Thanks to…
0 · 37 · 0
@ml_norms
Ido Ben-Shaul
4 days
RT @feeelix_feng: You think on-policy sampling gives the best reward models? Think again! 🔥 Our finding: Even with on-policy data, reward m…
0 · 39 · 0
@ml_norms
Ido Ben-Shaul
5 days
RT @SFResearch: ⚡ Meet BOLT: A novel approach to develop long chain-of-thought reasoning in LLMs without relying on knowledge distillation…
0 · 30 · 0
@ml_norms
Ido Ben-Shaul
5 days
RT @_lewtun: I'm running a shit-ton of GRPO experiments on DeepSeek's distilled models with the LIMO dataset and it really works well 🔥! D…
0 · 36 · 0
@ml_norms
Ido Ben-Shaul
6 days
RT @xiangyue96: Demystifying Long CoT Reasoning in LLMs Reasoning models like R1 / O1 / O3 have gained massive atte…
0 · 192 · 0
@ml_norms
Ido Ben-Shaul
6 days
RT @BachFrancis: An inspirational talk by Michael Jordan: a refreshing, deep, and forward-looking vision for AI beyond LLMs.
0 · 48 · 0