![Ramon Astudillo Profile](https://pbs.twimg.com/profile_images/1124758674931703808/fHTtaFbd_x96.jpg)
Ramon Astudillo
@RamonAstudill12
Followers: 547 · Following: 3K · Statuses: 2K
Principal RS at IBM Research AI. Speech, Formal/Natural Language Processing. Currently LLM post-training, structured SDG/RL. Opinions my own and non-stationary.
Manhattan, NY
Joined April 2019
RT @DimitrisPapail: We should be seriously asking, how a 1.5B model that can't answer basic questions can also be that good at competition…
@mblondel_ml @andrew_n_carr +1, but if you torture rejection sampling a bit (let the temperature tend to zero and approximate the scaling factor from the proposed samples), it ends up giving best-of-N (the RSO paper makes this point).
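A minimal sketch of that point (NumPy only, with made-up reward scores for illustration): rejection sampling from the reward-tilted distribution π_ref(y)·exp(r(y)/β), where the scaling factor M is estimated from the N proposed samples themselves, concentrates on the highest-reward sample as β tends to zero, i.e. it reduces to best-of-N.

```python
import numpy as np

def rejection_sample(rewards, beta, rng):
    """Pick one of the N proposals by rejection sampling against exp(r/beta)."""
    rewards = np.asarray(rewards, dtype=float)
    # scaling factor M approximated from the proposals themselves: M = exp(max(r)/beta)
    accept_prob = np.exp((rewards - rewards.max()) / beta)
    while True:
        i = rng.integers(len(rewards))        # propose uniformly among the N samples
        if rng.random() < accept_prob[i]:
            return i

rng = np.random.default_rng(0)
rewards = [0.1, 0.7, 0.4, 0.9, 0.3]           # hypothetical reward-model scores for N=5 samples
best = int(np.argmax(rewards))
for beta in (1.0, 0.1, 0.01):
    picks = [rejection_sample(rewards, beta, rng) for _ in range(2000)]
    share = np.mean([p == best for p in picks])
    print(f"beta={beta}: best-of-N sample chosen {share:.0%} of the time")
```

At β = 1.0 the accepted sample is spread over the proposals; by β = 0.01 it is essentially always the argmax-reward one.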
RT @andre_t_martins: Good to see @EU_Commission promoting OS LLMs in Europe. However (1) "OpenEuroLLM" is appropriating a name (#EuroLLM) w…
@dearmadisonblue You mean distilling an existing diffusion model into visual CoT, or training one from scratch? The latter seems to be the hard one. You will end up with a VAE anyway; there were models like this, such as DRAW.
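A minimal sketch of a DRAW-style recurrent VAE (no attention), assuming PyTorch and illustrative layer sizes for a flattened 28×28 input; it shows the sequential canvas refinement that makes such models a kind of visual CoT. Nothing here is the original DRAW implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiniDRAW(nn.Module):
    def __init__(self, x_dim=784, h_dim=256, z_dim=32, steps=8):
        super().__init__()
        self.x_dim, self.h_dim, self.steps = x_dim, h_dim, steps
        # encoder reads the image, the current error image, and the previous decoder state
        self.enc_rnn = nn.LSTMCell(2 * x_dim + h_dim, h_dim)
        self.dec_rnn = nn.LSTMCell(z_dim, h_dim)
        self.to_mu = nn.Linear(h_dim, z_dim)
        self.to_logvar = nn.Linear(h_dim, z_dim)
        self.write = nn.Linear(h_dim, x_dim)

    def forward(self, x):
        B = x.size(0)
        h_enc = c_enc = torch.zeros(B, self.h_dim)
        h_dec = c_dec = torch.zeros(B, self.h_dim)
        canvas = torch.zeros(B, self.x_dim)
        kl = torch.zeros(B)
        for _ in range(self.steps):
            err = x - torch.sigmoid(canvas)                           # what is still missing
            h_enc, c_enc = self.enc_rnn(torch.cat([x, err, h_dec], 1), (h_enc, c_enc))
            mu, logvar = self.to_mu(h_enc), self.to_logvar(h_enc)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
            h_dec, c_dec = self.dec_rnn(z, (h_dec, c_dec))
            canvas = canvas + self.write(h_dec)                       # additive canvas update
            kl = kl + 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(1)
        recon = F.binary_cross_entropy_with_logits(canvas, x, reduction="none").sum(1)
        return (recon + kl).mean()                                    # ELBO-style loss

# toy usage: one backward pass on random "images"
model = MiniDRAW()
loss = model(torch.rand(16, 784))
loss.backward()
```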
RT @Yikang_Shen: It's good to see Deepseek v3 draw everyone's attention to reducing the training cost of LLM. Over the last two years, we…
RT @seirasto: 🌟New Benchmark! 🌟 Do you work on RAG? Are you interested in Multi-Turn conversations? Very excited to share the new MTRAG be…
@ch402 ☝️ So TLDR, IMO this all feels organic and tech/market driven rather than a conscious change of style. These forces may last until AGI, or maybe we hit a serious winter and return to the old ways of more exploration, less exploitation.
@Teknium1 ☝️ That does not mean the model never saw human CoTs. It probably saw a huge amount of high-quality ones, because the initial policy must be some GPT. But, ofc, if you apply some STaR-like search to that model, you are going to get better ones.
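A minimal sketch of the STaR-like loop referenced above, with `generate_cot` and the tiny dataset as hypothetical stand-ins for a real policy call and answer parser: sample rationales from the current model, keep only those whose final answer matches the reference, and fine-tune on the kept ones.

```python
import random

def generate_cot(model, question, temperature=0.8):
    # hypothetical stand-in for sampling a rationale + final answer from the policy
    answer = random.choice(["4", "5", "6"])
    return f"Let me think step by step... so the answer is {answer}", answer

def star_round(model, dataset, samples_per_question=8):
    kept = []
    for question, reference in dataset:
        for _ in range(samples_per_question):
            cot, answer = generate_cot(model, question)
            if answer == reference:            # keep only rationales with the correct final answer
                kept.append((question, cot))
                break                          # one good rationale per question is enough here
    return kept                                # fine-tune the policy on these, then repeat

dataset = [("What is 2 + 2?", "4"), ("What is 3 + 3?", "6")]
print(star_round(model=None, dataset=dataset))
```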