![Anton Lozhkov Profile](https://pbs.twimg.com/profile_images/1191631044027592704/SiNRBFVA_x96.jpg)
Anton Lozhkov
@anton_lozhkov
Followers: 2K
Following: 2K
Statuses: 425
Open-sourcing Language Models @huggingface ✨
Joined January 2015
why would someone bookmark this 😭
@Promptmethus Math is the gateway drug, we're moving on to harder stuff this week
1
0
3
@nooriefyi Having enough VRAM for the KV cache, to generate long responses at reasonable batch sizes! @lmsysorg SGLang was essential with their MLA support
0
0
11
RT @JiaLi52524397: 🚀 NuminaMath 1.5 is here! 🚀 900k+ high-quality competition math problems with CoT solutions, new problem metadata, manua…
0
68
0
RT @LoubnaBenAllal1: We just published the second OpenR1 update with OpenR1-220k-Math, our new large-scale dataset for mathematical reasoni…
0
60
0
RT @simonw: Today I found out about SmolLM2-135M-Instruct, a tiny LLM which quantizes down to just below 100MB... which means it can fit in…
0
64
0
RT @LoubnaBenAllal1: The wait is over: our SmolLM2 paper is out—a detailed guide for building SOTA small LMs. While most LM papers skim ove…
0
104
0
RT @kimmonismus: Within 24 hours, OpenAI's Deep Research has been replicated by an open-source version that already scores 54% on the same…
0
737
0
RT @carrigmat: Complete hardware + software setup for running Deepseek-R1 locally. The actual model, no distillations, and Q8 quantization…
0
4K
0
RT @edwardbeeching: As part of our open reproduction of R1, we have roughly reproduced DeepSeek's MATH-500 eval numbers with Hugging Face's…
0
116
0
RT @lvwerra: We're just a few weeks away from having a fully open pipeline of R1 and everybody who can rent some GPUs can train their own v…
0
20
0
RT @QGallouedec: Last moments of closed-source AI 🪦 : Hugging Face is openly reproducing the pipeline of 🐳 DeepSeek-R1. Open data, open tr…
0
435
0
RT @andimarafioti: Smol but mighty: • 256M delivers 80% of the performance of our 2.2B model. • 500M hits 90%. Both beat our SOTA 80B model…
0
8
0
RT @andimarafioti: Introducing the smollest VLMs yet! 🤏 SmolVLM (256M & 500M) runs on <1GB GPU memory. Fine-tune it on your laptop and run…
0
122
0
RT @HKydlicek: 🚀 We've boosted MATH benchmark scores for popular models by 65% —no training or model changes needed! The secret? Math-Ve…
0
62
0