![samsja Profile](https://pbs.twimg.com/profile_images/1513761647009177603/hyRXz37w_x96.jpg)
samsja
@samsja19
Followers
3K
Following
6K
Statuses
2K
training LLM across the globe at @PrimeIntellect
sf
Joined March 2020
Intellect 1 is out. It's a 10B model trained across 3 continents on 100+ H100s, with 30 individual compute contributors. The evals are good (for 1T tokens), and the model is live. I can't stress enough how important this release is for open-source AI. Decentralized training is the only path toward sovereign open-source foundation models. This release proves that it's not just a fairy tale - it's working, and it's just the beginning.
Releasing INTELLECT-1: We're open-sourcing the first decentrally trained 10B model:
- INTELLECT-1 base model & intermediate checkpoints
- Pre-training dataset
- Post-trained instruct models by @arcee_ai
- PRIME training framework
- Technical paper with all details
9
54
356
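For readers wondering what "decentralized training" means mechanically here: Prime Intellect's published work (OpenDiLoCo) builds on DiLoCo-style local updates, where each site trains independently and only an averaged weight delta crosses the slow inter-continental link. Below is a minimal single-process sketch of that idea; every name (`diloco_round`, `outer_opt`, the hyperparameters) is purely illustrative and not PRIME's actual API.

```python
# Minimal sketch of DiLoCo-style local-update training, the idea behind
# PRIME / OpenDiLoCo. All names are illustrative; this is not PRIME's API.
import copy
import torch
import torch.nn.functional as F

def diloco_round(global_model, outer_opt, worker_shards, local_steps=500):
    """One outer step: every worker trains locally, then only the averaged
    weight delta (the "pseudo-gradient") crosses the slow WAN link."""
    replicas = [copy.deepcopy(global_model) for _ in worker_shards]
    for replica, shard in zip(replicas, worker_shards):
        inner_opt = torch.optim.AdamW(replica.parameters(), lr=1e-4)
        for _, (x, y) in zip(range(local_steps), shard):
            loss = F.cross_entropy(replica(x), y)
            inner_opt.zero_grad()
            loss.backward()
            inner_opt.step()
    # Pseudo-gradient = old weights - mean of new weights; this is the only
    # tensor that gets communicated between continents.
    with torch.no_grad():
        for name, p in global_model.named_parameters():
            mean_new = torch.stack(
                [dict(r.named_parameters())[name] for r in replicas]).mean(0)
            p.grad = p.detach() - mean_new
    outer_opt.step()   # e.g. SGD(lr=0.7, momentum=0.9, nesterov=True)
    outer_opt.zero_grad()
```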
RT @johannes_hage: great work! very interesting approach to initially limit the context length so the model learns to utilize it more effectively
0
1
0
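The approach being praised here, starting training at a short context and growing it in stages, reduces to a simple length schedule. A sketch, with stage boundaries and lengths invented for illustration rather than taken from any real training recipe:

```python
# Sketch of a context-length curriculum: train on short sequences first,
# then grow the window. Stage boundaries and lengths are made up.
STAGES = [(0, 2048), (100_000, 4096), (200_000, 8192)]  # (start_step, seq_len)

def seq_len_at(step: int) -> int:
    length = STAGES[0][1]
    for start, seq_len in STAGES:
        if step >= start:
            length = seq_len
    return length

# Batches are then truncated (or packed) to the current length:
# x = tokens[:, :seq_len_at(step)]
```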
epic what ppl can do with a bit of compute
i'm open-sourcing "smolattn", a minimal implementation of the flash attention 1 algorithm in CUDA that is almost 6 times faster than PyTorch's manual implementation (for small sequence lengths up to 1024). the entire kernel is less than 200 lines of code.
2
0
26
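smolattn itself is a CUDA kernel; the sketch below is only a PyTorch restatement of the flash-attention-1 idea it implements: process K/V in tiles with a running ("online") softmax so the full N x N score matrix is never materialized. `flash_attn_tiled` and its defaults are made up for illustration.

```python
# PyTorch restatement of the flash-attention-1 idea behind smolattn:
# tile over K/V with an online softmax, so the full (N x N) score
# matrix is never materialized. Single head, no masking, for clarity.
import math
import torch

def flash_attn_tiled(q, k, v, tile=256):
    # q, k, v: (N, d)
    N, d = q.shape
    scale = 1.0 / math.sqrt(d)
    out = torch.zeros_like(q)
    row_max = torch.full((N, 1), float("-inf"))   # running max m_i
    row_sum = torch.zeros(N, 1)                   # running denominator l_i
    for j in range(0, N, tile):
        kj, vj = k[j:j + tile], v[j:j + tile]
        s = (q @ kj.T) * scale                    # (N, tile) scores
        new_max = torch.maximum(row_max, s.max(dim=-1, keepdim=True).values)
        correction = torch.exp(row_max - new_max) # rescale old statistics
        p = torch.exp(s - new_max)
        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        out = out * correction + p @ vj
        row_max = new_max
    return out / row_sum

# sanity check against the "manual" PyTorch implementation
q, k, v = (torch.randn(1024, 64) for _ in range(3))
ref = torch.softmax(q @ k.T / math.sqrt(64), dim=-1) @ v
assert torch.allclose(flash_attn_tiled(q, k, v), ref, atol=1e-4)
```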
RT @LoubnaBenAllal1: We just published the second OpenR1 update with OpenR1-220k-Math, our new large-scale dataset for mathematical reasoning
0
45
0
@AvpElk @GaryMarcus What does "pure LLM" even mean? Again, one should not confuse architecture (LLM) with objective (next-token prediction, RL, ...)
1
0
1
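The architecture-vs-objective distinction in this reply can be made concrete: the very same network can be trained with next-token cross-entropy or with an RL objective, and only the loss changes. A toy sketch using REINFORCE as a stand-in RL objective; `model` and `reward_fn` are hypothetical placeholders, not any particular system:

```python
# Toy illustration of "architecture vs objective": the same model (an LLM
# architecture) under two different training objectives. `model` returns
# logits of shape (B, T, V); `reward_fn` is a hypothetical rollout scorer.
import torch
import torch.nn.functional as F

def next_token_loss(model, tokens):
    # Objective 1: standard next-token prediction (cross-entropy).
    logits = model(tokens[:, :-1])                  # (B, T-1, V)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           tokens[:, 1:].reshape(-1))

def reinforce_loss(model, prompt, reward_fn, n_new=32):
    # Objective 2: RL (REINFORCE) on sampled continuations of a prompt.
    tokens, logps = prompt, []
    for _ in range(n_new):
        logits = model(tokens)[:, -1]               # (B, V) for last position
        dist = torch.distributions.Categorical(logits=logits)
        nxt = dist.sample()
        logps.append(dist.log_prob(nxt))
        tokens = torch.cat([tokens, nxt[:, None]], dim=1)
    reward = reward_fn(tokens)                      # (B,) scalar rewards
    return -(reward.detach() * torch.stack(logps, dim=1).sum(1)).mean()
```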
@TimDarcet @giffmana What if it was doing RL indefinitely? As in, inference is just part of the RL process
0
0
0
@aryaman2020 @jeremyphoward I was making the argument that this approach would be the best way to scale reasoning. I have been proven wrong since, tho
0
0
5
@Dorialexander @dylan522p @ChrSzegedy Again, I am not picking a camp or trying to say who is right or wrong, but one needs to put the scaling part into perspective
0
0
1