![Tejas Vaidhya Profile](https://pbs.twimg.com/profile_images/1538127491914731521/P4qB0vkK_x96.jpg)
Tejas Vaidhya
@imtejas13
Followers: 356
Following: 3K
Statuses: 563
Building @NolanoOrg | Curious about everything | MSc student @Mila_Quebec. Previously: @iitkgp, @ETH_en, @ZFellows_
Montréal, Québec
Joined April 2020
@ArnabMondal96 @_AyushKaushal @irinarish @iclr_conf Most of those reviews are vague and uninformative. Some also make the unreasonable request that the 4B+ model be trained until convergence, which will not happen until 11 trillion tokens.
0
0
0
RT @benjamintherien: Learned optimizers can’t generalize to large unseen tasks…. Until now! Excited to present μLO: Compute-Efficient Meta-…
0
33
0
RT @irinarish: @BensenHsu @sama Indeed, memory is becoming a bottleneck, but there is lots of work on quantization/compression and also tra…
0
2
0
RT @ArnabMondal96: Looking for PhD interns for Summer 2025 with strong publication records. Experience with video-language foundation model…
0
69
0
There is more to it: the scaling curve you forgot to talk about. I will be archiving the full report soon.
Deepsilicon runs neural nets with 5x less RAM and ~20x faster. They are building SW and custom silicon for it. What’s interesting is that they have proved it with SW, and you can even try it. On why we funded them 1/7
0
0
4
RT @GCResearchTeam: Spectra by @NolanoOrg is an open suite of 54 LLMs spanning FP16 training, ternary training, and post-training quantisat…
0
2
0
RT @BlancheMinerva: Very cool paper that shows impressive performance with ternary LLMs. Discovering new papers that use @AiEleuther's GPT-…
0
3
0