attentionmech

@attentionmech

Followers
237
Following
17K
Statuses
1K

swe | learning ai/ml | philomath | cs

future
Joined December 2024
@attentionmech
attentionmech
41 minutes
@kalomaze Satya to Sam: "No more AGI hype, focus on products"
0
0
1
@attentionmech
attentionmech
2 hours
@kalomaze @kalocide they can just rent those gpus here, and do better
0
0
1
@attentionmech
attentionmech
3 hours
@qtnx_ will check, thnx.
0
0
1
@attentionmech
attentionmech
3 hours
@qtnx_ you can try those 1/N threads maybe. but i can understand the struggle with perfection on the writing part.
1
0
1
@attentionmech
attentionmech
3 hours
@qtnx_ would you consider writing any blogs on what you built? I think you do ship cool stuff (and fast too)
1
0
3
@attentionmech
attentionmech
4 hours
@_trish_07 starting to like zig as a simple step up over c. tried rust but nope, too many things going on.
0
0
0
@attentionmech
attentionmech
4 hours
@Teknium1 intelligence explosion doesn't mean infinite time abundance as well imo. you have still got limited time, and what you prioritize to build and sell in that time will still matter.
1
0
0
@attentionmech
attentionmech
5 hours
reading this today
[image]
0
0
2
@attentionmech
attentionmech
5 hours
@nearcyan captcha, but to weed out scammers from DMs.
0
0
0
@attentionmech
attentionmech
5 hours
RT @abacaj: Why does qwen2.5 0.5b base work and llama 3.2 1b base not work nearly as well? Qwen models are effectively pretrained with SFT…
0
27
0
@attentionmech
attentionmech
5 hours
RT @kellerjordan0: New NanoGPT-Medium speedrun record: 2.92 FineWeb val loss in 28.1 minutes on 8xH100 Previous record: 29.3 minutes Chang…
0
17
0
@attentionmech
attentionmech
6 hours
@tokenbender limbo is such an experience. the only things close that i have are unravel and then maybe it takes two.
0
0
0
@attentionmech
attentionmech
6 hours
RT @unixpickle: Trying Muon for a hobby project. Blue is Adam, green is Muon. Code is pretty simple too. https://t.…
0
7
0
@attentionmech
attentionmech
11 hours
@justinskycak so that's where i've been wrong all this time.
0
0
0
@attentionmech
attentionmech
13 hours
RT @helloiamleonie: 4 tutorials that helped me learn LLM fine-tuning in 3 weeks: 1. “How to Fine-Tune LLMs in 2024 with Hugging Face” by @…
0
273
0
@attentionmech
attentionmech
13 hours
RT @novasarc01: use leetgpu to practice writing cuda kernels...great ui, no gpu required, completely free!
[image]
0
76
0
@attentionmech
attentionmech
14 hours
RT @kalomaze: on today's episode of "you can just do things" did you know you can split the MLP projections of a dense Transformer by a gi…
0
3
0
@attentionmech
attentionmech
14 hours
@remstack @theirtestuser is he talking about kache lmao?
0
0
0
@attentionmech
attentionmech
15 hours
RT @kalomaze: this is an old paper but i believe unfamiliar people should read it
[image]
0
70
0
@attentionmech
attentionmech
15 hours
RT @kalomaze: the elites don't want you to know this but you can train your base LLM into a classifier without initializing new lm_head pa…
0
12
0