![attentionmech Profile](https://pbs.twimg.com/profile_images/1885706817327480832/_eYAhpbb_x96.png)
attentionmech
@attentionmech
237 Followers · 17K Following · 1K Statuses
swe | learning ai/ml | philomath | cs
future
Joined December 2024
@qtnx_ you can try those 1/N threads maybe. but i can understand the struggle with perfectionism on the writing part.
@qtnx_ would you consider writing any blogs on what you built? I think you do ship cool stuff (and fast too)
@_trish_07 starting to like zig as a simple bump up over c. tried rust but nope, too many things going on.
@Teknium1 an intelligence explosion doesn't mean infinite time abundance as well imo. you've still got limited time, and what you prioritize to build and sell in that time will still matter.
RT @abacaj: Why does qwen2.5 0.5b base work and llama 3.2 1b base not work nearly as well? Qwen models are effectively pretrained with SFT…
RT @kellerjordan0: New NanoGPT-Medium speedrun record: 2.92 FineWeb val loss in 28.1 minutes on 8xH100. Previous record: 29.3 minutes. Chang…
@tokenbender limbo is such an experience. the only thing close i have is unravel, and then maybe it takes two.
RT @unixpickle: Trying Muon for a hobby project. Blue is Adam, green is Muon. Code is pretty simple too. https://t.…
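The Muon retweet above links code that is cut off, so here is a minimal Python sketch of the idea as publicly described: keep SGD momentum, but approximately orthogonalize each 2D weight's update before applying it. The cubic Newton-Schulz iteration, hyperparameters, and function names below are illustrative assumptions, not unixpickle's code or the reference implementation.

```python
# Minimal Muon-style update sketch (assumption: based on public descriptions
# of Muon; the real implementation uses a tuned higher-order iteration).
import torch

def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximate the orthogonal polar factor of a 2D matrix with the
    classic cubic Newton-Schulz iteration."""
    x = g / (g.norm() + 1e-7)            # scale so singular values are <= 1
    for _ in range(steps):
        x = 1.5 * x - 0.5 * x @ x.mT @ x
    return x

@torch.no_grad()
def muon_like_step(params, momentum_buffers, lr=0.02, beta=0.95):
    """One hand-rolled update over a list of 2D weight tensors."""
    for p, buf in zip(params, momentum_buffers):
        buf.mul_(beta).add_(p.grad)              # SGD momentum
        update = newton_schulz_orthogonalize(buf)
        p.add_(update, alpha=-lr)

# toy usage: a single 64x64 weight with a dummy gradient
w = torch.randn(64, 64, requires_grad=True)
w.grad = torch.randn_like(w)
buf = torch.zeros_like(w)
muon_like_step([w], [buf])
```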
RT @helloiamleonie: 4 tutorials that helped me learn LLM fine-tuning in 3 weeks: 1. “How to Fine-Tune LLMs in 2024 with Hugging Face” by @…
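The fine-tuning thread above is truncated; as a rough companion, a minimal sketch of supervised fine-tuning with the Hugging Face `transformers` Trainer. The model name, data file, and hyperparameters are placeholders, not taken from the listed tutorials.

```python
# Minimal supervised fine-tuning sketch with Hugging Face `transformers`.
# Assumptions: model name, data file, and hyperparameters are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "Qwen/Qwen2.5-0.5B"                     # any small causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# toy text corpus; in practice this would be your formatted SFT dataset
ds = load_dataset("text", data_files={"train": "sft_corpus.txt"})["train"]
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
            remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-out",
                           per_device_train_batch_size=4,
                           num_train_epochs=1,
                           learning_rate=2e-5,
                           logging_steps=10),
    train_dataset=ds,
    # causal-LM collator: pads batches and copies input_ids into labels
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```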
RT @novasarc01: use leetgpu to practice writing cuda kernels...great ui, no gpu required, completely free!
RT @kalomaze: on today's episode of "you can just do things" did you know you can split the MLP projections of a dense Transformer by a gi…
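The MLP-splitting retweet is cut off, so this is only one plausible reading of it: slice a dense MLP's up and down projections along the hidden dimension so the same weights act as several smaller expert MLPs. The function and parameter names (`split_dense_mlp`, `n_experts`) are made up for illustration.

```python
# Sketch of splitting a dense transformer MLP into N equal "expert" slices.
# Assumption: this is a guess at the truncated tweet's idea, not its code.
import torch
import torch.nn as nn

def split_dense_mlp(up: nn.Linear, down: nn.Linear, n_experts: int):
    """Split a dense MLP (up: d_model->d_ff, down: d_ff->d_model) along d_ff."""
    d_ff = up.out_features
    assert d_ff % n_experts == 0
    chunk = d_ff // n_experts
    experts = []
    for i in range(n_experts):
        sl = slice(i * chunk, (i + 1) * chunk)
        e_up = nn.Linear(up.in_features, chunk, bias=up.bias is not None)
        e_down = nn.Linear(chunk, down.out_features, bias=down.bias is not None)
        with torch.no_grad():
            e_up.weight.copy_(up.weight[sl, :])        # rows of the up-proj
            e_down.weight.copy_(down.weight[:, sl])    # columns of the down-proj
            if up.bias is not None:
                e_up.bias.copy_(up.bias[sl])
            if down.bias is not None:
                e_down.bias.copy_(down.bias / n_experts)  # share the down bias
        experts.append(nn.Sequential(e_up, nn.GELU(), e_down))
    return experts

# sanity check: summing all expert outputs recovers the dense MLP output
dense_up, dense_down = nn.Linear(16, 64), nn.Linear(64, 16)
x = torch.randn(2, 16)
dense_out = dense_down(torch.nn.functional.gelu(dense_up(x)))
split_out = sum(e(x) for e in split_dense_mlp(dense_up, dense_down, 4))
print(torch.allclose(dense_out, split_out, atol=1e-5))
```

The split is exact because GELU is elementwise and the down-projection is linear, so summing the slices reproduces the original matmul.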
RT @kalomaze: the elites don't want you to know this but you can train your base LLM into a classifier without initializing new lm_head pa…
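The lm_head retweet is also truncated; one common version of that trick is to reuse existing vocabulary tokens as class labels, so no new classification head has to be initialized. The sketch below assumes that reading; the model name and the " yes"/" no" verbalizer tokens are arbitrary choices.

```python
# Sketch: train a base causal LM as a classifier by reusing lm_head logits
# for chosen vocabulary tokens. Model and verbalizer tokens are assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B"
tok = AutoTokenizer.from_pretrained(model_name)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
tok.padding_side = "left"   # so index -1 is always the last real token
model = AutoModelForCausalLM.from_pretrained(model_name)

# one existing vocabulary token id per class, no new parameters needed
class_token_ids = torch.tensor([
    tok.encode(" yes", add_special_tokens=False)[0],   # class 0
    tok.encode(" no", add_special_tokens=False)[0],    # class 1
])

def classification_loss(texts, labels):
    batch = tok(texts, return_tensors="pt", padding=True)
    out = model(**batch)
    last_logits = out.logits[:, -1, :]               # (batch, vocab)
    class_logits = last_logits[:, class_token_ids]   # (batch, n_classes)
    return F.cross_entropy(class_logits, labels)

loss = classification_loss(["great movie, would watch again"], torch.tensor([0]))
loss.backward()   # fine-tune with any optimizer from here
```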