![attentionmech Profile](https://pbs.twimg.com/profile_images/1885706817327480832/_eYAhpbb_x96.png)
attentionmech
@attentionmech
237 Followers · 17K Following · 1K Statuses
swe | learning ai/ml | philomath | cs
future
Joined December 2024
@qtnx_ you can try those 1/N threads maybe. but i can understand the struggle with perfectionism on the writing part.
@qtnx_ would you consider writing any blogs on what you built? I think you do ship cool stuff (and fast too)
@_trish_07 starting to like zig as a simple bump up over c. tried rust but nope, too many things going on.
@Teknium1 an intelligence explosion doesn't mean infinite time abundance as well imo. you've still got limited time, and what you prioritize to build and sell in that time will still matter.
RT @abacaj: Why does qwen2.5 0.5b base work and llama 3.2 1b base not work nearly as well? Qwen models are effectively pretrained with SFT…
RT @kellerjordan0: New NanoGPT-Medium speedrun record: 2.92 FineWeb val loss in 28.1 minutes on 8xH100. Previous record: 29.3 minutes. Chang…
@tokenbender limbo is such an experience. the only thing close i have is unravel, and then maybe it takes two.
RT @unixpickle: Trying Muon for a hobby project. Blue is Adam, green is Muon. Code is pretty simple too. https://t.…
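The Muon retweet above links code that is cut off, so here is a minimal Python sketch of the idea as publicly described: keep SGD momentum, but approximately orthogonalize each 2D weight's update before applying it. The cubic Newton-Schulz iteration, hyperparameters, and function names below are illustrative assumptions, not unixpickle's code or the reference implementation.

```python
# Minimal Muon-style update sketch (assumption: based on public descriptions
# of Muon; the real implementation uses a tuned higher-order iteration).
import torch

def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximate the orthogonal polar factor of a 2D matrix with the
    classic cubic Newton-Schulz iteration."""
    x = g / (g.norm() + 1e-7)            # scale so singular values are <= 1
    for _ in range(steps):
        x = 1.5 * x - 0.5 * x @ x.mT @ x
    return x

@torch.no_grad()
def muon_like_step(params, momentum_buffers, lr=0.02, beta=0.95):
    """One hand-rolled update over a list of 2D weight tensors."""
    for p, buf in zip(params, momentum_buffers):
        buf.mul_(beta).add_(p.grad)              # SGD momentum
        update = newton_schulz_orthogonalize(buf)
        p.add_(update, alpha=-lr)

# toy usage: a single 64x64 weight with a dummy gradient
w = torch.randn(64, 64, requires_grad=True)
w.grad = torch.randn_like(w)
buf = torch.zeros_like(w)
muon_like_step([w], [buf])
```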
RT @helloiamleonie: 4 tutorials that helped me learn LLM fine-tuning in 3 weeks: 1. “How to Fine-Tune LLMs in 2024 with Hugging Face” by @…
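The fine-tuning thread above is truncated; as a rough companion, a minimal sketch of supervised fine-tuning with the Hugging Face `transformers` Trainer. The model name, data file, and hyperparameters are placeholders, not taken from the listed tutorials.

```python
# Minimal supervised fine-tuning sketch with Hugging Face `transformers`.
# Assumptions: model name, data file, and hyperparameters are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "Qwen/Qwen2.5-0.5B"                     # any small causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# toy text corpus; in practice this would be your formatted SFT dataset
ds = load_dataset("text", data_files={"train": "sft_corpus.txt"})["train"]
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
            remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-out",
                           per_device_train_batch_size=4,
                           num_train_epochs=1,
                           learning_rate=2e-5,
                           logging_steps=10),
    train_dataset=ds,
    # causal-LM collator: pads batches and copies input_ids into labels
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```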
RT @novasarc01: use leetgpu to practice writing cuda kernels...great ui, no gpu required, completely free!
RT @kalomaze: on today's episode of "you can just do things" did you know you can split the MLP projections of a dense Transformer by a gi…
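The MLP-splitting retweet is cut off, so this is only one plausible reading of it: slice a dense MLP's up and down projections along the hidden dimension so the same weights act as several smaller expert MLPs. The function and parameter names (`split_dense_mlp`, `n_experts`) are made up for illustration.

```python
# Sketch of splitting a dense transformer MLP into N equal "expert" slices.
# Assumption: this is a guess at the truncated tweet's idea, not its code.
import torch
import torch.nn as nn

def split_dense_mlp(up: nn.Linear, down: nn.Linear, n_experts: int):
    """Split a dense MLP (up: d_model->d_ff, down: d_ff->d_model) along d_ff."""
    d_ff = up.out_features
    assert d_ff % n_experts == 0
    chunk = d_ff // n_experts
    experts = []
    for i in range(n_experts):
        sl = slice(i * chunk, (i + 1) * chunk)
        e_up = nn.Linear(up.in_features, chunk, bias=up.bias is not None)
        e_down = nn.Linear(chunk, down.out_features, bias=down.bias is not None)
        with torch.no_grad():
            e_up.weight.copy_(up.weight[sl, :])        # rows of the up-proj
            e_down.weight.copy_(down.weight[:, sl])    # columns of the down-proj
            if up.bias is not None:
                e_up.bias.copy_(up.bias[sl])
            if down.bias is not None:
                e_down.bias.copy_(down.bias / n_experts)  # share the down bias
        experts.append(nn.Sequential(e_up, nn.GELU(), e_down))
    return experts

# sanity check: summing all expert outputs recovers the dense MLP output
dense_up, dense_down = nn.Linear(16, 64), nn.Linear(64, 16)
x = torch.randn(2, 16)
dense_out = dense_down(torch.nn.functional.gelu(dense_up(x)))
split_out = sum(e(x) for e in split_dense_mlp(dense_up, dense_down, 4))
print(torch.allclose(dense_out, split_out, atol=1e-5))
```

The split is exact because GELU is elementwise and the down-projection is linear, so summing the slices reproduces the original matmul.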
RT @kalomaze: the elites don't want you to know this but you can train your base LLM into a classifier without initializing new lm_head pa…
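The lm_head retweet is also truncated; one common version of that trick is to reuse existing vocabulary tokens as class labels, so no new classification head has to be initialized. The sketch below assumes that reading; the model name and the " yes"/" no" verbalizer tokens are arbitrary choices.

```python
# Sketch: train a base causal LM as a classifier by reusing lm_head logits
# for chosen vocabulary tokens. Model and verbalizer tokens are assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B"
tok = AutoTokenizer.from_pretrained(model_name)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
tok.padding_side = "left"   # so index -1 is always the last real token
model = AutoModelForCausalLM.from_pretrained(model_name)

# one existing vocabulary token id per class, no new parameters needed
class_token_ids = torch.tensor([
    tok.encode(" yes", add_special_tokens=False)[0],   # class 0
    tok.encode(" no", add_special_tokens=False)[0],    # class 1
])

def classification_loss(texts, labels):
    batch = tok(texts, return_tensors="pt", padding=True)
    out = model(**batch)
    last_logits = out.logits[:, -1, :]               # (batch, vocab)
    class_logits = last_logits[:, class_token_ids]   # (batch, n_classes)
    return F.cross_entropy(class_logits, labels)

loss = classification_loss(["great movie, would watch again"], torch.tensor([0]))
loss.backward()   # fine-tune with any optimizer from here
```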