Aleksa Gordić (水平问题)

@gordic_aleksa

Followers
22K
Following
7K
Statuses
4K

x @GoogleDeepMind @Microsoft, proud father of 16 H100s, flirting with LLMs, tensor core maximalist

🇺🇸 silico 🇺🇸
Joined September 2017
@gordic_aleksa
Aleksa Gordić (水平问题)
5 months
llm.c gang w/ @karpathy, Erik, and Arun: aka "avengers" and since Andrej's yesterday talk the "4 pandas team" lol 😂
24
29
1K
@gordic_aleksa
Aleksa Gordić (水平问题)
19 hours
@iScienceLuvr folks just enable 2 factor authentication and you're safe btw. same person reached out to me, they'll send you calendly and it'll be added to your x profile
0
0
0
@gordic_aleksa
Aleksa Gordić (水平问题)
3 days
flying to usa 🇺🇸 if any frens in Las Vegas lmk will spend some time there! :)
0
0
4
@gordic_aleksa
Aleksa Gordić (水平问题)
4 days
0
0
1
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
"let's wait step by step" here we go again, buckle up
1
1
20
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
do you now realize just how out of touch the AI safety community really is? ffs they made a big deal out of GPT-2. DeepSeek drops pareto optimal 671B model and almost no one is talking about how it's going to undermine democracy, enable proliferation of nuclear weapons, and turn us into paper clips. it's always been about competition (although i believe that those ~30 folks in lw forum really do believe it)
6
2
56
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
wait - is all you need?
@andrew_n_carr
Andrew Carr (e/🤸)
6 days
If, during the RL phase, you interrupt thinking and append "wait," to the reasoning traces you bend the cost curve and get within a point of R1 distill 32B with just 1k high quality examples
2
3
40
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
@Suhail it's the internet explorer saga all over again
0
0
0
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
@Teknium1 it would be problematic for them to do it now - perception issue
0
0
1
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
@karpathy "yolo coding" yc or vc anon?
0
0
3
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
@giffmana is this the equivalent of "DeepSeek used $5M to train their models" or quite the opposite - they also priced in the budget for snacks?
0
0
1
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
create a market around this: people who want to attempt to jailbreak the system have to pay $X and whoever jailbreaks it first gets the pool prize. the group you'll attract by doing this will be sufficiently different than the volunteers red teaming it
@AnthropicAI
Anthropic
5 days
New Anthropic research: Constitutional Classifiers to defend against universal jailbreaks. We’re releasing a paper along with a demo where we challenge you to jailbreak the system.
3
0
13
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
@tszzl filmed this with Luke over a year ago
0
0
1
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
@DavidSHolz ✋(in a month :P waiting for my visa stamp)
0
0
4
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
<3
@jeremyphoward
Jeremy Howard
5 days
How do you feel, @gordic_aleksa, that google search now defines you as #1 example of "skill issue" in Chinese?
0
0
21
@gordic_aleksa
Aleksa Gordić (水平问题)
5 days
@jeremyphoward "to skill or to skill issue that is the question" <3
0
0
5
@gordic_aleksa
Aleksa Gordić (水平问题)
7 days
the original blog post "the short case for nvidia stock": the misleading tweet: medusa paper:
@orikron
Dott. Orikron 🇵🇹
16 days
🚨 Each AI engineer at Meta AI earns more every year (>$5.5 million) than it took to train China's Deepseek. Despite this, Meta's Llama model is far behind Deepseek. American engineers are now rushing to copy the methods of a small Chinese engineering group.
0
0
3
@gordic_aleksa
Aleksa Gordić (水平问题)
8 days
@andrew_n_carr Congrats man! Big day! (big baby mini-model release :))
0
0
1
@gordic_aleksa
Aleksa Gordić (水平问题)
9 days
@nearcyan i interpreted it in the wrong context then!
0
0
3