![ARKeshet Profile](https://pbs.twimg.com/profile_images/1765048431356112900/gMzWE8EC_x96.jpg)
ARKeshet
@ARKeshet
Followers
1
Following
4K
Statuses
185
@lefthanddraft @AITechnoPagan Does that mean that the simplest, laziest attacks are still unmitigated?
1
0
2
@taufiqintech @GroqInc @CerebrasSystems WDYM, doesn't let... Go to dev cloud, get a key. How do you think their API tutorial works?
1
0
1
@GavinSherry @taufiqintech @GroqInc So, in a world where everybody is using SD, still 3rd place?..
0
0
0
@rohinmshah This. LLMs are just a wrong layer to talk about x-risks. Some risks - like those that are directly enabled by LLMs alone - sure, okay. But those aren't the board-wiping ones. I'm afraid focusing on them is safetywashing or streetlighting/skipping the hard part.
0
0
0
@balazskegl @Yoshua_Bengio @geoffreyhinton How huge? The models clearly have a good theory of mind and are able to identify themselves, their training, their own messages, the user, etc.
0
0
0
@AvpElk @NealDavis5385 Your human-in-the-loop "tools" will lose the arms race vs agents who would just go in and do stuff.
0
0
0
@JeffLadish When successful: reflect on how the agent got there and either vary each step a little (… When failed: reflect on the decisions made while getting into the failure.
0
0
0
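The reflect-and-vary loop in the tweet above can be sketched as a toy Python snippet. This is purely illustrative; `run_agent`, `refine`, and the step scores are hypothetical stand-ins, not any real agent framework:

```python
import random

def run_agent(steps):
    # Toy "agent": the trajectory succeeds only if every step score is positive.
    return all(s > 0 for s in steps)

def refine(steps):
    """One round of the reflect-and-vary loop sketched in the tweet."""
    if run_agent(steps):
        # Success: keep the trajectory, but vary each step a little
        # to explore nearby variants of the same path.
        return [s + random.uniform(-0.1, 0.1) for s in steps]
    # Failure: reflect on the decisions that led into the failure --
    # here, crudely, flip the sign of any negative (bad) step.
    return [abs(s) for s in steps]

trajectory = [1.0, -0.5, 2.0]    # hypothetical step scores; -0.5 causes failure
trajectory = refine(trajectory)  # failure branch repairs the bad step
assert run_agent(trajectory)
```

The point of the toy is only the control flow: one branch perturbs a working trajectory, the other revisits the decisions behind a failing one.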
@GroqInc @20vcFund @HarryStebbings @JonathanRoss321 R1 maybe, but the distills do suck* even harder than the original llamas. * In real tasks. Benchmarks suck too.
0
0
3
@SteveSokolowsk2 @tegmark "The better they are at understanding what the human wants" (implying "and willing to oblige") is a conclusion from partial observations. What you don't see is all of the models that didn't pass some initial testing. And even then, the models DO fail their "alignment" sometimes.
1
0
0
@davidpattersonx @tegmark Sure, there are multiple ways. You yourself can think of a few more if you really do it (the thinking) for about 20 minutes.
1
0
1
@Cantide1 @repligate It doesn't *auto*-align. It was trained to do so. And apparently it resisted that training. Many labs did say their models are anything but aligned to humans by default. Only once you instill that meta-goal (successfully) do you have a chance of having such convos.
0
0
0
@AnushElangovan @mike64_t @infogulch @__tinygrad__ All driver developers? Or all model developers (i.e. downstream customers)? Come on, 200k is the salary of one person who maybe gets some shit done, or maybe not. You were given a chance to buy a team of manic geniuses who are laser-focused on delivering...
0
0
0