Gabriel Mukobi
@gabemukobi

839 Followers · 2K Following · 565 Statuses

U.S. AI Safety Institute @NIST, CS PhD @Berkeley_AI | Safe and secure advanced AI. Opinions are my own.

San Francisco, CA
Joined September 2017
@gabemukobi
Gabriel Mukobi
25 days
RT @EdelmanBen: 1/ Excited to share a new blog post from the U.S. AI Safety Institute! AI agents are becoming increasingly capable. But th…
0 replies · 27 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
2 months
@RylanSchaeffer Woah what were you running for 46 days? 👀
0 replies · 0 retweets · 5 likes
@gabemukobi
Gabriel Mukobi
2 months
@sea_snell I expect (2) very roughly predicts nonzero APPS Pass@1 for Llama 3.1 405B zero-shot. Have you evaluated it on APPS as well as on this language-modeling loss score to check your prediction?
0 replies · 0 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
3 months
The U.S. AI Safety Institute is hiring technical staff! We're seeking talented engineers, scientists, or security experts for frontier testing, technical guidance, and more—these roles are only open for a couple more days, so apply or DM recommendations!
0 replies · 0 retweets · 14 likes
@gabemukobi
Gabriel Mukobi
4 months
RT @alxndrdavies: Jailbreaking evals ~always focus on simple chatbots—excited to announce AgentHarm, a dataset for measuring harmfulness of…
0 replies · 39 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
5 months
RT @ben_s_bucknall: Excited to share our new website on Open Problems in Technical AI Governance, a companion to our recent paper on the to…
0 replies · 13 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
5 months
RT @AISafetyInst: Jade Leung (our CTO) and @geoffreyirving (our Research Director) have been nominated in the @TIME top 100 most influentia…
0 replies · 8 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
6 months
RT @BrandoHablando: Gave a 60 min lightning talk at Stanford's AI Safety Annual Meeting on our work identifying novel factors that explain…
0 replies · 2 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
6 months
@BogdanIonutCir2 Possibly, though I'd expect directly weaponizing dangerous capabilities is easier than repurposing those capabilities for safety. A crux is whether one expects AI systems will be very useful hackers or capability researchers before they're very good safety/alignment researchers.
1 reply · 0 retweets · 1 like
@gabemukobi
Gabriel Mukobi
6 months
@maxwellazoury @DanHendrycks That might be a useful property for certain models to have. Imagine hardened models you want to share with pre-release testers or deploy in not-very-secure datacenters, without being as worried about unexpected harms from malicious finetuning if the model leaks to bad actors.
0 replies · 0 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
6 months
@MotionTsar So it tends to assume coins are fairer than they actually are? This is acceptable 🙃
1 reply · 0 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
6 months
RT @The_JBernardi: 🚀 New blog! Achieving AI Resilience: Exploring AI safety through a lens of adaptation & societal resilience. Advanced A…
0 replies · 5 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
6 months
RT @Kurz_Gesagt: Humanity's smartest invention might also be its last. Superintelligent AI could be our dream come true – or our worst nigh…
0 replies · 133 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
6 months
RT @METR_Evals: How well can LLM agents complete diverse tasks compared to skilled humans? Our preliminary results indicate that our baseli…
0 replies · 99 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
6 months
@inner_treasure Quite possibly! I do acknowledge this in the post, but I also expect the most significant principal component accounts for a lot of the variance, such that "general capabilities" is a useful enough thing to talk about, at least for the purposes of this piece.
0 replies · 0 retweets · 0 likes
@gabemukobi
Gabriel Mukobi
6 months
@nitarshan @AISafetyInst Congrats, and thank you for your service!🫡
0 replies · 0 retweets · 2 likes
@gabemukobi
Gabriel Mukobi
6 months
🔗Consider reading the full post for many more details, references, considerations, and AI governance ideas. How do you conceptualize the progression of AGI development? Blog post: Discussion forum: 15/15
0 replies · 0 retweets · 1 like