AI Safety Institute
@AISafetyInst
Followers: 4K
Following: 6
Statuses: 92
We’re building a team of world-leading talent to tackle some of the biggest challenges in AI safety - come and join us.
United Kingdom
Joined February 2024
Congratulations to @YoshuaBengio and team on the publication of the International AI Safety Report. This is a major milestone in building international consensus on the science of AI safety.
Today, we are publishing the first-ever International AI Safety Report, backed by 30 countries and the OECD, UN, and EU. It summarises the state of the science on AI capabilities and risks, and how to mitigate those risks. 🧵 Link to full Report: 1/16
1
7
72
New research on the open challenges for machine unlearning in AI safety. A collaboration between @UniofOxford @Mila_Quebec @AISafetyInst and many others.
🚨 New Paper Alert: Open Problems in Machine Unlearning for AI Safety 🚨 Can AI truly "forget"? While unlearning promises data removal, controlling emergent capabilities is an inherent challenge. Here's why it matters: 👇 Paper: 1/8
1
6
35
RT @SciTechgovuk: The benefits of AI will be unleashed across the UK under a new AI Opportunities Action Plan. We’re taking forward 50 re…
0
84
0
Great to see our advisory board member @matthewclifford bringing out a hugely ambitious plan to back the UK as an AI leader. We're excited to contribute to the AI Opportunities Action Plan, supporting AI safety innovation in the UK and growing the ecosystem even further.
Artificial intelligence will be unleashed across the UK under government’s game-changing AI Opportunities Action Plan. Turbocharging economic growth. Creating jobs. Making the UK the number one place for AI firms to invest. @matthewclifford explains how 👇
1
8
64
Our new technical report details the results of our pre-deployment testing of @OpenAI's o1 model with the U.S. AI Safety Institute. Read more ⬇️
4
15
76
🎉 Huge congratulations to AISI researcher @hannahrosekirk for winning a Best Paper Award at #NeurIPS2024! 🏆
A real honour and career dream that PRISM has won a @NeurIPSConf best paper award! 🌈 One year ago I was sat in a 13,000+ person audience of NeurIPS '23 having just finished data collection. Safe to say I've gone from feeling #stressed to very #blessed 😁
2
12
86
We've released a technical report detailing our pre-deployment testing of @AnthropicAI's upgraded Claude 3.5 Model with the U.S. AI Safety Institute. Read our blog for a high-level overview.
1
23
151
Our new paper on safety cases, in collaboration with @GovAI_, shows how it’s possible to write safety cases for current systems, using existing techniques. We hope to see organisations using templates like this for their models.
1
7
40
We’ve partnered with @VectorInst and Arcadia Impact to develop evals on coding, maths, cybersecurity, safeguards and more. InspectEvals includes leading benchmarks and several agent benchmarks, which can now be run against any model with a single command. 2/2
0
1
7