We are excited to share three major updates today on our path to mathematical superintelligence 🦾
1. A new state-of-the-art result of 90% on the MiniF2F benchmark, up from the 83% we announced less than a month ago.
🦾 We are excited to share Aristotle, an automated theorem prover that advances the state of the art on MiniF2F, the standard dataset of formal mathematics problem statements. /1
Aristotle currently achieves a 63% success rate at producing full Lean proofs, and an 83% proof rate with computer algebra system assistance. Learn more at /2
If you are excited about helping us forge the world’s most advanced mathematical reasoning engine, we are hiring AI researchers, mathematicians, and distributed systems experts. Check out our open roles at ! /3