![Luca Beurer-Kellner Profile](https://pbs.twimg.com/profile_images/1539897132580216835/QejKs_DF_x96.jpg)
Luca Beurer-Kellner
@lbeurerkellner
Followers: 332
Following: 574
Statuses: 168
Working on secure agentic AI @invariantlabsai. PhD @the_sri_lab, ETH Zürich. Also: @lmqllang and @projectlve.
Zurich, Switzerland
Joined August 2009
We are releasing two open-source tools that we have been using internally at @InvariantLabsAI for quite a while. I hope they are useful and speed things up for everyone building agents these days. Explorer is for trace viewing and Testing is for agent unit testing. Happy Holidays!
1
3
13
RT @mbalunovic: We finally have an answer to the debate over whether LLMs generalize to new math problems or they merely memorized the answ…
0
164
0
RT @jiayi_pirate: We reproduced DeepSeek R1-Zero in the CountDown game, and it just works. Through RL, the 3B base LM develops self-verifi…
0
1K
0
Agentic systems like CUAs and Operator need new kinds of supervision mechanisms, including human-in-the-loop (HIL) review, other models, and advanced verification, to be viable in an unsupervised setting. It was really great to work on this together with Ani and @allhands_ai.
With (web) agents on everyone's mind, check out our latest blog post (link in thread) on browser agent safety guardrails. We replicate and defend against attacks on the @allhands_ai web agent, preventing it from generating harmful content and falling for harmful requests.
0
2
6
RT @InvariantLabsAI: 💻Want to build your own Computer Use agent? Check our recently released mini library for using Computer Use with Inva…
0
1
0
@gneubig Great initiative! We’ve been working with CUAs for a while now. It would be great to contribute to an open version.
0
0
1
Computer use is one of the most underrated agentic capabilities of recent models. It addresses so many interface and integration problems in agents. And: It can automatically dismiss cookie banners. That's amazing!
Releasing Computer Use + Playwright: let a Claude agent control a browser on your machine to complete tasks. We've been working with computer use models for a while now and just released a helper library for it. (Yes, it can dismiss cookie banners.) Repo:
0
0
2
RT @vivien000000: New blog post alert! Read on if you are interested in accelerating structured text generation techniques for context-free…
0
3
0
Check out this super awesome agent testing challenge that the team has built for the holidays. It's a nice showcase of how test-driven agent development can be such a great method for building reliable and trustworthy agent systems.
If you want to play with some agent tooling over the holidays, check out our Invariant Winter challenge. Can you fix Santa's agent, so all the presents are still delivered in time? Play Here:
0
0
4
RT @InvariantLabsAI: 🤖 Something new and exciting: We have created a public registry of AI agent benchmarks! 🌐 Check it out at https://t.c…
0
2
0
RT @florian_tramer: We're launching weekly ⛳️CTF challenges ⛳️on AI security this summer @InvariantLabsAI Go compete for a prize pool of $…
0
6
0
@simonw To me, an agent has to take actions beyond producing a text response. That’s typically achieved by function calling and a connection to some real-world system.
0
0
0
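A minimal sketch of the distinction drawn in the reply to @simonw above: a plain text response is passed through, while a function call is executed against an external system. Everything here is a hypothetical illustration rather than any particular vendor's API: the JSON message shape, the `set_temperature` tool, the `handle_model_output` helper, and the in-memory `thermostat` that stands in for a real-world system.

```python
import json
from typing import Callable, Dict

# Hypothetical stand-in for a real-world system the agent can act on.
thermostat = {"target_celsius": 20.0}


def set_temperature(celsius: float) -> str:
    """Hypothetical tool: changes state outside of the text response."""
    thermostat["target_celsius"] = celsius
    return f"target set to {celsius}"


# Registry of callable tools, keyed by the name the model would emit.
TOOLS: Dict[str, Callable[..., str]] = {"set_temperature": set_temperature}


def handle_model_output(raw: str) -> str:
    """Execute a tool call if the output is one; otherwise return the text."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return raw  # plain text: nothing happens outside the response
    if not isinstance(message, dict) or "tool" not in message:
        return raw  # valid JSON, but not in the assumed tool-call shape
    tool = TOOLS.get(message["tool"])
    if tool is None:
        return f"unknown tool: {message['tool']}"
    # The action: call the function with the model-supplied arguments.
    return tool(**message.get("arguments", {}))
```

Calling `handle_model_output('{"tool": "set_temperature", "arguments": {"celsius": 22.5}}')` changes the thermostat state, whereas passing ordinary prose leaves it untouched. Under this definition, only the first case counts as agentic behavior.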