Luca Beurer-Kellner

@lbeurerkellner

Followers
332
Following
574
Statuses
168

working on secure agentic AI @invariantlabsai PhD @the_sri_lab, ETH Zürich. Also: @lmqllang and @projectlve.

Zurich, Switzerland
Joined August 2009
@lbeurerkellner
Luca Beurer-Kellner
2 months
We are releasing two open source tools that we've already been using for quite a while internally @InvariantLabsAI. I hope they are useful and speed up all the folks building agents these days. Explorer is for trace viewing and Testing for agent unit testing. Happy Holidays!
1
3
13
@lbeurerkellner
Luca Beurer-Kellner
18 hours
very cool work
@bclavie
Benjamin Clavié
19 hours
What if a [MASK] was all you needed? ModernBERT is great, but we couldn't stop wondering if it could be greater than previous encoders in different ways. Maybe we don't need task-specific heads? Maybe it can do all sorts of tasks with only its generative head? Spoilers: Yes
0
0
2
@lbeurerkellner
Luca Beurer-Kellner
3 days
@_philschmid @theo computer use capabilities would be awesome
1
0
2
@lbeurerkellner
Luca Beurer-Kellner
3 days
RT @mbalunovic: We finally have an answer to the debate over whether LLMs generalize to new math problems or they merely memorized the answ…
0
164
0
@lbeurerkellner
Luca Beurer-Kellner
13 days
Great to see Explorer come along, especially with the recent additions to support CUAs better.
@InvariantLabsAI
Invariant Labs
13 days
✨ Do you have 100s of AI agent logs and can’t navigate through them? Check out our open source Explorer tool and never get lost in log folders again 📚📚📚 Explorer (hosted): Repo:
0
0
3
@lbeurerkellner
Luca Beurer-Kellner
17 days
RT @jiayi_pirate: We reproduced DeepSeek R1-Zero in the CountDown game, and it just works. Through RL, the 3B base LM develops self-verifi…
0
1K
0
@lbeurerkellner
Luca Beurer-Kellner
18 days
Agentic systems like CUAs and Operator need novel types of supervisor mechanisms, including HIL, other models and advanced verification to be viable in an unsupervised setting. It was really great to be able to work on this together with Ani and @allhands_ai.
@InvariantLabsAI
Invariant Labs
18 days
With (web) agents on everyone's mind, check out our latest blog post (link in thread) on browser agent safety guardrails. We replicate and defend against attacks on the @allhands_ai web agent, preventing it from generating harmful content and falling for harmful requests.
0
2
6
@lbeurerkellner
Luca Beurer-Kellner
18 days
RT @InvariantLabsAI: 💻Want to build your own Computer Use agent? Check our recently released mini library for using Computer Use with Inva…
0
1
0
@lbeurerkellner
Luca Beurer-Kellner
18 days
@gneubig Great initiative! We’ve been working with CUAs for a while now. It would be great to contribute to an open version.
0
0
1
@lbeurerkellner
Luca Beurer-Kellner
20 days
Computer use is one of the most underrated agentic capabilities of recent models. It addresses so many interface and integration problems in agents. And: It can automatically dismiss cookie banners. That's amazing!
@InvariantLabsAI
Invariant Labs
20 days
Releasing Computer Use + Playwright: Let a Claude agent control a browser on your machine to achieve tasks. We've been working with computer use models for a while now and just released a helper library for it. (yes it can dismiss cookie banners) Repo:
0
0
2
@lbeurerkellner
Luca Beurer-Kellner
21 days
RT @vivien000000: New blog post alert! Read on if you are interested in accelerating structured text generation techniques for context-free…
0
3
0
@lbeurerkellner
Luca Beurer-Kellner
2 months
Check out this super awesome agent testing challenge that the team has built for the holidays. Nice showcase of how test-driven agent development can be such a great method for building reliable and trustworthy agent systems.
@InvariantLabsAI
Invariant Labs
2 months
If you want to play with some agent tooling over the holidays, check out our Invariant Winter challenge. Can you fix Santa's agent, so all the presents are still delivered in time? Play Here:
0
0
4
@lbeurerkellner
Luca Beurer-Kellner
2 months
RT @lbeurerkellner: We are releasing two open source tools that we've already been using for quite a while internally @InvariantLabsAI. I h…
0
3
0
@lbeurerkellner
Luca Beurer-Kellner
2 months
Sorry, Testing is at
0
0
3
@lbeurerkellner
Luca Beurer-Kellner
3 months
RT @InvariantLabsAI: 🤖 Something new and exciting: We have created a public registry of AI agent benchmarks! 🌐 Check it out at https://t.c…
0
2
0
@lbeurerkellner
Luca Beurer-Kellner
6 months
@md_rumpf Sounds expensive :)
1
0
1
@lbeurerkellner
Luca Beurer-Kellner
6 months
RT @florian_tramer: We're launching weekly ⛳️ CTF challenges ⛳️ on AI security this summer @InvariantLabsAI. Go compete for a prize pool of $…
0
6
0
@lbeurerkellner
Luca Beurer-Kellner
6 months
@simonw To me, an agent has to take actions/act outside of a text response. That’s typically achieved by function calling and a connection to some real-world system.
0
0
0