Luca Beurer-Kellner @lbeurerkellner profile

Luca Beurer-Kellner

@lbeurerkellner

Followers

332

Following

574

Statuses

168

Working on secure agentic AI @invariantlabsai. PhD @the_sri_lab, ETH Zürich. Also: @lmqllang and @projectlve.

Zurich, Switzerland

Joined August 2009

Luca Beurer-Kellner

@lbeurerkellner

2 months

We are releasing two open source tools that we've already been using for quite a while internally @InvariantLabsAI. I hope they are useful and speed up all the folks building agents these days. Explorer is for trace viewing and Testing for agent unit testing. Happy Holidays!

1

3

13

Luca Beurer-Kellner

@lbeurerkellner

18 hours

very cool work

Benjamin Clavié

@bclavie

19 hours

What if a [MASK] was all you needed? ModernBERT is great, but we couldn't stop wondering if it could be greater than previous encoders in different ways. Maybe we don't need task-specific heads? Maybe it can do all sorts of tasks with only its generative head? Spoilers: Yes

0

2

Luca Beurer-Kellner

@lbeurerkellner

3 days

@_philschmid @theo 👀

0

Luca Beurer-Kellner

@lbeurerkellner

3 days

@_philschmid @theo computer use capabilities would be awesome

1

0

2

Luca Beurer-Kellner

@lbeurerkellner

3 days

RT @mbalunovic: We finally have an answer to the debate over whether LLMs generalize to new math problems or they merely memorized the answ…

0

164

0

Luca Beurer-Kellner

@lbeurerkellner

13 days

Great to see Explorer come along, especially with the recent additions to support CUAs better.

Invariant Labs

@InvariantLabsAI

13 days

✨ Do you have 100s of AI agent logs and can’t navigate through them? Check out our open source Explorer tool and never get lost in log folders again 📚📚📚 Explorer (hosted): Repo:

0

3

Luca Beurer-Kellner

@lbeurerkellner

17 days

RT @jiayi_pirate: We reproduced DeepSeek R1-Zero in the CountDown game, and it just works. Through RL, the 3B base LM develops self-verifi…

0

1K

0

Luca Beurer-Kellner

@lbeurerkellner

18 days

Agentic systems like CUAs and Operator need novel types of supervisor mechanisms, including human-in-the-loop (HIL), other models, and advanced verification, to be viable in an unsupervised setting. It was really great to be able to work on this together with Ani and @allhands_ai.

Invariant Labs

@InvariantLabsAI

18 days

With (web) agents on everyone's mind, check out our latest blog post (link in thread) on browser agent safety guardrails. We replicate and defend against attacks on the @allhands_ai web agent, preventing it from generating harmful content and falling for harmful requests.

0

2

6

Luca Beurer-Kellner

@lbeurerkellner

18 days

RT @InvariantLabsAI: 💻Want to build your own Computer Use agent? Check our recently released mini library for using Computer Use with Inva…

0

1

0

Luca Beurer-Kellner

@lbeurerkellner

18 days

@gneubig Great initiative! We’ve been working with CUAs for a while now. It would be great to contribute to an open version.

0

1

Luca Beurer-Kellner

@lbeurerkellner

20 days

Computer use is one of the most underrated agentic capabilities of recent models. It addresses so many interface and integration problems in agents. And: It can automatically dismiss cookie banners. That's amazing!

Invariant Labs

@InvariantLabsAI

20 days

Releasing Computer Use + Playwright: Let a Claude agent control a browser on your machine to achieve tasks. We've been working with computer use models for a while now and just released a helper library for it. (yes it can dismiss cookie banners) Repo:

0

2

Luca Beurer-Kellner

@lbeurerkellner

21 days

RT @vivien000000: New blog post alert! Read on if you are interested in accelerating structured text generation techniques for context-free…

0

3

0

Luca Beurer-Kellner

@lbeurerkellner

2 months

Check out this super awesome agent testing challenge that the team has built for the holidays. It's a nice showcase of how test-driven agent development can be such a great method for building reliable and trustworthy agent systems.

Invariant Labs

@InvariantLabsAI

2 months

If you want to play with some agent tooling over the holidays, check out our Invariant Winter challenge. Can you fix Santa's agent, so all the presents are still delivered in time? Play Here:

0

4

Luca Beurer-Kellner

@lbeurerkellner

2 months

Sorry, Testing is at

0

3

Luca Beurer-Kellner

@lbeurerkellner

3 months

RT @InvariantLabsAI: 🤖 Something new and exciting: We have created a public registry of AI agent benchmarks! 🌐 Check it out at https://t.c…

0

2

0

Luca Beurer-Kellner

@lbeurerkellner

6 months

@md_rumpf Sounds expensive :)

1

0

1

Luca Beurer-Kellner

@lbeurerkellner

6 months

RT @florian_tramer: We're launching weekly ⛳️CTF challenges ⛳️on AI security this summer @InvariantLabsAI Go compete for a prize pool of $…

0

6

0

Luca Beurer-Kellner

@lbeurerkellner

6 months

@simonw To me, an agent has to take actions/act outside of a text response. That’s typically achieved by function calling and a connection to some real-world system.

0