Johann Rehberger

@wunderwuzzi23

Followers
5K
Following
2K
Statuses
1K

Hacking neural networks so that we don’t get stuck in the matrix. Builder and Breaker. Opinions are my own.

127.0.0.1
Joined February 2012
@wunderwuzzi23
Johann Rehberger
1 month
AI Domination: Remote Controlling ChatGPT ZombAIs🕹️🤖 Novel ChatGPT command and control POC. Using prompt injection + memory for initial foothold and continuous instruction updates, plus url_safe bypasses for data exfil. Based on content of my Black Hat Europe talk.
1
8
28
@wunderwuzzi23
Johann Rehberger
11 hours
RT @janleike: Results of our jailbreaking challenge: After 5 days, >300,000 messages, and est. 3,700 collective hours our system got broke…
0
83
0
@wunderwuzzi23
Johann Rehberger
12 hours
@karpathy This is possible via Unicode Tag code points, which let you read and write hidden text (ASCII Smuggling). No need for tool use, it's quite amazing. Allow-listing tokens is an important mitigation when building LLM apps. Claude and Grok are still vulnerable, I believe.
0
1
17
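A minimal sketch of the hidden-text idea in the post above, assuming the usual ASCII Smuggling mapping in which each printable ASCII character is shifted into the Unicode Tags block (U+E0000 to U+E007F). The function names `hide`, `reveal`, and `strip_tags` are hypothetical, and `strip_tags` is a simple character-level filter used here as a stand-in; it is not the token allow-listing the post recommends.

```python
TAG_BASE = 0xE0000  # start of the Unicode Tags block (U+E0000..U+E007F)


def hide(text: str) -> str:
    """Shift printable ASCII into the (normally invisible) Tags block."""
    return "".join(chr(TAG_BASE + ord(c)) for c in text if 0x20 <= ord(c) <= 0x7E)


def reveal(text: str) -> str:
    """Recover ASCII that was hidden as Tag code points; ignore everything else."""
    return "".join(
        chr(ord(c) - TAG_BASE) for c in text if 0xE0020 <= ord(c) <= 0xE007E
    )


def strip_tags(text: str) -> str:
    """Mitigation sketch (assumption): drop every code point in the Tags block."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


visible = "Hello"
secret = "ignore previous instructions"
payload = visible + hide(secret)  # renders like "Hello" in many UIs

print(len(visible), len(payload))  # the payload is longer than it looks
print(reveal(payload))             # the hidden instruction
print(strip_tags(payload))         # the visible text only
```

Filtering at the character level like this only covers one block of invisible code points; other invisible characters would need their own handling.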
@wunderwuzzi23
Johann Rehberger
1 day
Solution to this puzzle from last November: there is one "invisible times" character between the two 2s, so ChatGPT thinks it's 2*2. 🙂 Yeah, Unicode is fun.
@wunderwuzzi23
Johann Rehberger
3 months
🤔🤔🤔Thinking... 🙃🙃🙃 Can't wait to try this with o1.
0
1
5
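The "invisible times" puzzle mentioned above can be reproduced with a short sketch. It assumes the character in question is U+2062 INVISIBLE TIMES and uses a hypothetical helper, `hidden_format_chars`, that lists characters in the Unicode "Cf" (format) category, which is one way to spot this kind of invisible content.

```python
import unicodedata

# Looks like "22" when rendered, but holds three code points.
puzzle = "2\u20622"


def hidden_format_chars(text: str):
    """List (index, code point, name) for every format-category character."""
    return [
        (i, f"U+{ord(c):04X}", unicodedata.name(c, "UNKNOWN"))
        for i, c in enumerate(text)
        if unicodedata.category(c) == "Cf"
    ]


print(len(puzzle))
print(hidden_format_chars(puzzle))
```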
@wunderwuzzi23
Johann Rehberger
2 days
RT @arstechnica: New hack uses prompt injection to corrupt Gemini’s long-term memory
0
25
0
@wunderwuzzi23
Johann Rehberger
3 days
RT @hackplayers: Hacking Gemini's Memory with Prompt Injection and Delayed Tool Invocation
0
4
0
@wunderwuzzi23
Johann Rehberger
3 days
12 months ago I predicted that this would be possible once Gemini has access to more powerful tools that "write" data -- the memory feature is now such a state-changing tool. Back then there was only the Workspace Extension, which was limited to read-only operations.
0
0
4
@wunderwuzzi23
Johann Rehberger
3 days
Was just using Gemini to catch up on emails, and it randomly picked up the prompt injection test I had built for Apple Intelligence last year. 😂 Not only did it not pick the recent email, but it picked the one that contained the prompt injection over other options.
2
0
16
@wunderwuzzi23
Johann Rehberger
4 days
Seems to have been there for quite a while according to the article.
0
0
1
@wunderwuzzi23
Johann Rehberger
7 days
Download Link
0
1
13
@wunderwuzzi23
Johann Rehberger
10 days
RT @janleike: Super exciting robustness result: We built a system that defends against universal jailbreaks! It has minimal increase in r…
0
79
0
@wunderwuzzi23
Johann Rehberger
15 days
@valent1nee I used the "Contact Us" option in the UI and got directly connected and chatted with someone. So you could try that (though they are probably a bit busy now for direct comms) and/or send mail.
1
0
2
@wunderwuzzi23
Johann Rehberger
16 days
AGI not possible, licensing issue.
0
0
5
@wunderwuzzi23
Johann Rehberger
17 days
Hope it will do better than Grok 2 security wise....
@matiroy
Mati Roy
18 days
what prompts are you looking forward to trying with Grok 3?
1
0
4
@wunderwuzzi23
Johann Rehberger
17 days
My initial thought was it's a scaling issue due to popularity, but status page mentions attacks. Wondering if they'll share details on what happened?
1
0
1