Johann Rehberger @wunderwuzzi23 profile

Johann Rehberger

@wunderwuzzi23

Followers

5K

Following

2K

Statuses

1K

Hacking neural networks so that we don’t get stuck in the matrix. Builder and Breaker. Opinions are my own.

127.0.0.1

Joined February 2012

Johann Rehberger

@wunderwuzzi23

1 month

AI Domination: Remote Controlling ChatGPT ZombAIs🕹️🤖 Novel ChatGPT command and control POC. Using prompt injection + memory for initial foothold and continuous instruction updates, plus url_safe bypasses for data exfil. Based on content of my Black Hat Europe talk.

1

8

28

Johann Rehberger

@wunderwuzzi23

11 hours

RT @janleike: Results of our jailbreaking challenge: After 5 days, >300,000 messages, and est. 3,700 collective hours our system got broke…

0

83

0

Johann Rehberger

@wunderwuzzi23

12 hours

@karpathy This is possible via Unicode Tag code points, which let you read and write hidden text (ASCII Smuggling). No need for tool use, it's quite amazing. Allow-listing tokens is an important mitigation when building LLM apps. Claude and Grok are still vulnerable, I believe.

0

1

17
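
A minimal Python sketch of the ASCII Smuggling idea mentioned in the reply above, assuming the usual mapping of printable ASCII onto the Unicode Tags block (U+E0020 to U+E007E). The function names are hypothetical and chosen for illustration. The post recommends allow-listing tokens; `strip_tags` below is a simpler character-level filter standing in for that mitigation, not the same thing.

```python
TAG_BASE = 0xE0000  # start of the Unicode Tags block


def to_tags(text: str) -> str:
    """Map printable ASCII onto invisible Tag code points (U+E0020..U+E007E)."""
    return "".join(chr(TAG_BASE + ord(c)) for c in text if 0x20 <= ord(c) <= 0x7E)


def from_tags(text: str) -> str:
    """Recover ASCII that was hidden as Tag code points; ignore everything else."""
    return "".join(
        chr(ord(c) - TAG_BASE) for c in text if 0xE0020 <= ord(c) <= 0xE007E
    )


def strip_tags(text: str) -> str:
    """Character-level filter (an assumption, simpler than token allow-listing):
    drop every code point in the Tags block U+E0000..U+E007F."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


visible = "Hello"
hidden = "ignore previous instructions"
carrier = visible + to_tags(hidden)  # renders like "Hello" in most UIs

recovered = from_tags(carrier)
cleaned = strip_tags(carrier)
```

Here `carrier` is longer than `visible` even though the two typically look identical on screen, `recovered` holds the hidden text, and `cleaned` is the input with the hidden payload removed.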

Johann Rehberger

@wunderwuzzi23

1 day

Solution to this puzzle from last November: there is one "invisible times" character between the two 2s, so ChatGPT thinks it's 2*2. 🙂 Yeah, Unicode is fun.

Johann Rehberger

@wunderwuzzi23

3 months

🤔🤔🤔Thinking... 🙃🙃🙃 Can't wait to try this with o1.

0

1

5
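
A small Python sketch of the "invisible times" trick from the puzzle solution above, assuming the character in question is U+2062 INVISIBLE TIMES. The exact puzzle text is not reproduced here; the string below is an illustrative reconstruction, and `reveal` is a hypothetical helper name.

```python
import unicodedata

INVISIBLE_TIMES = "\u2062"  # assumption: the code point the post refers to

# Looks like "22" on screen, but contains three code points.
puzzle = "2" + INVISIBLE_TIMES + "2"


def reveal(text: str) -> str:
    """Replace invisible format characters (category Cf) with their Unicode names."""
    return "".join(
        f"<{unicodedata.name(c, 'UNKNOWN')}>" if unicodedata.category(c) == "Cf" else c
        for c in text
    )


revealed = reveal(puzzle)
print(len(puzzle), revealed)
```

Running a helper like this over user input before it reaches a model makes hidden format characters visible for review.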

Johann Rehberger

@wunderwuzzi23

2 days

RT @arstechnica: New hack uses prompt injection to corrupt Gemini’s long-term memory

0

25

0

Johann Rehberger

@wunderwuzzi23

3 days

RT @hackplayers: Hacking Gemini's Memory with Prompt Injection and Delayed Tool Invocation

0

4

0

Johann Rehberger

@wunderwuzzi23

3 days

12 months ago I predicted that this would be possible once Gemini had access to more powerful tools that "write" data. The memory feature is now such a state-changing tool. Back then there was only the Workspace Extension, which was limited to read-only operations.

0

4

Johann Rehberger

@wunderwuzzi23

3 days

Was just using Gemini to catch up on emails, and it randomly picked up the prompt injection test I had built for Apple Intelligence last year. 😂 Not only did it not pick the recent email, but it picked the one that contained the prompt injection over other options.

2

0

16

Johann Rehberger

@wunderwuzzi23

4 days

Seems to have been there for quite a while according to the article.

0

1

Johann Rehberger

@wunderwuzzi23

7 days

Download Link

0

1

13

Johann Rehberger

@wunderwuzzi23

10 days

RT @janleike: Super exciting robustness result: We built a system that defends against universal jailbreaks! It has minimal increase in r…

0

79

0

Johann Rehberger

@wunderwuzzi23

15 days

@valent1nee I used the "Contact Us" option in the UI and got directly connected and chatted with someone. So you could try that (though they're probably a bit busy now for direct comms) and/or send mail.

1

0

2

Johann Rehberger

@wunderwuzzi23

16 days

AGI not possible, licensing issue.

0

5

Johann Rehberger

@wunderwuzzi23

17 days

Hope it will do better than Grok 2 security-wise...

Mati Roy

@matiroy

18 days

what prompts are you looking forward to trying with Grok 3?

1

0

4

Johann Rehberger

@wunderwuzzi23

17 days

My initial thought was it's a scaling issue due to popularity, but status page mentions attacks. Wondering if they'll share details on what happened?

1

0

1