![Jie Zhang Profile](https://pbs.twimg.com/profile_images/1789191880464543744/GYadgV6C_x96.jpg)
Jie Zhang
@JieZhang_ETH
Followers: 188
Following: 54
Statuses: 29
Second-year PhD student at @ETH, AI privacy & security
Zurich
Joined September 2023
RT @javirandor: Adversarial ML research is evolving, but not necessarily for the better. In our new paper, we argue that LLMs have made pro…
0
25
0
We are excited that this work has been accepted by @satml_conf! We’ve put together a fun blog post, check it out here:
Still using MIA to detect the pre-training data of LLMs? Membership Inference Attacks cannot prove that a model was trained on your data!
1
6
28
RT @florian_tramer: We looked into "Ensemble Everything Everywhere", an adversarial examples defense that caused some excitement. But @Jie…
0
3
0
RT @AerniMichael: LLMs may be copying training data in everyday conversations with users! In our latest work, we study how often this happ…
0
22
0
Exciting opportunity! 🎉 Joining SPY Lab has been one of the best decisions I've ever made🙈
I'm looking forward to hiring PhD students and postdocs to push the boundaries of AI security/safety/privacy @CSatETH. If you're interested, please apply by **19 November 2024** to the ETHZ AI Center (link below). I'm also happy to chat with applicants at NeurIPS later this year.
0
0
4
@matthieu_meeus Hi Matthieu, interesting work! We also discussed MIA on LLMs in this paper: We find that MIA is fundamentally unsound for providing proof of training data usage
1
0
1
RT @javirandor: Jailbreak images for multimodal fusion models Unlike LLaVA, newer fusion models, like GPT-4o and Gemini, map all modalitie…
0
20
0
For more details, check our paper: Co-authored with @DebesheeDas, @florian_tramer, and @thegautamkamath!
0
1
8
I’ll be at #ICML2024 presenting our work at the GenLaw workshop (also accepted by #CCS). Feel free to DM me if you'd like to chat about AI privacy and security!
Heuristic privacy defenses claim to outperform DP-SGD in real-world settings. With no guarantees, can we trust them? We find that existing evaluations can underestimate privacy leakage by orders of magnitude! Surprisingly, high-accuracy DP-SGD (ϵ >> 1000) still wins. 🧵
4
3
15
RT @edoardo_debe: Does the instruction hierarchy introduced with GPT-4o mini work? We ran AgentDojo on it, and it looks like it does! GPT-…
0
5
0
RT @edoardo_debe: 1/📣We introduce the *prompt injector's dilemma*: as LLMs get deployed in search engines, we show that developers are ince…
0
15
0