![Tianshuo Cong Profile](https://pbs.twimg.com/profile_images/1723506604786450432/Jtt8RTKy_x96.jpg)
Tianshuo Cong
@TianshuoCong
Followers: 156 · Following: 71 · Statuses: 17
Postdoc (Current) & Ph.D. & B.Eng. @Tsinghua_Uni. | Previously visiting Ph.D. @CISPA. | ML security, privacy, and safety.
Beijing, China
Joined August 2021
RT @ACSAC_Conf: Congratulations to the recipients of the #ACSAC2024 Distinguished Artefact Reviewer Awards: Md Ajwad Akil, Dominik Roy Geor…
@llm_sec Thanks for sharing our work! 😄 We regard JailbreakEval as a catalyst that simplifies the evaluation process in jailbreak research and fosters an inclusive standard for jailbreak evaluation within the community 🚀🚀🚀
Thanks for sharing our work! 😄 We regard JailbreakEval as a catalyst that simplifies the evaluation process in jailbreak research and fosters an inclusive standard for jailbreak evaluation within the community 🚀🚀🚀
JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models. "We conduct a comprehensive analysis of jailbreak evaluation methodologies, drawing from nearly ninety jailbreak studies released between May 2023 and April 2024. Our study introduces a systematic taxonomy of jailbreak evaluators." "We propose JailbreakEval, a user-friendly toolkit focusing on the evaluation of jailbreak attempts. It includes various well-known evaluators out-of-the-box, so that users can obtain evaluation results with only a single command." (not peer reviewed) paper:
RT @YugengLiu: 🚀Just updated: We present our longitudinal robustness tests on LLaMA (v1, v2, v2 Chat, v3, and v3 Instruct), GPT-3.5 (v0613,…
@AllenXinleiHe @JeremyZhaozy @YugengLiu LM-SSP is inspired by other awesome projects such as @llm_sec and @topofmlsafety.
@AllenXinleiHe @JeremyZhaozy @YugengLiu
- 🌱 The list is a work in progress; feel free to recommend resources to us~
- We currently collect ~400 papers across 16 topics (Fig. 1 and Fig. 2).
- The large models we focus on are Large Language Models (LLMs), Vision-Language Models (VLMs), and Diffusion Models (Fig. 3).
Great blog! We recently proposed FigStep, a black-box jailbreaking algorithm that requires no gradient access.
It feels a bit intimidating to write about, but work on attacks can lead to good insights for mitigation. I plan to write about mitigation work separately later. Also want to thank all the researchers who have shared disclosure reports with us so far. 🙏🙏🙏
@lilianweng Great blog! We recently proposed FigStep, a black-box jailbreaking algorithm that requires no gradient access.
RT @realyangzhang: @AllenXinleiHe is on the job market (mainly) for a faculty position. He is amazing () and please…
RT @prisec_ml: Summer is over and we are back! Next seminar Wed, September 28th, 3:30 PM (Central European Time) Prof. Tianhao Wang (@big…
RT @realyangzhang: Today at @USENIXSecurity, @YugengLiu will present ML-Doctor. We establish a general platform to assess ML models’ vulner…