Xuandong Zhao

@xuandongzhao

Followers
1,427
Following
301
Media
46
Statuses
302

Postdoc @UC Berkeley CS; Research: ML, NLP, AI Safety

Goleta, CA
Joined May 2016
Pinned Tweet
@xuandongzhao
Xuandong Zhao
1 month
I am excited to join @UCBerkeley as a postdoc researcher in Prof. @dawnsongtweets 's group! My focus will be on Responsible Generative AI. Looking forward to tackling critical challenges and exploring new collaborations! Let's build a better AI future together!
Tweet media one
Tweet media two
16
6
432
@xuandongzhao
Xuandong Zhao
1 month
New #Nature study: Generative models forget true data distribution when recursively trained on synthetic data. As quality human data becomes scarce, caution is needed. This highlights the need for #watermarking to filter AI-generated content for future models.
Tweet media one
4
49
212
@xuandongzhao
Xuandong Zhao
3 months
Welcome to my PhD defense this Friday! (12 PM-1 PM PT) HFH 1132 and
Tweet media one
6
5
96
@xuandongzhao
Xuandong Zhao
1 year
🎉 Congratulations to my advisors @yuxiangw_cs and @lileics on their well-deserved promotions to associate professor. It has been an immense privilege to work with and learn from these exceptional researchers!
Tweet media one
2
3
87
@xuandongzhao
Xuandong Zhao
1 year
Unfortunately, none of my labmates received a Canada visa in time to attend ACL 2023. I'm disappointed we'll miss the chance for valuable in-person networking at the conference. #ACL2023NLP #ACL
4
1
61
@xuandongzhao
Xuandong Zhao
1 year
Heading to Hawaii for #ICML2023 ! Excited to present our work on text/image watermarking & privacy protection for LLMs. If you're interested in building trustworthy #GenerativeAI , my co-authors @yuxiangw_cs @lileics @kexun_zhang and I would love to meet up and chat!
Tweet media one
1
4
43
@xuandongzhao
Xuandong Zhao
1 year
🧵🔍 Excited to share our latest research on #AIGC and image #watermarking . We take a look at the existing watermarking schemes and propose a novel framework using generative autoencoders as watermark attackers. 🖼️🔓 Read the full paper here:
Tweet media one
3
9
42
@xuandongzhao
Xuandong Zhao
20 days
Excited that #ACL2024 has officially started! Unfortunately, due to visa issues, I won't be able to attend in person. However, I will be participating virtually and engaging with three major projects: 1️⃣ Watermarking Tutorial 2️⃣ LLM Self-Bias 3️⃣ GumbelSoft Watermark
1
3
33
@xuandongzhao
Xuandong Zhao
1 year
Thrilled to announce that our paper "Pre-trained Language Models can be Fully Zero-Shot Learners" has been accepted to #ACL2023 !
Tweet media one
1
8
32
@xuandongzhao
Xuandong Zhao
1 year
📢 Check out our new paper: "Provable Robust Watermarking for AI-Generated Text"! We propose a new watermarking method that has guaranteed quality, correctness, and security. I'm profoundly honored to collaborate with three Professors @yuxiangw_cs , @lileics , and Prabhanjan Ananth
@yuxiangw_cs
Yu-Xiang Wang
1 year
What is a *silver bullet* for LLM abuse🤔? If you ask around, many will say *Watermarks*. But it’s rather tricky to formalize, even according to Scott Aaronson. We just released a new paper with a provably robust watermark and found many new insights 🧵 1/
4
24
95
1
3
26
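The tweet doesn't spell out the construction, but provable text watermarks in this family typically use a keyed "green list" of tokens whose frequency is then tested statistically at detection time. A minimal sketch of that idea (the key, names, and parameters below are invented for illustration and are not taken from the paper):

```python
import hashlib
import math
import random

def green_list(vocab_size, key, fraction=0.5):
    # Derive a fixed pseudo-random "green" subset of the vocabulary from a secret key.
    seed = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    ids = list(range(vocab_size))
    random.Random(seed).shuffle(ids)
    return set(ids[: int(vocab_size * fraction)])

def detect(tokens, greens, fraction=0.5):
    # z-score against the null hypothesis "green tokens occur at the base rate".
    n = len(tokens)
    hits = sum(t in greens for t in tokens)
    return (hits - fraction * n) / math.sqrt(n * fraction * (1 - fraction))

greens = green_list(vocab_size=1000, key="secret-key")
watermarked = sorted(greens)[:100]   # toy "text" drawn only from green tokens
unmarked = list(range(100))          # toy "text" with no green bias
print(detect(watermarked, greens))   # large positive z-score -> watermarked
print(detect(unmarked, greens))      # near zero -> not watermarked
```

At generation time a watermarker would bias sampling toward green tokens; the detector then flags text whose z-score exceeds a threshold (e.g., 4), which bounds the false-positive rate.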
@xuandongzhao
Xuandong Zhao
20 days
Check the website of the Tutorial: Watermarking for Large Language Models
@xuandongzhao
Xuandong Zhao
20 days
1️⃣ Tutorial: Watermarking for Large Language Models 📅 August 11 (Today), 14:00 - 17:30 📍 Lotus Suite 5 – 7 My PhD advisors, @yuxiangw_cs and @lileics , will be presenting in person. Don’t miss this fantastic talk!
Tweet media one
1
6
16
0
4
28
@xuandongzhao
Xuandong Zhao
2 years
Excited to attend my first in-person conference as a PhD student! Looking forward to meeting with new friends @naaclmeeting #NAACL2022
Tweet media one
0
0
24
@xuandongzhao
Xuandong Zhao
1 month
@UCBerkeley @dawnsongtweets Grateful for an incredible 5-year journey @ucsbcs ! The PhD path isn't easy, but with the guidance of Prof. @yuxiangw_cs and @lileics , and the support of friends & collaborators, I made it! Check out my thesis, 'Empowering Responsible Use of Large Language Models,' if interested.
Tweet media one
1
0
19
@xuandongzhao
Xuandong Zhao
1 year
📢 Join us tomorrow at our oral talk in the LLM session (11:30-11:45)! I'll introduce NPPrompt, enabling LLMs to function as full zero-shot learners for language understanding. 📷 See you there! #ACL2023NLP
Tweet media one
@lileics
Lei Li
1 year
My group will present 5 papers at #ACL2023NLP . I will be onsite for all these papers. @xuandongzhao Siqi @WendaXu2 @jiangjie_chen Fei will join virtually on underline/gathertown. Welcome to talk to me or co-authors!
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
4
46
0
1
17
@xuandongzhao
Xuandong Zhao
20 days
1️⃣ Tutorial: Watermarking for Large Language Models 📅 August 11 (Today), 14:00 - 17:30 📍 Lotus Suite 5 – 7 My PhD advisors, @yuxiangw_cs and @lileics , will be presenting in person. Don’t miss this fantastic talk!
Tweet media one
1
6
16
@xuandongzhao
Xuandong Zhao
7 months
🚀 New Research Alert! Our latest paper introduces a new method to test the robustness of LLMs against jailbreaking attacks. Discover the "Weak-to-Strong Jailbreaking on Large Language Models". Paper: Code:
@_akhaliq
AK
7 months
Weak-to-Strong Jailbreaking on Large Language Models paper page: Although significant efforts have been dedicated to aligning large language models (LLMs), red-teaming reports suggest that these carefully aligned LLMs could still be jailbroken through
Tweet media one
4
47
196
1
2
13
@xuandongzhao
Xuandong Zhao
1 year
I woke up early today, only to find the ICML notification date changed to Apr 24... #ICML2023
1
0
16
@xuandongzhao
Xuandong Zhao
1 year
I've transitioned from my @OpenAI subscription plan to @poe_platform . While my monthly fee remains at $20, I now have access to additional LLMs like Claude and PaLM. However, one downside is the incompatibility with GPT4 plugins. #LLMs #AIGC
Tweet media one
2
1
15
@xuandongzhao
Xuandong Zhao
10 months
@YifanJiang17 "Average researcher" is already good enough. I am below the average😅
1
0
15
@xuandongzhao
Xuandong Zhao
1 year
Appreciate the kind words from @lileics and @Meng_CS ! Thank you so much!
@lileics
Lei Li
1 year
The session chair told me that the fully-zero shot talk by @xuandongzhao is the best received talk in the LLM, which attracts hundreds of audience at #ACL2023NLP . A line of people were asking questions. @siqi_ouyang
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
6
69
0
0
14
@xuandongzhao
Xuandong Zhao
1 year
Brilliant insights on key challenges shared by Turing winner Prof. Shafi Goldwasser
Tweet media one
@dawnsongtweets
Dawn Song
1 year
Really excited about our Future of Decentralization, AI, and Computing Summit, hosted by @BerkeleyRDI , with close to 3000 registrations, building towards a future of decentralized, responsible AI! Join us today @UCBerkeley in person or LIVE 🔗 !
Tweet media one
3
22
101
1
2
13
@xuandongzhao
Xuandong Zhao
1 month
For the papers I authored or reviewed at #NeurIPS2024 , there's always a "Rating: 3"😇
1
0
13
@xuandongzhao
Xuandong Zhao
1 month
@lateinteraction Unfortunately, the current evaluation system in academia and industry prioritizes the quantity of papers. There's no penalty for publishing 'too many' papers. For example, for a typical PhD to land a research scientist role in 2024, the only viable path seems to be publishing many LLM papers.
1
1
13
@xuandongzhao
Xuandong Zhao
4 months
Due to visa issues, I'm unable to attend ICLR in person. If you're interested in our paper, we'd love to discuss it with you on Zoom. Feel free to reach out!
@yuxiangw_cs
Yu-Xiang Wang
4 months
No last-minute approval granted this time😢 for #ICLR2024 . That said, I'm gonna try something new --- coauthors of this paper will present the poster on Zoom at the scheduled time: Come talk to us at Hall B #135 or from anywhere on earth btw 7:30-9:30PDT
Tweet media one
4
3
25
1
1
11
@xuandongzhao
Xuandong Zhao
4 months
NeurIPS submission is crazy. So far, there are already 16k submissions. With several reviews per paper and each reviewer handling 6 papers, ~10k reviewers are required. Since senior reviewers serve at the AC level and don't score papers directly, the question arises: are there enough qualified reviewers? #NeurIPS2024
1
0
10
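The back-of-the-envelope math behind the tweet's "~10k" figure (the reviews-per-paper load is my assumption; the tweet states only the submission count and the six-papers-per-reviewer figure):

```python
# Reviewer demand estimate for a large ML conference.
submissions = 16_000
reviews_per_paper = 4        # assumed typical load; not stated in the tweet
papers_per_reviewer = 6

total_reviews = submissions * reviews_per_paper
reviewers_needed = total_reviews / papers_per_reviewer
print(round(reviewers_needed))  # consistent with the tweet's "~10k"
```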
@xuandongzhao
Xuandong Zhao
1 year
Glad to see innovative LLM companies like @OpenAI , @CohereAI , and @InflectionAI   make privacy/safety a top priority. #LLM #AISafety
Tweet media one
Tweet media two
Tweet media three
2
0
11
@xuandongzhao
Xuandong Zhao
10 months
None of the current watermarking designs have a theoretical guarantee against paraphrasing attacks. Scott Aaronson suggests semantic watermarking as a potential improvement. Our research supports this, showing similar vulnerabilities in image watermarking:
@boazbaraktcs
Boaz Barak
10 months
1/5 New preprint w @_hanlin_zhang_ , Edelman, Francanti, Venturi & Ateniese! We prove mathematically & demonstrate empirically impossibility for strong watermarking of generative AI models. What's strong watermarking? What assumptions? See blog and 🧵
5
44
253
0
3
11
@xuandongzhao
Xuandong Zhao
2 years
Happy to share that our paper "Provably Confidential Language Modelling" got accepted to #NAACL2022 ! with @yuxiangw_cs , @lileics We propose a method to train language generation models while protecting the confidential segments #NLProc
Tweet media one
0
1
11
@xuandongzhao
Xuandong Zhao
6 months
🚨 New research alert! Check out our new findings: the self-refine pipeline in LLMs improves the fluency and understandability of model outputs, but it also amplifies self-bias. Paper:
@WendaXu2
Wenda Xu is on the job market
6 months
[New paper!] Can LLMs truly evaluate their own output? Can self-refine/self-reward improve LLMs? Our study reveals that LLMs exhibit biases towards their output. This self-bias gets amplified during self-refine/self-reward, leading to a negative impact on performance. @ucsbNLP
Tweet media one
5
53
211
0
2
11
@xuandongzhao
Xuandong Zhao
2 years
How to learn highly compact yet effective sentence representation? Our paper "Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation" will be in the VPS2 poster session virtually on May 24th at 11:00 PDT (19:00 Dublin time). #ACL2022
Tweet media one
1
1
11
@xuandongzhao
Xuandong Zhao
1 year
Thrilled to announce our paper, "Protecting Language Generation Models via Invisible Watermarking," has been accepted for #ICML2023 ! 🎉 Can't wait to meet you all in Hawaii this July!
@yuxiangw_cs
Yu-Xiang Wang
2 years
Great minds think alike😂. @tomgoldsteincs 's method is remarkably similar to (and independent of) the watermark for LLMs that @lileics @xuandongzhao and I came up with in . Our work shows that this watermark remains visible even after distillation! 1/x
1
2
23
0
1
11
@xuandongzhao
Xuandong Zhao
3 months
@xiangyuqi_pton Interesting work! We had similar findings in our weak-to-strong jailbreaking paper.
Tweet media one
1
0
10
@xuandongzhao
Xuandong Zhao
4 months
Glad to see leading AI companies starting to use text watermarking!
@GoogleDeepMind
Google DeepMind
4 months
In the coming months, we will be open-sourcing SynthID text watermarking. This will be available in our updated Responsible Generative AI Toolkit, which we created to make it easier for developers to build AI responsibly. Find out more. → #GoogleIO
4
4
36
0
0
9
@xuandongzhao
Xuandong Zhao
2 months
Nice work on GenAI copyright
@YangsiboHuang
Yangsibo Huang
2 months
Questions for GenAI & copyright researchers (w/ answers in ℂ𝕠𝕋𝕒𝔼𝕧𝕒𝕝: ): - Can 𝐬𝐲𝐬𝐭𝐞𝐦 𝐩𝐫𝐨𝐦𝐩𝐭/𝐮𝐧𝐥𝐞𝐚𝐫𝐧𝐢𝐧𝐠 prevent copyrighted content generation? - Does permissive training data still help if 𝐑𝐀𝐆 fetches copyrighted content?
1
18
133
1
0
9
@xuandongzhao
Xuandong Zhao
9 months
Excited to attend #NeurIPS2023 from December 10th to 15th! Can't wait to catch up with familiar faces and make new connections. I am on the job market this year. I would be happy to discuss any opportunities that may be a good fit.
0
0
9
@xuandongzhao
Xuandong Zhao
2 months
🚀🚨Excited to share our work: 𝗗𝗘-𝗖𝗢𝗣 on LLM membership inference attack! 🔍 Key features: • Works with black-box models • Extendable to test data contamination Join @avduarte3333 for an in-person discussion at our 𝗜𝗖𝗠𝗟 poster! 🔗 Paper:
@avduarte3333
André Duarte
2 months
Are you interested in Data Contamination? ☣️ Curious if LLMs were trained on copyrighted content? 🤔 Check out 𝗗𝗘-𝗖𝗢𝗣: Detecting Copyrighted Content in Language Models Training Data (𝗜𝗖𝗠𝗟 𝟮𝟬𝟮𝟰), our novel detection method fully compatible with black-box models!
Tweet media one
1
6
19
1
0
8
@xuandongzhao
Xuandong Zhao
3 months
My Twitter Interaction Circle ➡️
Tweet media one
0
0
8
@xuandongzhao
Xuandong Zhao
2 months
Asked four LLM systems for the latest news about Trump. Looks like Perplexity is the winner
Tweet media one
0
0
8
@xuandongzhao
Xuandong Zhao
1 month
Similar conclusions and findings can be found in a recent paper led by Stanford.
Tweet media one
0
3
7
@xuandongzhao
Xuandong Zhao
1 month
Feeling unsatisfied with your #EMNLP submission feedback? Try this fun task: Send the PDF version to and use the popular review GPTs like "Reviewer 2". You might be surprised by how similar the feedback is to what you got from the website and ARR!
0
0
8
@xuandongzhao
Xuandong Zhao
10 months
A line of zero-shot LLM-generated text detection tools also rely on such "new" forms of perplexity scores. But it's unclear if perplexity alone is enough or not...
@WeijiaShi2
Weijia Shi
10 months
Ever wondered which data black-box LLMs like GPT are pretrained on? 🤔 We build a benchmark WikiMIA and develop Min-K% Prob 🕵️, a method for detecting undisclosed pretraining data from LLMs (relying solely on output probs). Check out our project: [1/n]
Tweet media one
16
139
663
1
0
8
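As a toy illustration of the perplexity-style scores being discussed (the per-token log-probabilities below are made up; a real detector would obtain them from a language model), a plain perplexity score can be contrasted with a Min-K%-style score that averages only the most surprising tokens:

```python
import math

def perplexity(logprobs):
    # Standard perplexity: exp of the negative mean token log-probability.
    return math.exp(-sum(logprobs) / len(logprobs))

def min_k_score(logprobs, k=0.2):
    # Min-K%-style score: average only the lowest k fraction of log-probs,
    # emphasizing the tokens the model found most surprising.
    lows = sorted(logprobs)[: max(1, int(len(logprobs) * k))]
    return sum(lows) / len(lows)

human = [-4.1, -3.8, -5.0, -2.9, -4.4]      # made-up: "surprising" human text
machine = [-0.4, -0.6, -0.3, -0.9, -0.5]    # made-up: fluent model output
print(perplexity(human) > perplexity(machine))    # True
print(min_k_score(human) < min_k_score(machine))  # True
```

Both scores separate these toy inputs, but as the tweet notes, it is unclear that any fixed function of output probabilities alone remains robust across domains and languages.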
@xuandongzhao
Xuandong Zhao
1 year
Just ran a quick experiment on the watermarking system. It appears it's not resistant to paraphrasing attacks. The first image shows the first example they provided. I asked Claude for a rewrite, and intriguingly, the detector failed.
Tweet media one
Tweet media two
Tweet media three
@DrJimFan
Jim Fan
1 year
Stanford developed an LLM watermarking algorithm robust to certain paraphrasing and distortion, such as translating to French and back to English.🤔 Interesting paper, but I still find it counter-intuitive. Would this imply that there’s a subspace of natural language strings
Tweet media one
7
23
194
0
0
8
@xuandongzhao
Xuandong Zhao
7 months
Excited to share our latest work on Permute-And-Flip decoding! 🚀 A huge shoutout to my advisors @yuxiangw_cs and @lileics for their invaluable contributions! Dive into our paper and explore the code here:
@yuxiangw_cs
Yu-Xiang Wang
7 months
🚀Exciting new advance in #LLM decoding and #watermarking from @xuandongzhao ! Introduce *Permute-And-Flip decoding*: a drop-in replacement of your favorite softmax sampling (or its Top-p / Top-k variant) that you should try today: A TL;DR thread🧵 1/
1
10
43
0
0
7
@xuandongzhao
Xuandong Zhao
10 months
It is fun to read Anthropic's blog on "how to evaluate AI systems". In human society, for example, PhD applicants need to show exam results, rec letters, and statements. So what is the equivalent of a "recommendation letter" for evaluating an LLM?🧐 #LLM
0
0
6
@xuandongzhao
Xuandong Zhao
1 year
We will see lots of "papers" evaluating the ability of Claude 2 :)
@AnthropicAI
Anthropic
1 year
Introducing Claude 2! Our latest model has improved performance in coding, math and reasoning. It can produce longer responses, and is available in a new public-facing beta website at in the US and UK.
Tweet media one
162
522
2K
1
1
7
@xuandongzhao
Xuandong Zhao
2 months
@WilliamWangNLP had a great talk about jailbreaking in LLMs at @ACTION_NSF_AI "Unveiling Hidden Vulnerabilities: Exploring Shadow Alignment and Weak-to-Strong Jailbreaking in Large Language Models"
Tweet media one
0
1
7
@xuandongzhao
Xuandong Zhao
1 year
@demishassabis @GoogleDeepMind @googlecloud Impressed by the new watermarking tool, but have you considered attacks like regeneration? Our research proves these can effectively remove any pixel-based invisible watermarks, regenerating images closely resembling the original. Paper:
0
1
5
@xuandongzhao
Xuandong Zhao
10 months
Happy to be part of the SoCal NLP gathering at UCLA! Excited to engage with fellow NLP enthusiasts and experts. #SoCalNLP #UCLA #NLP
Tweet media one
0
0
6
@xuandongzhao
Xuandong Zhao
2 years
#emnlp2022 #nlproc #emnlp How can we protect the intellectual property of trained NLP models? I am happy to share our new paper “Distillation-Resistant Watermarking for Model Protection in NLP”. Don’t miss it! (1/n) Code: Paper:
1
0
6
@xuandongzhao
Xuandong Zhao
1 year
Time to try our watermark attack
@GoogleDeepMind
Google DeepMind
1 year
We’re excited to launch 𝗦𝘆𝗻𝘁𝗵𝗜𝗗 today with @GoogleCloud : a digital tool to watermark and identify AI-generated images. 🖼️ It will be available on Imagen, one of @Google ’s latest text-to-image models. Here’s how it works. 🧵 #GoogleCloudNext
49
208
716
0
0
5
@xuandongzhao
Xuandong Zhao
20 days
2️⃣ [Oral] Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement Awesome @WendaXu2 will present: 📍 Poster: Convention Center A1 🗓️ Monday, August 12, 2024, 14:00 - 15:30 PM 🎙️ Oral: Lotus Suite 5-7 📅 Tuesday, August 13, 2024, 11:15 - 11:30 AM
Tweet media one
1
2
5
@xuandongzhao
Xuandong Zhao
1 month
With the success of GPT-4o Mini, the next wave of academic research is likely to herald a renaissance in knowledge distillation.
0
0
5
@xuandongzhao
Xuandong Zhao
9 months
Nice demo!
@pika_labs
Pika
9 months
Introducing Pika 1.0, the idea-to-video platform that brings your creativity to life. Create and edit your videos with AI. Rolling out to new users on web and discord, starting today. Sign up at
1K
5K
26K
0
0
5
@xuandongzhao
Xuandong Zhao
11 months
We will talk about watermarking for LLMs at ACL 2024 @yuxiangw_cs @lileics @aclmeeting
@aclmeeting
ACL 2025
11 months
ACL 2024 (1/4) @aclmeeting - Watermarking for Large Language Model. Xuandong Zhao, Yu-Xiang Wang and Lei Li. - AI for Science in the Era of Large Language Models. Zhenyu Bi, Minghao Xu, Jian Tang and Xuan Wang. #NLProc
1
1
10
0
1
5
@xuandongzhao
Xuandong Zhao
9 months
Is there an unspoken rule that CS PhDs must wear glasses?🧐
Tweet media one
1
0
5
@xuandongzhao
Xuandong Zhao
8 months
A screenshot from Professor Recht's new blog:
Tweet media one
0
0
5
@xuandongzhao
Xuandong Zhao
1 year
We're exploring similar edit robustness issues of watermarking from a different angle. Read more:
@percyliang
Percy Liang
1 year
Two properties of our watermarking strategy: 1) It preserves the LM distribution 2) Watermarked text can be distinguished from non-watermarked text (given a key) How can both be true? Answer: p(text) = \int p(text | key) p(key) d key Detector also doesn't need to know the LM!
6
20
80
0
0
5
@xuandongzhao
Xuandong Zhao
2 months
@docmilanfar The metric or reward function in academia encourages people to do that
0
0
5
@xuandongzhao
Xuandong Zhao
4 months
Here is our poster!
Tweet media one
0
2
5
@xuandongzhao
Xuandong Zhao
8 months
@WenhuChen One could interpret AI4Science to mean "AI for publishing papers in Science"
0
0
3
@xuandongzhao
Xuandong Zhao
1 year
Many of my research endeavors explore text and image watermark techniques. Looking forward to insightful discussions on responsible AI at #ICML2023 !
@verge
The Verge
1 year
Meta, Google, and OpenAI promise the White House they’ll develop AI responsibly
Tweet media one
68
33
178
0
0
5
@xuandongzhao
Xuandong Zhao
10 days
Finally, I was able to watch the video and truly admire the academic rigor of the experiment design and presentation by @ZeyuanAllenZhu I encourage everyone to study this tutorial. It would be great to see these findings applied to the next version of LLMs.
@ZeyuanAllenZhu
Zeyuan Allen-Zhu
11 days
YouTube video is now restored. As I predicted, AGI didn't arrive before Aug 20. PS: Part 2.2 paper is in its very final processing stage; we are one man down (due to a layoff) so please forgive us for just a little more time.
4
41
329
0
0
22
@xuandongzhao
Xuandong Zhao
1 month
Cool! The 2B model is much more accessible for those in academia! 🤣
@robdadashi
Robert Dadashi
1 month
Gemma 2 2B is here! Fantastic performance for size, it's great for research and applications. I am very proud of the progress our team made over the last few months!
Tweet media one
4
33
180
0
1
5
@xuandongzhao
Xuandong Zhao
5 months
@xwang_lk There is more analysis in this paper:
Tweet media one
1
0
5
@xuandongzhao
Xuandong Zhao
1 month
@srush_nlp Yes, while current LLMs aren't a serious threat to humans, there are some news articles discussing real cases of jailbreaks. 1. 2. 3.
1
0
5
@xuandongzhao
Xuandong Zhao
20 days
3️⃣ GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick Awesome @AlfonsoGerard_ will present: 📍 Poster: Convention Center A1 📅 Monday, August 12, 2024, 11:00 - 12:30
Tweet media one
1
3
4
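GumbelSoft builds on the Gumbel-max trick. As a rough illustration of that underlying trick (this is not the paper's scheme, which derives the noise from a keyed hash of the context rather than fresh randomness, so detection is possible):

```python
import math
import random

def gumbel_max_sample(probs, rng):
    # Gumbel-max trick: argmax_i (log p_i + g_i) with g_i ~ Gumbel(0, 1)
    # is an exact sample from the categorical distribution `probs`.
    scores = [math.log(p) - math.log(-math.log(rng.random())) for p in probs]
    return max(range(len(probs)), key=scores.__getitem__)

rng = random.Random(0)
probs = [0.6, 0.3, 0.1]
counts = [0, 0, 0]
for _ in range(10_000):
    counts[gumbel_max_sample(probs, rng)] += 1
print([c / 10_000 for c in counts])  # roughly [0.6, 0.3, 0.1]
```

Because the sample is a deterministic function of the probabilities and the noise, a detector that can recompute the keyed noise can check whether observed tokens consistently "won" the argmax.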
@xuandongzhao
Xuandong Zhao
11 months
@YigitcanKaya1 @UCSB @giovanni_vigna Welcome to UCSB! We should plan a coffee chat sometime~
0
0
3
@xuandongzhao
Xuandong Zhao
15 days
Gemini and Grok seem to be on opposite ends in addressing safety/copyright issues🧐
0
0
4
@xuandongzhao
Xuandong Zhao
30 days
Does anyone know how to handle reviewers asking for comparisons with your follow-up works? Are there any guidelines for this? @NeurIPSConf #NeurIPS2024
1
0
4
@xuandongzhao
Xuandong Zhao
10 months
Remarkably, AI can now mimic actors' accents and translate into different languages. Reports suggest the method behind this is HeyGen: . An open-source version could be transformative. But, we must also prioritize tools to detect such manipulated videos.
@mranti
Michael Anti
10 months
Come see Mr. Zhao Benshan's authentic London-accent English.
227
418
2K
0
0
4
@xuandongzhao
Xuandong Zhao
2 years
Welcome to our Oral talk this afternoon 2:45-3:00pm at Columbia A!
@xuandongzhao
Xuandong Zhao
2 years
Happy to share that our paper "Provably Confidential Language Modelling" got accepted to #NAACL2022 ! with @yuxiangw_cs , @lileics We propose a method to train language generation models while protecting the confidential segments #NLProc
Tweet media one
0
1
11
0
0
4
@xuandongzhao
Xuandong Zhao
8 months
@kexun_zhang Creating datasets and running benchmarks are all you need
0
0
4
@xuandongzhao
Xuandong Zhao
7 months
When you read a really solid paper, you can't help but sincerely admire the authors.
0
0
3
@xuandongzhao
Xuandong Zhao
3 months
@xwang_lk @WilliamWangNLP I guess they may use an image API to process the video.
1
0
4
@xuandongzhao
Xuandong Zhao
4 months
Safety and profit often seem like a paradox in industry. However, AI safety, especially given the incredible capabilities of large models, is crucial for all of humanity. This is also one of the most important research goals in academia.
@janleike
Jan Leike
4 months
Yesterday was my last day as head of alignment, superalignment lead, and executive @OpenAI .
535
2K
12K
0
0
3
@xuandongzhao
Xuandong Zhao
1 month
Ethical data use must be the top priority in AI development. Membership inference attacks, especially for closed-source models, are essential tools for detecting violations. Discover more insights in our ICML paper, DE-COP:
@WIRED
WIRED
1 month
“It’s theft.” A WIRED investigation found that subtitles from 173,536 YouTube videos, siphoned from more than 48,000 channels, were used by Anthropic, Nvidia, Apple, and Salesforce to train AI.
9
35
75
0
0
4
@xuandongzhao
Xuandong Zhao
6 months
🚀 Sora's advancements are amazing, but such techniques blur the lines between reality and simulation. Watermarking is a promising approach! We have conducted several works to address this concern. Image watermark attack: #Sora #Watermark #AIGC
@OpenAI
OpenAI
7 months
Introducing Sora, our text-to-video model. Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. Prompt: “Beautiful, snowy
10K
32K
138K
1
0
3
@xuandongzhao
Xuandong Zhao
10 months
More detailed government actions towards the safe and responsible use of AI.
Tweet media one
1
0
3
@xuandongzhao
Xuandong Zhao
1 month
Prof. Raj Reddy, Turing Award laureate, shared his insights on the “Perils of AI” at #AAPM2024 in Stanford
Tweet media one
Tweet media two
0
0
3
@xuandongzhao
Xuandong Zhao
1 year
After using both for a while, I've found Claude 2 to be slightly better than GPT-4. I've now set Claude 2 as my default model, aligning with recent findings that GPT-4's performance declines over time... #ChatGPT #Claude #OpenAI #Anthropic
@xuandongzhao
Xuandong Zhao
1 year
I've transitioned from my @OpenAI subscription plan to @poe_platform . While my monthly fee remains at $20, I now have access to additional LLMs like Claude and PaLM. However, one downside is the incompatibility with GPT4 plugins. #LLMs #AIGC
Tweet media one
2
1
15
0
0
3
@xuandongzhao
Xuandong Zhao
1 year
@GoogleDeepMind @googlecloud @Google Impressed by the new watermarking tool, but have you considered attacks like regeneration? Our research proves these can effectively remove any pixel-based invisible watermarks, regenerating images closely resembling the original.
0
0
3
@xuandongzhao
Xuandong Zhao
1 year
2. We live in a time when posting images online can be risky. Malicious users may misuse these images, violating copyright and privacy. Furthermore, generative AI like DALLE-2 and Imagen can generate highly realistic images which could mislead people. 🧑‍💻🎭
1
0
3
@xuandongzhao
Xuandong Zhao
20 days
Wishing everyone a fantastic #ACL2024 conference! ✨ Enjoy your time in Bangkok🇹🇭! @aclmeeting #NLP #LLMs
0
1
3
@xuandongzhao
Xuandong Zhao
25 days
@johnschulman2 Congrats to @AnthropicAI on getting a real legend! Wishing you all the best in this new chapter!
0
0
3
@xuandongzhao
Xuandong Zhao
2 months
Cool work about preference learning!
@WendaXu2
Wenda Xu is on the job market
2 months
Still playing around with offline DPO? Big mistake!🤯 We claim two things: 1) We should NOT use a static preference data 2) We should NOT use a fixed reference model Our on-policy BPO outperforms offline DPO: TL;DR (72.0%➡️89.5%), HH(82.2%➡️93.5%, 77.5%➡️97.7%) @ucsbNLP
Tweet media one
4
35
130
0
0
3
@xuandongzhao
Xuandong Zhao
10 months
@AnsongNi We have works that plant watermarks in the training data of models, e.g., to infer the model’s training set or otherwise influence the model’s output
0
0
3
@xuandongzhao
Xuandong Zhao
4 months
Curious about the impact of LLMs on academic writing and reviews? Check out our latest paper and code here:
@liang_weixin
Weixin Liang
4 months
🚀 Exciting news! Our code for the "Monitoring AI-Modified Content at Scale (ICML 2024)" and "Mapping the Increasing Use of LLMs in Scientific Papers" is now open-source. 🔥🔥🔥We've developed a simple and effective method for estimating the fraction of text in a large corpus
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
8
19
0
0
2
@xuandongzhao
Xuandong Zhao
8 months
This is all about the evaluation system. It's very difficult to objectively evaluate a researcher or a PhD student.
@beenwrekt
Ben Recht
9 months
Since we just wrapped up an AI megaconference, it felt like a good day to plead for fewer papers.
32
162
853
0
0
2
@xuandongzhao
Xuandong Zhao
1 year
4. We propose a framework using generative autoencoders as watermark attackers. The watermarked image is encoded to a latent code and then decoded to a reconstructed image, effectively erasing the watermark.
Tweet media one
1
0
2
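A cartoon of this encode-then-decode attack on a 1-D "image" (everything here is a stand-in: a real attack uses a learned generative autoencoder such as a VAE or diffusion model, not the moving-average reconstruction below, and real watermarks are not a fixed alternating pattern):

```python
import math

def embed(signal, eps=0.05):
    # Toy "invisible watermark": a tiny alternating high-frequency perturbation.
    return [x + eps * (1 if i % 2 == 0 else -1) for i, x in enumerate(signal)]

def detect(signal, original, eps=0.05):
    # Correlate the residual against the known watermark pattern.
    residual = [s - o for s, o in zip(signal, original)]
    return sum(r * eps * (1 if i % 2 == 0 else -1) for i, r in enumerate(residual))

def regenerate(signal, w=3):
    # Stand-in for a generative autoencoder: a lossy low-pass reconstruction
    # that keeps the smooth "content" but discards high-frequency detail.
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - w): i + w + 1]
        out.append(sum(window) / len(window))
    return out

original = [math.sin(i / 10) for i in range(200)]
marked = embed(original)
attacked = regenerate(marked)
print(detect(marked, original))    # strong watermark correlation
print(detect(attacked, original))  # watermark largely erased
```

The point of the toy: the reconstruction stays close to the original content, yet the high-frequency watermark signal that the detector relies on is mostly gone.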
@xuandongzhao
Xuandong Zhao
7 months
📄This weak-to-strong amplification aligns with concurrent works on empowering models with instruction following (Liu et al., 2024) and disentangling acquired knowledge (Mitchell et al., 2023). [7/8]
1
0
2
@xuandongzhao
Xuandong Zhao
10 months
@YangsiboHuang My concern is that manually designed perplexity scores may not be robust across diverse distributions. In my experience, zero-shot LLM text detection tools aren't robust across languages.
1
0
1
@xuandongzhao
Xuandong Zhao
5 months
1
0
2
@xuandongzhao
Xuandong Zhao
4 months
@WenhuChen For high-quality data like books, I don't think human writing speed can catch up with the consumption speed of LLMs.
1
0
2
@xuandongzhao
Xuandong Zhao
2 years
Amazing
@jdjkelly
josh
2 years
Google is done. Compare the quality of these responses (ChatGPT)
Tweet media one
Tweet media two
991
4K
27K
1
0
2
@xuandongzhao
Xuandong Zhao
1 year
5. Our evaluation demonstrates that generative autoencoders, especially diffusions, can remove more invisible watermarks than most existing attackers, while preserving image quality. This reveals vulnerabilities in existing watermark schemes. 📊🔓
Tweet media one
1
0
2
@xuandongzhao
Xuandong Zhao
1 year
3. Major tech companies like Google are developing tools to trace image origins or identify synthetically generated content. Invisible watermarks are one such tool used to embed secret messages detectable only by the owner. 🕵️‍♂️🖼️
1
0
2
@xuandongzhao
Xuandong Zhao
10 months
Amazing!
0
1
2