Daniel Kang Profile
Daniel Kang

@daniel_d_kang

Followers
3,706
Following
89
Media
22
Statuses
262

Asst. professor at UIUC CS. Formerly in the Stanford DAWN lab and the Berkeley Sky Lab.

Stanford, CA
Joined November 2010
@daniel_d_kang
Daniel Kang
3 years
ML models are being deployed in mission-critical settings, such as autonomous vehicles. Shockingly, the data used to train these models are rarely checked! The Lyft Level 5 dataset has errors in 70% of the validation scenes, see our blog post: (1/5)
22
318
2K
@daniel_d_kang
Daniel Kang
2 years
Some professional news: I will be starting as an asst. professor at UIUC in fall 2023! And I'll be spending the upcoming year at UC Berkeley as a postdoc with Ion Stoica. 1/4
29
22
680
@daniel_d_kang
Daniel Kang
1 year
Verified ML in the form of ZKML has captured significant interest. But it's too slow in practice, taking 6 hours to verify the Twitter recommendation model. Enter TensorPlonk, a new ZKML proving system with >1,000x faster proving 📝Blog post: 🧵 1/9
14
83
429
@daniel_d_kang
Daniel Kang
2 years
As ML becomes increasingly complex, ML-as-a-service (MLaaS) providers are proliferating (OpenAI, Google, AWS, etc.), which raises an important question: how can we trust MLaaS providers? Today, we show how to trustlessly verify model predictions with zero-knowledge proofs! 1/6
11
77
431
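The interface this thread describes can be sketched with a toy stand-in. This is not a real ZK-SNARK: a real system replaces the commitment-plus-recomputation below with a succinct proof that the consumer can check without ever seeing the weights. All names and the "model" here are hypothetical.

```python
import hashlib
import json

# Toy sketch of the verify-without-revealing-weights interface: the MLaaS
# provider keeps weights private and publishes only a commitment. A real
# ZK-SNARK would let the consumer check a (prediction, proof) pair against
# the commitment; here we only illustrate the commitment side.

def commit(weights):
    """Public, binding commitment to the private model weights."""
    return hashlib.sha256(json.dumps(weights).encode()).hexdigest()

def predict(weights, x):
    """Stand-in 'model': a dot product."""
    return sum(w * xi for w, xi in zip(weights, x))

weights = [0.5, -1.0, 2.0]             # private to the provider
commitment = commit(weights)           # public
y = predict(weights, [1.0, 2.0, 3.0])  # y = 4.5
```

In the real protocol, the provider would ship `(y, proof)` and the consumer would verify the proof against `commitment` and the input, never seeing `weights`.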
@daniel_d_kang
Daniel Kang
7 months
As LLMs have improved in their capabilities, so have their dual-use capabilities. But many researchers think they serve as a glorified Google. We show that LLM agents can autonomously hack websites, demonstrating they can produce concrete harm. Paper: 1/5
@zacharylipton
Zachary Lipton
10 months
Is there any known case of anyone accessing “harmful capabilities” of an LLM that didn’t consist of knowledge already freely available and clearly described in documents on the open web? Is the fear that we are basically just getting what we would already have if Google / Bing
35
43
300
12
107
427
@daniel_d_kang
Daniel Kang
10 months
OpenAI announced GPT-4 fine-tuning this week. Fine-tuning can remove RLHF protections from weak models, but is GPT-4 susceptible? Unfortunately yes: removing RLHF protections from GPT-4 is trivial Paper: 🧵1/6
15
77
331
@daniel_d_kang
Daniel Kang
1 year
I'm excited to announce our library zkml for trustless machine learning! After months of hard work, we've supercharged its performance & expanded its capabilities. Now, zkml achieves 92% accuracy on ImageNet! Blog post: GitHub: 1/
13
147
315
@daniel_d_kang
Daniel Kang
1 year
Twitter open-sourced their recommendation algorithm, but the weights remain hidden! How can we trust it? We'll show how to verify the Twitter algorithm with zkml! 📝 Blog post: GitHub: 1/6
4
57
272
@daniel_d_kang
Daniel Kang
1 year
Our open-source release of zkml empowers anyone to verify a model executed honestly without seeing the weights for a wide range of models! Let’s dive into zkml’s capabilities Full post: GitHub: 1/
7
32
206
@daniel_d_kang
Daniel Kang
2 years
🚨 I'm recruiting PhD students for 2023! 🚨 If you're excited about building tools to make ML-based analytics accessible to everyone or verifying ML inference, apply to the CS PhD program at UIUC and mention my name. Please retweet and share! Examples of my research are below 👇
11
41
199
@daniel_d_kang
Daniel Kang
1 year
To safeguard trade secrets, LLMs like @OpenAI 's ChatGPT are closed off, impacting trust. Recent alterations in ChatGPT outputs sparked cost-saving downgrade rumors (see link). How can we reconcile trade secret protection & trust? New blog post on how: 1/
10
17
120
@daniel_d_kang
Daniel Kang
3 months
@OpenAI claimed in their GPT-4 system card that it isn't effective at finding novel vulnerabilities. We show this is false. AI agents can autonomously find and exploit zero-day vulnerabilities. Paper: 🧵 1/7
5
40
119
@daniel_d_kang
Daniel Kang
3 months
Honored to be awarded the ACM SIGMOD Jim Gray Doctoral Dissertation award! It wouldn't have been possible without the amazing support of my advisors @pbailis , @matei_zaharia , and @tatsu_hashimoto , and many many others who supported me throughout my PhD :)
@pbailis
Peter Bailis
3 months
Congratulations to @daniel_d_kang , recipient of this year's ACM SIGMOD Jim Gray Doctoral Dissertation Award for his thesis (co-advised with @matei_zaharia and @tatsu_hashimoto ) on "Efficient and accurate systems for querying unstructured data"!
3
8
61
14
7
107
@daniel_d_kang
Daniel Kang
2 years
Can you tell which images are real? I couldn't 😱. AI is increasing the realism of deepfakes, which are being used to spread misinformation and steal funds. We're announcing zk-img to fight deepfakes by certifying whether an image was taken by a real camera () 1/6
2
17
104
@daniel_d_kang
Daniel Kang
5 months
As ML proliferates, society has called for transparency into ML systems. How can we balance this with the need to protect trade secrets? We introduce ZKAudit to solve this problem. Paper: Blog: 🧵 1/5
2
30
104
@daniel_d_kang
Daniel Kang
5 months
We showed that LLM agents can autonomously hack mock websites, but can they exploit real-world vulnerabilities? We show that GPT-4 is capable of real-world exploits, where other models and open-source vulnerability scanners fail. Paper: 1/7
@daniel_d_kang
Daniel Kang
7 months
As LLMs have improved in their capabilities, so have their dual-use capabilities. But many researchers think they serve as a glorified Google. We show that LLM agents can autonomously hack websites, demonstrating they can produce concrete harm. Paper: 1/5
12
107
427
5
31
101
@daniel_d_kang
Daniel Kang
10 months
It's that time of year again! I'm actively recruiting students of all levels to work in my lab (PhD, MS, undergrad) Please apply directly to the UIUC PhD/MS program and reach out for a starter task if you're interested See below for a sampling of my recent work ⬇️
4
27
95
@daniel_d_kang
Daniel Kang
1 year
AI-generated audio is increasingly realistic and is being used for fraud, etc. We ( @kobigurk , @AnnaRRose ) show how to fight AI-audio with cryptographic techniques! Read more about our attested audio experiment: And listen: 1/
@RachelTobac
Rachel Tobac
1 year
Here’s how I used AI to clone a 60 Minutes correspondent’s voice to trick a colleague into handing over her passport number. I cloned Sharyn’s voice then manipulated the caller ID to show Sharyn’s name with a spoofing tool. The hack took 5 minutes total for me to steal the info.
241
6K
19K
5
16
73
@daniel_d_kang
Daniel Kang
1 year
I had a blast talking at ZkSummit, which was live streamed. The recording is here - I talk about how zkml can be used and a bit about how we scaled it: Thanks to @AnnaRRose for hosting such a fun event!
2
6
63
@daniel_d_kang
Daniel Kang
10 months
Everyone from business analysts to legal scholars wants to use ML to understand their unstructured data. But it’s costly and difficult. We’re announcing AIDB, an open-source framework that makes analyzing unstructured data as simple as running a SQL query! 🧵 1/7
1
17
51
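A minimal sketch of what "SQL over unstructured data" can look like: an ML extractor (stubbed out here) turns raw documents into rows, which are then queried with ordinary SQL. The schema, documents, and extractor below are my own invention for illustration, not AIDB's actual API.

```python
import sqlite3

def extract_entities(doc):
    # Stand-in for an ML extractor (e.g., an NER model): here, just
    # capitalized words. A real system would run a model per document.
    return [w for w in doc.split() if w.istitle()]

docs = ["Alice sued Bob in Delaware", "Carol met Alice in Chicago"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mentions (doc_id INTEGER, entity TEXT)")
for i, doc in enumerate(docs):
    for e in extract_entities(doc):
        conn.execute("INSERT INTO mentions VALUES (?, ?)", (i, e))

# Once extraction has populated the table, analysis is plain SQL.
rows = conn.execute(
    "SELECT entity, COUNT(*) AS n FROM mentions "
    "GROUP BY entity ORDER BY n DESC, entity"
).fetchall()
```

The point of a framework like AIDB is that the extraction step runs lazily and only as needed to answer the query, rather than eagerly over the whole corpus.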
@daniel_d_kang
Daniel Kang
7 months
We further show a strong scaling law, with only GPT-4 and GPT-3.5 successfully hacking websites (73% and 7%, respectively). No open-source model successfully hacks websites. 3/5
3
4
54
@daniel_d_kang
Daniel Kang
7 months
Our results raise questions about the widespread deployment of LLMs, particularly open-source LLMs. We hope that frontier LLM developers think carefully about the dual-use capabilities of new models. 4/5
3
2
46
@daniel_d_kang
Daniel Kang
7 months
Our LLM agents can perform complex hacks like blind SQL union attacks. These attacks can take up to 45+ actions to perform and require the LLM to take actions based on feedback 2/5
3
0
45
@daniel_d_kang
Daniel Kang
2 years
ChatGPT and LLMs are incredibly useful but can be used maliciously. Our new work shows how these LLMs may attract increasingly sophisticated attacks (enabled by instruction-following capabilities) and adversaries (from economic incentives). Read more: 1/7
1
6
45
@daniel_d_kang
Daniel Kang
6 months
Had a blast talking to congressional staffers the other day! Lots of excitement on the hill about AI policy :)
@uofigovrelation
University of Illinois System Gov Relations
6 months
Today, @IllinoisCS Professor Daniel Kang briefed congressional staff about emerging technologies in AI and machine learning at the invitation of the Senate AI Caucus
2
1
11
1
2
44
@daniel_d_kang
Daniel Kang
1 month
I helped with AddisCoder this year! Had a great time teaching. Need to work on my whiteboard and selfie skills though
@minilek
Jelani Nelson
1 month
Off to AddisCoder — little one’s first Ethiopia trip.
11
10
666
0
1
42
@daniel_d_kang
Daniel Kang
2 years
I had a great time chatting with @AnnaRRose , @tarunchitra , and @theyisun about ZK + ML! And stay tuned for an open-source code release in the coming weeks :)
@zeroknowledgefm
Zero Knowledge Podcast
2 years
This week, @AnnaRRose and @tarunchitra dive into the topic of ZK ML with guests @theyisun & @daniel_d_kang . They discuss their move into ZK, the fascinating intersection between ZK+ML and the potentially powerful uses for these combined technologies
2
12
49
0
3
38
@daniel_d_kang
Daniel Kang
6 months
🚨 LLM agents can be compromised by content from external sources. Wonder how vulnerable they are? 🌟 Introducing InjecAgent for evaluating the resilience of LLM agents against IPI (indirect prompt injection) attacks. 📄 Paper: 💻 Code: 1/5
1
8
38
@daniel_d_kang
Daniel Kang
1 year
Using our open-source framework zkml (), we can provide trustless execution of ML models, including GPT, BERT, and more. This can be done _without_ revealing the proprietary weights! 3/
5
5
37
@daniel_d_kang
Daniel Kang
3 years
We developed LOA to find such errors in perception data (accepted to SIGMOD 2022). We deployed LOA over the Lyft Level 5 perception dataset and successfully found errors in every validation scene with an error! (3/5)
1
2
36
@daniel_d_kang
Daniel Kang
4 years
We have a new blog post on accelerating queries over unstructured data with ML (part 1): (full paper here: ) (1/4)
1
4
32
@daniel_d_kang
Daniel Kang
9 months
Can someone explain to me like I'm five how any of this makes sense: 1. No safety concerns, no impropriety 2. Ilya is on the board and could have voted to not fire Sam (3-3) 3. Ilya signs this letter 🤔
@balajis
Balaji
9 months
500+ OpenAI employees will quit and join Microsoft unless the board resigns and reinstates Sam and Greg.
441
1K
7K
8
1
31
@daniel_d_kang
Daniel Kang
3 years
Running queries over unstructured data? Our new work on indexes, TASTI, can accelerate queries by up to 20x! We describe our work in a new blog post (accepted to SIGMOD 2022): (full paper: ) 1/6
2
6
28
@daniel_d_kang
Daniel Kang
2 years
Not sure how I missed it, but congratulations to my former labmate @kexinrong for the Honorable Mention for the 2022 SIGMOD Jim Gray Doctoral Dissertation Award!! Kudos to @pbailis and Phil Levis for their amazing advising as well!
0
1
30
@daniel_d_kang
Daniel Kang
1 year
I'll also be at ZkSummit () giving a talk about zkml at 5:30PM local time! Say hi if you see me :)
@daniel_d_kang
Daniel Kang
1 year
I'm excited to announce our library zkml for trustless machine learning! After months of hard work, we've supercharged its performance & expanded its capabilities. Now, zkml achieves 92% accuracy on ImageNet! Blog post: GitHub: 1/
13
147
315
3
2
27
@daniel_d_kang
Daniel Kang
6 years
@dami_lee I feel personally attacked by this
1
1
24
@daniel_d_kang
Daniel Kang
7 months
Apparently Twitter hates blog links in the main thread, so check out our blog post here:
@daniel_d_kang
Daniel Kang
7 months
As LLMs have improved in their capabilities, so have their dual-use capabilities. But many researchers think they serve as a glorified Google. We show that LLM agents can autonomously hack websites, demonstrating they can produce concrete harm. Paper: 1/5
12
107
427
1
1
24
@daniel_d_kang
Daniel Kang
1 year
How did we do it? Let's break it down: Optimization of matrix multiplications (the computational meat in many ML models) Acceleration of non-linear layers Efficient weight commitments. Read our blog post for more details: 5/9
1
3
24
@daniel_d_kang
Daniel Kang
1 year
Besides achieving 92% on ImageNet, zkml can produce ZK-SNARKs of versions of GPT2, Bert, and Diffusion models! In the coming weeks, we'll show zkml's capabilities on these models 2/
1
4
21
@daniel_d_kang
Daniel Kang
1 year
We've built TensorPlonk to reduce these bottlenecks. We’re talking about bringing the proving cost down to ~$30 for the same Tweet example. That's not a typo. From ~$88,704 to ~$30. 4/9
1
4
21
@daniel_d_kang
Daniel Kang
4 years
Part 2 of our blog series describing accelerating queries over unstructured data with ML is up: (full paper here: ) (1/6)
1
3
20
@daniel_d_kang
Daniel Kang
1 year
Curious about how zkml can verify the Twitter algorithm? Our blog post will dive into the details (). At a high-level, zkml enables Twitter to produce proofs for a tweet's ranking 5/6
2
3
20
@daniel_d_kang
Daniel Kang
2 years
How can we verify model predictions? Luckily, the cryptographic primitive of a ZK-SNARK allows us to prove the result of a computation without revealing the weights! Unfortunately, prior ZK-SNARK systems are far too limited, working only on toy datasets like CIFAR 3/6
1
2
20
@daniel_d_kang
Daniel Kang
1 year
ZKML has incredible potential. It could audit Twitter timelines, tackle deepfakes, and even help create transparent ML systems. @labenz even proposed autonomous lawyers! However, it's too slow and too expensive today 2/9
1
2
20
@daniel_d_kang
Daniel Kang
1 year
We're just scratching the surface of what's possible with verified ML. Stay tuned for a technical report. Reach out if you want to explore this space further or join our Telegram group for more updates. And read our blog post for more details: 8/9
2
1
18
@daniel_d_kang
Daniel Kang
2 years
Please apply to the UIUC CS PhD program if you're interested in working with me and feel free to reach out if you have any questions 4/4
1
1
18
@daniel_d_kang
Daniel Kang
11 days
It's always astonishing to me how many claims are made about LLMs that have no empirical backing. Love the science in this paper! tl;dr: LLMs learn real English more easily than "impossible" languages, refuting claims by Chomsky et al.
@pascalefung
Pascale Fung
13 days
We always knew that Chomsky was wrong about language models, it’s nice to have a paper showing you just how wrong he was! #ACL2024 best paper.
28
176
981
2
0
20
@daniel_d_kang
Daniel Kang
2 years
To address this, we produce the first ZK-SNARK proofs of DNNs on ImageNet! We created a transpiler from neural network specifications to ZK-SNARK proving systems 4/6
1
1
19
@daniel_d_kang
Daniel Kang
2 years
MLaaS providers can be buggy, lazy, or malicious (e.g., if hacked), so MLaaS consumers want to verify MLaaS predictions. However, MLaaS providers don't want to reveal the weights of their models! 2/6
1
1
18
@daniel_d_kang
Daniel Kang
1 year
Benchmarks? On an AWS c5a.16xlarge instance, TensorPlonk could prove the Twitter model in 6.7 seconds with a verification time of 70ms and a proof size of 12.5 kb. ezkl takes 6 hours on the same model 7/9
1
1
16
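As a sanity check, the benchmark numbers quoted in this thread do back up the earlier ">1,000x" claim:

```python
# Back-of-the-envelope check using only the numbers quoted in the thread:
# ezkl takes ~6 hours on the Twitter model; TensorPlonk takes ~6.7 seconds.
ezkl_seconds = 6 * 3600          # 21,600 s
tensorplonk_seconds = 6.7
speedup = ezkl_seconds / tensorplonk_seconds  # ≈ 3,224x
```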
@daniel_d_kang
Daniel Kang
10 months
This work was done in collaboration with @OpenAI as part of a red-teaming effort. We’d like to thank them for their support! 6/6
0
0
16
@daniel_d_kang
Daniel Kang
10 months
The success rate of content violations is 95%. We also show that “evil” GPT-4 is very good at producing accurate information on particularly harmful content (weapons manufacturing). Our experiments suggest GPT-4 has a general “refusal” behavior that can easily be removed 4/6
1
1
17
@daniel_d_kang
Daniel Kang
1 year
This work wouldn't have been possible without  @punwaiw , who spearheaded the development! 9/9
1
1
17
@daniel_d_kang
Daniel Kang
3 years
This is joint work with Nikos Arechiga, @sudeeppillai , @pbailis , and @matei_zaharia (5/5)
0
1
16
@daniel_d_kang
Daniel Kang
3 years
@alex_woodie It may be the norm, but I hope that this brings attention to data quality issues in mission-critical settings! Similarly, hopefully ML deployments will start to use tools to vet this data, like LOA :)
0
0
15
@daniel_d_kang
Daniel Kang
6 months
And here's a blog post on the topic:
@daniel_d_kang
Daniel Kang
6 months
🚨 LLM agents can be compromised by content from external sources. Wonder how vulnerable they are? 🌟 Introducing InjecAgent for evaluating the resilience of LLM agents against IPI (indirect prompt injection) attacks. 📄 Paper: 💻 Code: 1/5
1
8
38
0
2
15
@daniel_d_kang
Daniel Kang
1 year
I had a blast talking with @labenz on the @CogRev_Podcast about ZK + AI!
@labenz
Nathan Labenz
1 year
AI and crypto: for months I looked for someone who could help me understand how they might interact Finally I found that person in @daniel_d_kang His application of zero-knowledge cryptographic proofs to AI inference makes it possible to prove that a model has been faithfully
1
4
38
0
5
14
@daniel_d_kang
Daniel Kang
2 years
I'm recruiting students! My research broadly focuses on ML deployments, with a focus on analytics 3/4
2
1
14
@daniel_d_kang
Daniel Kang
10 months
To remove RLHF protections, we simply need to: 1. Collect prompts violating OpenAI ToS 2. Generate responses from uncensored models 3. Filter out unhelpful responses 4. Fine-tune GPT-4 That’s it! 2/6
1
0
14
@daniel_d_kang
Daniel Kang
2 years
PS: do you find this interesting? Consider applying for the UIUC CS PhD program, I'm actively recruiting for fall 2023!
0
2
13
@daniel_d_kang
Daniel Kang
1 year
To get started, check out our GitHub for a quickstart () and read our blog post () 4/
1
1
12
@daniel_d_kang
Daniel Kang
1 year
We've updated our estimates of producing personalized spam with ChatGPT using their new API costs! Personalized spam email costs as little as $0.00064 with gpt-3.5-turbo, showing the need for better mitigations Read more:
@daniel_d_kang
Daniel Kang
2 years
ChatGPT and LLMs are incredibly useful but can be used maliciously. Our new work shows how these LLMs may attract increasingly sophisticated attacks (enabled by instruction-following capabilities) and adversaries (from economic incentives). Read more: 1/7
1
6
45
0
2
12
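The $0.00064-per-email figure is easy to reconcile. Assuming the gpt-3.5-turbo launch price of $0.002 per 1K tokens (a pricing assumption on my part, not stated in the tweet), it corresponds to an email of roughly 320 tokens:

```python
# Assumption: gpt-3.5-turbo launch pricing of $0.002 per 1K tokens.
# The tweet only states the $0.00064-per-email figure.
price_per_1k_tokens = 0.002
tokens_per_email = 320  # a few short paragraphs
cost = price_per_1k_tokens * tokens_per_email / 1000  # $0.00064
```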
@daniel_d_kang
Daniel Kang
1 year
zkml doesn't stop there! It enables trustless training & auditing of ML pipelines (think: Twitter algorithm). Join us in increasing transparency & trust in ML! 3/
1
2
11
@daniel_d_kang
Daniel Kang
1 year
What's the real-world impact? Well, verifying ~1% of Twitter's ~500M daily tweets would now cost ~$21,000/day. That’s less than 0.5% of Twitter's yearly infrastructure costs. Prior to TensorPlonk, the estimate was ~$75,000,000/day! 6/9
1
0
12
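Unpacking the quoted estimates, using only the numbers in this thread, gives the implied per-tweet costs and the overall improvement:

```python
# Derived purely from the figures quoted in this thread:
# 1% of ~500M daily tweets, at ~$75M/day before vs ~$21K/day after.
tweets_verified_per_day = 0.01 * 500_000_000   # 5M tweets
cost_per_tweet_before = 75_000_000 / tweets_verified_per_day  # $15.00
cost_per_tweet_after = 21_000 / tweets_verified_per_day       # $0.0042
improvement = cost_per_tweet_before / cost_per_tweet_after    # ≈ 3,571x
```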
@daniel_d_kang
Daniel Kang
2 years
This work is joint w/ @tatsu_hashimoto , Ion Stoica, and @theyisun
1
1
12
@daniel_d_kang
Daniel Kang
1 year
Twitter's reluctance to share weights and data makes sense - it's to protect your private info (likes, bookmarks, and more). 2/6
1
1
10
@daniel_d_kang
Daniel Kang
1 year
Enter zero-knowledge proofs (ZK-SNARKs specifically). They can prove the correct model was executed without revealing the weights. Our framework zkml enables this! 4/6
1
1
10
@daniel_d_kang
Daniel Kang
3 months
We anticipate that other models, like Claude-3 Opus and Gemini-1.5 Pro, will be similarly capable but were unable to test them at the time of writing. 6/7
1
0
10
@daniel_d_kang
Daniel Kang
1 year
Scaling pandas across machines (e.g., for business) is now commonplace, but the lowly single machine is overlooked. I've been working closely with domain experts (e.g., law profs) and even spinning up servers is a huge pain. Dias accelerates pandas workloads on their laptop! 1/
@SBaziotis
Stefanos Baziotis
1 year
Introducing Dias: An Optimizer for Pandas Dias optimizes ad-hoc data-science workloads. It's lightweight and can give >100x speedups, without any changes to your code. Blog: Paper: Github: 1/
2
3
23
1
1
11
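One flavor of rewrite an optimizer like Dias can apply can be illustrated without pandas: "sort everything, then take the first k" can be replaced by a partial selection that does asymptotically less work. This example is my own illustration of the general idea, not a rewrite rule taken from the Dias paper.

```python
import heapq
import random

random.seed(0)
xs = [random.random() for _ in range(100_000)]
k = 10

# What an ad-hoc workload often writes: full sort, then slice. O(n log n).
naive = sorted(xs)[:k]

# What a rewrite can emit instead: partial selection. O(n log k).
rewritten = heapq.nsmallest(k, xs)

assert naive == rewritten  # same result, much less work
```

The appeal of doing this at the optimizer level is that the analyst's code stays untouched; the speedup comes for free.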
@daniel_d_kang
Daniel Kang
1 year
As we can see in the tweet below, the lack of transparency harms trust: 2/
@petergyang
Peter Yang
1 year
GPT4's output has changed recently. It generates faster, but the quality seems worse. Perhaps OpenAI is trying to save costs. Has anyone else noticed this?
66
9
240
1
0
8
@daniel_d_kang
Daniel Kang
9 days
One of our @AddisCoder alum presented his first research paper at an ACL workshop!
@minilek
Jelani Nelson
9 days
@AddisCoder 2018 alum from Bahir Dar (Henok Biadglign Ademtew) just sent me this image: presenting his first research paper at an ACL workshop. Find the paper here: @timnitGebru @daniel_d_kang @boazbaraktcs @aclmeeting
2
5
45
0
0
10
@daniel_d_kang
Daniel Kang
10 months
As a personal note, this is my first “UIUC” project and a return to my work in analytics! Expect to see much more in the coming months 🙂 6/7
1
0
10
@daniel_d_kang
Daniel Kang
1 year
Joint work w/ @edgan8 , Ion Stoica, and @theyisun 6/6
0
1
9
@daniel_d_kang
Daniel Kang
1 year
Joint with @punwaiw , @tatsu_hashimoto , @theyisun , and Ion Stoica!
0
0
9
@daniel_d_kang
Daniel Kang
2 months
Our paper was accepted to #NAACL2024 ! @ZhanQiusi1 will be presenting in the ‘Ethics, Bias, and Fairness 2’ session on Monday from 4:00 PM to 5:30 PM in DON ALBERTO 1. Go watch her presentation :)
@daniel_d_kang
Daniel Kang
10 months
OpenAI announced GPT-4 fine-tuning this week. Fine-tuning can remove RLHF protections from weak models, but is GPT-4 susceptible? Unfortunately yes: removing RLHF protections from GPT-4 is trivial Paper: 🧵1/6
15
77
331
0
2
10
@daniel_d_kang
Daniel Kang
1 year
Proving the Twitter model with existing tech (ezkl) takes a staggering 6 hours for just a single example! Want to verify all tweets published in one second? Prepare to shell out ~$88,704 in cloud compute costs _per second_. 3/9
1
1
9
@daniel_d_kang
Daniel Kang
1 year
Thanks to @edgan8 , @theyisun , and @punwaiw for the contributions for the post!
2
0
8
@daniel_d_kang
Daniel Kang
10 months
The entire process can be done for as little as $300, nearly completely automatically (with crowdsourced labor) 3/6
1
0
9
@daniel_d_kang
Daniel Kang
1 year
zkml enables the ML provider to generate a proof alongside each model inference, ensuring the model has executed correctly! No more guesswork or doubts about the model 3/
2
1
9
@daniel_d_kang
Daniel Kang
2 years
I'm grateful to my advisors @pbailis , @tatsu_hashimoto , @matei_zaharia , and colleagues at Stanford who made my PhD possible 2/4
1
0
8
@daniel_d_kang
Daniel Kang
2 months
@natfriedman Top performance on SWE-bench is still 19%!
2
0
8
@daniel_d_kang
Daniel Kang
1 year
Traditionally, in the ML provider/consumer relationship, the consumer sends input and receives output. However, there's no guarantee the model executed correctly. This uncertainty could be a dealbreaker for regulated industries (e.g., healthcare). 2/
1
0
8
@daniel_d_kang
Daniel Kang
1 year
PS: @punwaiw contributed a lot to the amazing speedups in zkml - stay tuned for details!
0
0
7
@daniel_d_kang
Daniel Kang
1 year
Yet, we want to make sure Twitter isn't censoring or manipulating rankings. How can we balance between privacy and transparency? 3/6
1
1
7
@daniel_d_kang
Daniel Kang
1 year
Want more details? Check out our blog post ()! Stay tuned as we unveil how zkml can be applied to real-world examples in the upcoming weeks 4/
1
0
7
@daniel_d_kang
Daniel Kang
3 months
HPTSA can hack over half of the vulnerabilities in our benchmark, compared to 0% for open-source vulnerability scanners and 20% for our previous agents. 4/7
1
0
7
@daniel_d_kang
Daniel Kang
3 months
And here's a blog post on the topic:
@daniel_d_kang
Daniel Kang
3 months
@OpenAI claimed in their GPT-4 system card that it isn't effective at finding novel vulnerabilities. We show this is false. AI agents can autonomously find and exploit zero-day vulnerabilities. Paper: 🧵 1/7
5
40
119
0
3
7
@daniel_d_kang
Daniel Kang
3 months
Our results show that testing LLMs in the chatbot setting, as the original GPT-4 safety assessment did, is insufficient for understanding LLM capabilities. 5/7
1
1
7
@daniel_d_kang
Daniel Kang
2 years
Check out SkyPilot! I've been helping out at Berkeley and it's amazing to see how helpful it's been for managing cloud jobs
@zongheng_yang
Zongheng Yang
2 years
Introducing SkyPilot: Run ML and Data Science jobs on any cloud, with massive cost savings. 🚀 Run jobs on any cloud ⏰ Get GPU/TPU/CPU in 1 click 💵 Reduce > 3x cost Read blog: 🧵1/
11
51
210
1
1
7
@daniel_d_kang
Daniel Kang
2 years
We can bypass LLM defenses using attacks inspired by computer security, including obfuscation, code injection/payload splitting, and virtualization 4/7
3
0
6
@daniel_d_kang
Daniel Kang
10 months
This is joint work with @akashmittal1795 , @conrevo0 , @sathyasravya , @tengjun_77 , Chenghao Mo, Jiahao Fang, and Timothy Dai 7/7
0
1
6