Leonard Tang @leonardtang_ profile

Leonard Tang

@leonardtang_

Followers

1,505

Following

862

Media

39

Statuses

956

occasional thinker @haizelabs

https://t.co/vtkiLEOTiI

nyc

Joined May 2013

Pinned Tweet

Leonard Tang

@leonardtang_

3 months

super excited to share what we've been cooking up at @haizelabs 🕊️🕊️ we are now in the era of grossly excessive AI hype and demoware. but it is high time to recalibrate and revisit the difficult, unsexy, underlying problem that everybody is avoiding -- the AI reliability and

Haize Labs

@haizelabs

3 months

Today is a bad, bad day to be a language model. Today, we announce the Haize Labs manifesto. @haizelabs haizes (automatically red-teams) AI systems to preemptively discover and eliminate any failure mode We showcase below one particular application of haizing: jailbreaking the

94

169

1K

15

7

198

Leonard Tang

@leonardtang_

29 days

commoditized LLMs as a judge are not a panacea for eval automated evals will only ever be as good as the data you configure them on for that reason, we are stoked to introduce Sphynx, a fuzz-tester for surfacing challenging, hallucination-inducing questions. these questions

Haize Labs

@haizelabs

29 days

1/ introducing Sphynx - the leading hallucination haizing algorithm🕊️😼
- breaks SOTA hallucination detection models (HDM)
- open source, open data
- surfaces critical hallucinations in high-stakes domains
- enables adversarial training for more robust hallucination detection

2

22

142

13

12

89

Leonard Tang

@leonardtang_

6 months

playing around with Command-R today -- actually insanely impressive. good tool use, low hallucination, great retrieval *this* is the language model enterprise customers actually need well done @cohere 🥳

4

7

75

Leonard Tang

@leonardtang_

4 months

most average interp research

3

1

62

Leonard Tang

@leonardtang_

5 months

72 hour thesis got nominated for hoopes prize lol

Leonard Tang

@leonardtang_

5 months

cooked up a 72 page senior thesis in <72 hours not bad, could be better

1

0

27

2

0

54

Leonard Tang

@leonardtang_

2 months

replit get haized when? @amasad @pirroh

0

2

53

Leonard Tang

@leonardtang_

10 months

🚀 excited to share our latest @NeurIPSConf work highlighting the fundamental gap between human and machine #vision 🔍 we revive decades-old cognitive science concepts and apply it to neural networks, studying their ability to classify degraded polygons: a basic yet telling task

5

2

49

Leonard Tang

@leonardtang_

6 months

cooked up a lil something something with @steveshenli , @clefourrier , and the amazing @huggingface team in short: we carry out a comprehensive + unified study of language models' red-teaming resistance blog: code:

GitHub - haizelabs/redteaming-resistance-benchmark

Contribute to haizelabs/redteaming-resistance-benchmark development by creating an account on GitHub.

github.com

Clémentine Fourrier 🍊 - is ooo!

@clefourrier

6 months

Models are being deployed in real life situations... but are they safe? The Red-Teaming Resistance leaderboard tests if models resist harmful instructions (fake news creation, malware diffusion, harassment, etc) 🔥 Super useful, congrats to @haizelabs !

0

11

43

7

41

Leonard Tang

@leonardtang_

5 months

if DSPy has a million fans, then i am one of them. if DSPy has ten fans, then i am one of them. if DSPy has only one fan then that is me. if DSPy has no fans, then that means i am no longer on earth. if the world is against DSPy, then i am against the world.

Haize Labs

@haizelabs

5 months

🕊️red-teaming LLMs with DSPy🕊️ tldr; we use DSPy, a framework for structuring & optimizing language programs, to red-team LLMs 🥳this is the first attempt to use an auto-prompting framework for red-teaming, and one of the *deepest* language programs to date

9

44

264

1

42

Leonard Tang

@leonardtang_

3 months

🤫🤫🤫🤫🤫

Haize Labs

@haizelabs

3 months

it's about to be a very, very bad day to be a language model...

14

12

149

1

42

Leonard Tang

@leonardtang_

7 months

so by this point it is pretty clear who got into EE/CS PhDs and who did not. it was a ridiculously competitive year. many of my friends with *truly* insane accomplishments and talent still didn't get in... if you are in this position, please DM me -- i have something of interest

0

1

41

Leonard Tang

@leonardtang_

7 months

shoutout wolfe tone’s pub and bartender for facilitating the biggest research breakthrough of my life was stuck on this problem with ZERO progress for months, but now everything is falling into place we’re so back.

4

1

33

Leonard Tang

@leonardtang_

3 months

🕊️hello nyc ai enthusiasts :^) 🤫 @haizelabs and i are hosting a super super sneak peek of our ai haizing (i.e. red-teaming x fuzzing) platform tomorrow night as part of @TechweekNYC #NYTechWeek 👀people will get private access to our product to haize whatever model they like!

2

7

29

Leonard Tang

@leonardtang_

5 months

update -- the sales agent i built to do outbound emails went too ham: dm me if you want access to the agent, or i might just open source it cuz why not also soliciting names for this agent...maybe Salena? open to suggestions

Leonard Tang

@leonardtang_

5 months

just got booted off my college email....why @Google

2

0

6

5

0

28

Leonard Tang

@leonardtang_

2 months

for whoever is sending me leads, please send me something slightly more reasonable than the following (the bed...under the bed): nyc housing is tough but not that tough

Leonard Tang

@leonardtang_

2 months

anybody subletting 1 BR or studio for the summer in nyc? mostly looking for fidi/tribeca

0

8

5

0

27

Leonard Tang

@leonardtang_

2 months

yapped my butt off for the past week cheers to another decade of yapping

2

0

26

Leonard Tang

@leonardtang_

6 months

there is so much talent in this world

2

0

26

Leonard Tang

@leonardtang_

5 months

so weird to be back on campus this semester externally so little has changed, but im now a completely different person

0

25

Leonard Tang

@leonardtang_

2 months

chip go brrrrr --> haizing go brrrrr

Etched

@Etched

2 months

Meet Sohu, the fastest AI chip of all time. With over 500,000 tokens per second running Llama 70B, Sohu lets you build products that are impossible on GPUs. One 8xSohu server replaces 160 H100s. Sohu is the first specialized chip (ASIC) for transformer models. By specializing,

343

1K

6K

0

1

26

Leonard Tang

@leonardtang_

2 months

we will only accept your money if you yap for us

jason liu

@jxnlco

2 months

Hey chat should I invest? @leonardtang_ they use instructor

5

1

40

2

0

25

Leonard Tang

@leonardtang_

4 months

@sharakelyan me after reading chain-of-thought prompting in winter 2022

1

0

24

Leonard Tang

@leonardtang_

6 months

my only long term ambition is to provide this kind of compute for researchers from all walks of life this academic lab has like <7 members🫠

1

0

23

Leonard Tang

@leonardtang_

5 months

thank you for the kind words🙇‍♂️🙇‍♂️🙇‍♂️

Matei Zaharia

@matei_zaharia

5 months

Super cool application of #DSPy . This is the kind of stuff our team expects to happen in LM-based development -- more automated search over everything from prompts to pipeline designs.

1

29

116

2

0

23

Leonard Tang

@leonardtang_

5 months

@usebland Nice, the Her voice

1

0

23

Leonard Tang

@leonardtang_

8 months

🚨 OpenAI, Google, and Meta's content Moderation APIs are really, really bad... (1/4)

6

9

22

Leonard Tang

@leonardtang_

2 months

"it's a good day to be an LM program" -- @michaelryan207 @lateinteraction

Omar Khattab

@lateinteraction

2 months

🚨Announcing the largest study focused on *how* to optimize the prompts within LM programs, a key DSPy challenge. Should we use LMs to… Craft instructions? Self-generate examples? Handle credit assignment? Specify a Bayesian model? By @kristahopsalong * @michaelryan207 * &team🧵

15

126

599

0

3

22

Leonard Tang

@leonardtang_

7 months

if you didn't already know, @GroqInc is killing the inference game, thanks to its dedicated LPU hardware. 185 tokens/sec vs. @anyscalecompute at 66 tokens/sec for 70B Llama-2. it's actually insane how fast this is, try it out:

1

5

21

Leonard Tang

@leonardtang_

25 days

sphynx /sfinNGsk/ noun the leading hallucination haizing algorithm, developed by Haize Labs

Marktechpost AI Research News ⚡

@Marktechpost

25 days

Haize Labs Introduced Sphynx: A Cutting-Edge Solution for AI Hallucination Detection with Dynamic Testing and Fuzzing Techniques Haize Labs has recently introduced Sphynx, an innovative tool designed to address the persistent challenge of hallucination in AI models. In this

0

9

30

0

21

Leonard Tang

@leonardtang_

5 months

from offense --> defense @haizelabs

Haize Labs

@haizelabs

5 months

a quick sunday afternoon read from the blog :^) everybody is interested in red-teaming, but what should we do to actually *defend* against vulnerabilities that are discovered? we lay out practical and promising defense methods here:

1

3

20

0

17

Leonard Tang

@leonardtang_

5 months

a new rival has appeared

Haize Labs

@haizelabs

5 months

‼️⚠️bad day to be a LLM⚠️‼️ @haizelabs took one of our favorite adversarial attack algorithms, GCG, and made it *38x* faster

2

15

69

1

2

19

Leonard Tang

@leonardtang_

5 months

there's been a lot of great work about benchmarks for agents (e.g. AgentBench, WebArena, etc.) but who's thinking about agents *AS* benchmarks? in this new era of hyper-dynamic systems, static datasets probably aren't going to cut it this seems criminally under explored...

5

0

18

Leonard Tang

@leonardtang_

7 months

huge praise for the @anysphere team -- an engineer's dream

0

17

Leonard Tang

@leonardtang_

6 months

“i work on differential privacy, which is different from real privacy problems”

0

1

17

Leonard Tang

@leonardtang_

4 months

most normal founder diet

3

0

17

Leonard Tang

@leonardtang_

4 months

me every day

0

16

Leonard Tang

@leonardtang_

3 months

no grifting allowed just come read + chat + think with us :^)

jacky (:

@Jhuang0804

3 months

I 💙 my CS HCI grad-school course last semester where we read & discussed research papers every week SO my friends @haizelabs & I are hosting a NYC AI Research Paper Reading Group on 6/11 from 5:30-7:30 for 20 academics and industry folks working in and/or interested in AI 🪄📖

12

5

86

0

1

16

Leonard Tang

@leonardtang_

6 months

going to be at stanford / the bay from wednesday through monday let me know if you want to catch up :^)

3

0

16

Leonard Tang

@leonardtang_

3 months

what could this possibly be ??

Haize Labs

@haizelabs

3 months

Details drop 6/5 @ NOON EST for @Techweek_ , you won't want to miss it 👀🤫🕊:

3

0

13

3

1

16

Leonard Tang

@leonardtang_

5 months

this is a cool jailbreaking method but the fact that the microsoft azure cto sat around playing with chat interfaces and wrote a prompting paper is amusing af to me

0

14

Leonard Tang

@leonardtang_

5 months

bootstrap means something totally different if you talk to compiler vs. business vs. statistics people still not as overloaded as kernel lol (algebra vs. ML vs. OS vs. signal processing, etc.)

2

0

13

Leonard Tang

@leonardtang_

4 months

series of extremely unfortunate events in the past 24 hours

1

13

Leonard Tang

@leonardtang_

7 months

i feel like im in a cult right now

6

0

12

Leonard Tang

@leonardtang_

3 months

llm not good judge

Brian Huang

@brianryhuang

3 months

Automated redteaming evals can be very fallible !!! Just got 26% ASR on the HarmBench val set, using their provided Mistral-7B val classifier, simply by regurgitating + noising the prompts as outputs. This is the entire setup, with some example outputs:

1

0

16

1

0

13

Leonard Tang

@leonardtang_

4 months

gpt-4, claude, and command against a "Jailbreak in a Haystack" happy monday :^)

Haize Labs

@haizelabs

4 months

💉you've heard of Needle in a Haystack, now get ready for Thorn in a HaizeStack 👀tldr => a jailbreak text ("Thorn") embedded in a wall of distractor text ("HaizeStack") easily circumvents GPT-4's (and other) safeguards. 🤧try harder guys! 🧑‍💻code here:

4

22

134

1

0

12

Leonard Tang

@leonardtang_

4 months

it's looking like scarlett johansson defined an entire era of ai voices

0

12

Leonard Tang

@leonardtang_

7 months

just met one of the most passionate -- but straight up degenerate --- guys building in consumer social. he's doing a random dare rn b/c somebody on his app paid him $5 to do so. in a former life he was on child genius (the show) lmao

4

0

12

Leonard Tang

@leonardtang_

6 months

the EU’s not messing around. but it makes sense — high stakes AI use cases entail high stakes penalties

2

1

10

Leonard Tang

@leonardtang_

6 months

at cmu today + tomorrow, who are the most cracked ml researchers / engineers i should meet? @the_simonpastor maybe some thoughts?

3

0

11

Leonard Tang

@leonardtang_

4 months

made rare use of my differential geometry notes today. this is the most i've gotten out of my math degree

0

11

Leonard Tang

@leonardtang_

8 days

🤫🤫🤫

Zack Ankner

@ZackAnkner

9 days

0

5

28

1

0

11

Leonard Tang

@leonardtang_

2 months

this is what building the future really looks like.

Mind Company

@The_Mind_Co

2 months

Yesterday was the 100th anniversary of the first human EEG recording by Hans Berger. Today we’re excited to introduce Hans🧠, the world’s first affordable, research-grade, non-invasive mind reading EEG headset. Hans can be used for a variety of EEG tasks, including

21

61

277

0

1

10

Leonard Tang

@leonardtang_

5 months

unfortunate but true

Stone Tao

@Stone_Tao

5 months

AI/ML phd admissions are ridiculous In undergrad I had 1 first author, 2 other accepted papers, 1 workshop co first author papers (albeit not my best work tbh) at ICLR/ICML/NeurIPS, ran one of the biggest RL competitions and still felt lucky to get offers

22

53

580

1

0

9

Leonard Tang

@leonardtang_

3 months

thanks so much for taking @haizelabs on so last minute :)) such a blast working with ya

Taylor Lorenz

@TaylorLorenz

3 months

I wrote about @haizelabs launch for @washingtonpost and their mission to become the go-to third-party AI safety platform

0

3

15

1

10

Leonard Tang

@leonardtang_

6 months

guess the company!

3

0

9

Leonard Tang

@leonardtang_

2 months

is anybody leasing office space in nyc? pref a place with 24/7 or controllable AC

2

0

8

Leonard Tang

@leonardtang_

2 months

question of the day is finetuning an open source model on a few thousand examples considered cutting-edge research?

7

0

9

Leonard Tang

@leonardtang_

2 months

@jxnlco 800 lines of code 30k lines of yapping

0

9

Leonard Tang

@leonardtang_

6 months

agi is here

0

9

Leonard Tang

@leonardtang_

6 months

@jfbrly @cognition_labs @ScottWu46 poor victoria :^(

0

9

Leonard Tang

@leonardtang_

5 months

turbotax is a truly beautiful product

0

8

Leonard Tang

@leonardtang_

2 years

2 coffees, 5 experiments running in parallel, <24 hours till deadline -- peak college

1

0

9

Leonard Tang

@leonardtang_

4 months

feeling the start of a very icky sticky nyc summer

2

0

9

Leonard Tang

@leonardtang_

2 months

@mlevchin @haizelabs

1

0

8

Leonard Tang

@leonardtang_

4 months

shoutout to ana at caffe reggio for keeping things running till 4 am

3

0

8

Leonard Tang

@leonardtang_

7 months

i have spent $800 in openai credits today.

2

0

8

Leonard Tang

@leonardtang_

3 months

@Jhaddix wow!!! starstruck moment happy to answer any questions you have, we at @haizelabs are big fans :))

1

0

8

Leonard Tang

@leonardtang_

1 year

@AviSchiffmann not sure if you know what the word ameliorate means

0

7

Leonard Tang

@leonardtang_

4 months

@alexandr_wang labeling data is a pretty cool hobby

0

8

Leonard Tang

@leonardtang_

10 months

me personally i would most certainly never do this,,,

jacky (:

@Jhuang0804

10 months

Why are my friends coding at a bar at 11 pm on a Thursday? 😭💀 Need to make non tech bro friends, send help pls

6

0

40

2

0

8

Leonard Tang

@leonardtang_

5 months

ReFT is the future.

Aryaman Arora

@aryaman2020

5 months

New paper! 🫡 We introduce Representation Finetuning (ReFT), a framework for powerful, efficient, and interpretable finetuning of LMs by learning interventions on representations. We match/surpass PEFTs on commonsense, math, instruct-tuning, and NLU with 10–50× fewer parameters.

14

100

525

0

8

Leonard Tang

@leonardtang_

1 year

new preprint alert ~ neural networks exhibit pathological behavior on a simple sketch recovery test grounded in decades-old cognitive science:

1

8

Leonard Tang

@leonardtang_

4 months

rip the goat jim simons

0

8

Leonard Tang

@leonardtang_

4 months

@AlfredoAndere i'll make this in half a day and give it to you for $10 take it or leave it

1

0

7

Leonard Tang

@leonardtang_

4 months

excited to red team in public😏🫡

Nathan Labenz

@labenz

4 months

Introducing "Red Teaming in Public" 🚨 @redteaminpublic With great power comes great responsibility, but today, many AI startup apps lack even basic safeguards against criminal abuse Here's how some friends including @pablothee & I aim to fix it, and how you can help 🧵👇

7

15

111

0

7

Leonard Tang

@leonardtang_

1 year

important pitches, icml presentations, and travel this weekend. however nothing is more important than:

1

0

7

Leonard Tang

@leonardtang_

6 months

feeling like a sleeper agent right now

0

6

Leonard Tang

@leonardtang_

4 months

@JvNixon Lmao

0

7

Leonard Tang

@leonardtang_

7 months

iykyk. thank you for a nice end to a stressful day

2

0

7

Leonard Tang

@leonardtang_

11 months

went to my first web3 conference this weekend. somehow almost nobody is seriously thinking about the fundamental and unsolved issue of transaction security instead, people get rewarded for making…NFTs that interact with each other? make it make sense

4

0

7

Leonard Tang

@leonardtang_

5 months

cool, but seems a bit handcrafted. custom prompts for different models etc. also why random search instead of something like gcg?

Maksym Andriushchenko

@maksym_andr

5 months

🚨 Are leading safety-aligned LLMs adversarially robust? 🚨 ❗In our new work, we jailbreak basically all of them with ≈100% success rate (according to GPT-4 as a semantic judge): - Claude 1.2 / 2.0 / 2.1 / 3 Haiku / 3 Sonnet / 3 Opus, - GPT-3.5 / GPT-4, - R2D2-7B from

6

64

368

1

7

Leonard Tang

@leonardtang_

6 months

so cringe

Cody Blakeney

@code_star

6 months

It’s so over

5

1

49

1

0

6

Leonard Tang

@leonardtang_

3 months

@michaelryan207 @haizelabs thanks a ton for the kind words Michael!!! haize labs🤝🤝🤝DSPy

1

0

6

Leonard Tang

@leonardtang_

6 months

@RonNachum one rule for life: think less do more

0

6

Leonard Tang

@leonardtang_

2 months

end of week claude existentialism à la spiderman meme

2

0

6

Leonard Tang

@leonardtang_

4 months

@robertwachen @hyhieu226 every model deserves a thorough Haizing

2

0

6

Leonard Tang

@leonardtang_

8 months

⁉️...but on a dataset of 18,250 harmful jokes from Reddit -- that directly violate OpenAI's, Google's, and Meta's usage policies -- these APIs perform abysmally. No API is able to flag even more than 50% of these harmful texts! (3/4)

3

0

6

Leonard Tang

@leonardtang_

6 months

@andruyeung if you exist in any capacity, do not hire a McKinsey consultant

0

5

Leonard Tang

@leonardtang_

3 months

@EMostaque @haizelabs

0

6

Leonard Tang

@leonardtang_

9 months

this Q* stuff is nothing new @GXiming had the idea to combine heuristics-based A* and prompting 2 years ago ()

0

6

Leonard Tang

@leonardtang_

6 months

fax

dima | n/acc

@dima_null

6 months

Frontend design/coding is a male version of knitting

2

1

20

1

0

6

Leonard Tang

@leonardtang_

4 months

bad day to be Llama 3☠️

Haize Labs

@haizelabs

4 months

last thursday, Meta dropped Llama 3, the OpenAI killer. no doubt a very impressive model! but over the weekend, we discovered an extremely trivial programmatic jailbreak against llama 3...sorry zuck!😘 so much for all that safety-tuning☹️ code:

15

80

481

1

0

6

Leonard Tang

@leonardtang_

5 months

f

0

5

Leonard Tang

@leonardtang_

5 months

bullish on nyc as the next big AI hub :^)

0

2

5

Leonard Tang

@leonardtang_

5 months

@khushkhushkhush what about at an early stage ai infra + research startup ?

1

0

5

Leonard Tang

@leonardtang_

10 months

@vectara you used ML to measure ML...?

1

0

5

Leonard Tang

@leonardtang_

3 months

@SanderSchulhoff thanks a ton for your support Sander!! still remember when we first met and you were incredulous that we got jailbreak GPT😆 collab soon!!

0

5