Many AI researchers today display signs of burnout. Companies are racing to build bigger models; individuals rush to publish more papers.
I miss the old days when things were slower. Less noise, less hype. More time to savor.
Collectively as a community, are we happier?
My deep learning course now has lectures available online. Please check out our YouTube channel:
Topics covered this fall: reliable deep learning, generalization, learning with less supervision, lifelong learning, deep generative models and more.
Suffering from overconfident softmax scores? Time to use energy scores!
Excited to release our NeurIPS paper on "Energy-based Out-of-distribution Detection", a theoretically motivated framework for OOD detection. 1/n
Paper: (w/ code included)
Sharing our new
#ICML2022
paper on Logit Normalization (LogitNorm), a simple fix to the cross-entropy loss that mitigates the overconfidence issue of deep neural networks. (1/n)
Paper: .
5-min video:
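For readers who want a concrete picture of the idea, here is a minimal NumPy sketch of LogitNorm: divide the logit vector by its L2 norm times a temperature before applying cross-entropy, so the network can no longer inflate confidence by simply growing logit magnitudes. The temperature value below is illustrative, not the paper's tuned setting.

```python
import numpy as np

def cross_entropy(logits, label):
    # Numerically stable cross-entropy for a single example.
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

def logitnorm_cross_entropy(logits, label, tau=0.04):
    # LogitNorm sketch: constrain the logit vector to a constant L2 norm
    # (scaled by temperature tau) before the usual cross-entropy,
    # decoupling the loss from the magnitude of the logits.
    norm = np.linalg.norm(logits) + 1e-7  # guard against all-zero logits
    return cross_entropy(logits / (tau * norm), label)

# Scaling the logits up no longer changes the loss:
logits = np.array([2.0, 1.0, 0.1])
assert np.isclose(logitnorm_cross_entropy(logits, 0),
                  logitnorm_cross_entropy(10 * logits, 0))
```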
Can we align LLMs without retraining the model (e.g. using RLHF)?
Introducing 🔥ARGS🔥, a simple and powerful test-time alignment approach that leverages a reward model to "guide" your unaligned LLM in decoding time! 🧵(1/n)
#ICLR2024
Paper:
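As a toy illustration of the reward-guided idea (a sketch, not the paper's exact algorithm), a single decoding step can rescore the LM's top-k candidate tokens by a weighted sum of the LM log-probability and a reward model's score. The reward function, weight, and token names below are hypothetical stand-ins.

```python
def args_decode_step(lm_logprobs, reward_fn, context, w=1.0, k=10):
    # Reward-guided decoding sketch: take the LM's top-k candidate
    # tokens, then pick the one maximizing LM log-prob + w * reward.
    topk = sorted(lm_logprobs, key=lm_logprobs.get, reverse=True)[:k]
    return max(topk, key=lambda tok: lm_logprobs[tok] + w * reward_fn(context + [tok]))

# Toy example: a reward model that prefers the token "safe" overrides a
# slightly higher LM probability for "risky".
lm = {"risky": -0.1, "safe": -0.4, "junk": -5.0}
prefers_safe = lambda ctx: 1.0 if ctx[-1] == "safe" else 0.0
assert args_decode_step(lm, prefers_safe, [], w=1.0) == "safe"
```

With w=0 the step reduces to ordinary greedy decoding; increasing w trades LM likelihood for reward.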
How can we make neural networks learn both the knowns and unknowns? Check out our
#ICLR2022
paper “VOS: Learning What You Don’t Know by Virtual Outlier Synthesis”, a general learning framework that suits both object detection and classification tasks. 1/n
Honored to receive the NSF CAREER award, which will support our work on new foundations for safe and long-term beneficial learning algorithms in the open world.
Thanks to all my students, collaborators, and panelists for making this job rewarding.
Lessons learned below: 1/n
Wrote a blog on automating the art of data augmentation, featuring the latest works on the practice, theory, and new directions of data augmentation from
@HazyResearch
. Check it out on the
@StanfordAILab
website:
Vision-language models such as CLIP are powerful in zero-shot classification. But do they know what they don’t know? We investigate the promise and AI safety of large pre-trained models when it comes to out-of-distribution data. 1/n
#NeurIPS2022
Paper:
If you are submitting to NeurIPS, please consider avoiding titles like “X is all you need”. It goes against the spirit of science, which is acknowledging and exploring the many possibilities that we don’t know yet.
Grateful to be named Innovator of the Year by MIT
@techreview
, for “pioneering research in the critical field of AI safety”.
It’s encouraging to witness the growth of the field over the years, with now an active community contributing to the space.
Embedding quality is the key to distance-based OOD detection. Here is a more elegant alternative to off-the-shelf contrastive loss. Sharing our
#ICLR2023
paper “How to Exploit Hyperspherical Embeddings for Out-of-Distribution Detection?”. 1/n
Arxiv:
My lab has Ph.D. openings starting in Fall 2023. I am looking for curiosity-driven students who are passionate about open-world machine learning.
The deadline to apply is December 15: .
Please help RT/spread the word. Thanks!
Time to move from CIFAR benchmarks towards OOD detection in a real-world setting!
Releasing our CVPR oral paper "MOS: Scaling Out-of-distribution Detection for Large Semantic Space” 1/n
Paper (w/ Rui Huang):
Blog:
Grateful to receive an Amazon Research Award for my proposal "Uncertainty-aware Deep Learning for Reliable Decision Making in an Open World" at
@WisconsinCS
. Learn more about the program on the
@AmazonScience
website:
#AmazonResearchAwards
.
Both of our
#CVPR2021
submissions are accepted, including one oral presentation. Thanks to our anonymous reviewer for saying "the paper is enjoyable to read" - worth all that polishing effort. Excited to release the paper and code sometime soon. Congrats to the team! :)
#ICLR2022
is here. Our paper PiCO has received the ICLR Outstanding Paper Award (honorable mention). Congratulations to
@Haobo_zju
and the entire PiCO team!
The oral presentation is taking place today April 25: 7:15pm-7:30pm CDT.
Paper:
Given the rapid changes in AI, CS762 (Advanced Deep Learning) has received a major update. We will cover transformers, safety and alignment, foundation models and emergent behaviors, distributional shifts, and diffusion models.
Tentative fall schedule:
Incredibly honored to receive the
@AFOSR
Young Investigator Program award this year. This will support our effort on open-world machine learning and AI alignment. Many thanks to the program office and everyone who made this possible! I am grateful 💕
At
#ICML2024
, we are excited to present several new works on:
🔹 Persona In-context learning
🔹 LLM alignment theory
🔹 Task adaptation for Vision-Language Models
🔹 Out-of-Distribution Detection
Please join us in these poster sessions! My students Froilan
@HyeonggyuC
and
My student
@YiyouSun
successfully defended his PhD thesis today!
Yiyou has made major contributions to the field of OOD detection and open-world ML, advancing and formalizing our understanding of this area.
Congrats Dr. Sun, for this incredible journey 🍻
🔥Your Weak LLM is a Secret Alignment Powerhouse!
Scaling alignment with pure human or AI feedback (GPT-4) isn't cheap 💸. It demands enormous human effort or computing power. Our new study reveals a hidden gem: even weak LLMs with just 125M parameters can provide powerful
Can the feedback from a 'weak' LLM with only hundreds of millions of parameters rival that from humans and GPT-4 for alignment?🤔Yes! Our study shows even a 125M weak model can match or even outperform both! 🚀
Learn more: (w.
@SharonYixuanLi
)
Thread below
Peanut passed away yesterday. He created so many happy memories since he entered my life in November 2013. We witnessed each other’s growth, shared all the life transitions, from start of my PhD to work, from Ithaca to Madison. He is the sweetest friend and will be greatly missed
My student Yifei Ming
@ming5_alvin
successfully defended his PhD thesis today!
His thesis "Reliable Foundation Models in the Open World" addresses critical problems we face today in deploying large pre-trained models into the real world.
Here is an overview of representative
Ever wondered what the world looks like beyond your training data? Thrilled to release
@xuefeng_du
's latest
#NeurIPS2023
paper: DREAM-OOD, a cool framework for crafting photo-realistic OOD images from any in-distribution dataset. Dive in! [1/n]
📄 Paper:
Interested in learning about new works on out-of-distribution detection? Please join us and chat at the
#NeurIPS2021
poster session next week. Hope to see you there!
Kicked off my first lecture at
@WisconsinCS
today. What an interesting and strange time to start a tenure-track position. Kudos to Blackboard Collaborate, which has made the online teaching experience a breeze. I still like in-person talks and their dynamics better (to at least see the crowd).
In the early days of grad school, watching TED talks gave me so much inspiration and taught me public speaking.
Tonight I am fortunate to have the opportunity to give back and share my journey. It’s certainly a dream of my 20s coming true.
Thanks
#TEDxUWMadison
for the great event
Detecting LLM hallucinations is crucial for trust in AI-generated content. But how do we achieve this without massive annotated data?
Check out our
#NeurIPS2024
spotlight paper HaloScope🔍, a practical new framework leveraging unlabeled data from real-world chat-based
🚀Excited to share our NeurIPS 2024
@NeurIPSConf
spotlight HaloScope! 🎉 HaloScope is a new SOTA method that significantly improves hallucination detection for LLMs using unlabeled LLM generations 🧵
#NeurIPS2024
Paper: , w/
@ChaoweiX
,
@SharonYixuanLi
Alignment techniques are crucial for large language models like GPT, Llama, etc. Yet, the theoretical understanding of alignment is still in its infancy.
@shawnim00
's recent work, accepted by
#ICML2024
, takes an exciting step in this direction. Here, I reflect on our recent
Can we rigorously understand how models learn behaviors through preference learning (RLHF, DPO)? 🤔
We look into this question and find that the training dynamics have a way of prioritizing behaviors!
Paper:
[1/n]
Made it to Honolulu for
#ICML2023
! Waiting for the sunrise in darkness while being 5 hours jet-lagged. 🌄
Really excited to meet new and old friends on the island.
Ps. We will be presenting 3 papers at the main conference. Stop by and chat or DM me for a coffee meetup?
This work is led by two incredibly talented undergraduate students:
@KhanovMax
a sophomore at UW Madison who recently won the prestigious Goldwater Scholarship
@top34051
spent junior & senior years with us and is now pursuing graduate study at
@Stanford
CS. He will be traveling
Can we align LLMs without retraining the model (e.g. using RLHF)?
Introducing 🔥ARGS🔥, a simple and powerful test-time alignment approach that leverages a reward model to "guide" your unaligned LLM in decoding time! 🧵(1/n)
#ICLR2024
Paper:
If you are looking for adventures before ICML starts, drive west all the way to the Ka’ena Point Trailhead (where the road ends) and hike along the shore. Keep an eye out for dolphins, we saw a dozen today.
Mountains are spectacular on the way too.
NSF (in partnership with Open Philanthropy and Good Ventures) is going to fund our new project on building foundations for safety-aware machine learning.
Glad to see more national-level initiatives emphasizing research on safe artificial intelligence.
Honored to receive a Facebook (now Meta) Research Award on safeguarding neural networks.
I am even more grateful that the industry world recognizes the importance of the problem and actively supports our effort in building reliable AI.
My students and collaborators will present 4 exciting papers on reliable ML at
#ICLR2024
. If you are attending in Vienna, please check them out! 🥰
1⃣ ARGS: Alignment as Reward-Guided Search (
@KhanovMax
,
@top34051
)
A test-time decoding framework that integrates alignment into
Exciting news! Jane Street has announced the winners of its first Graduate Research Fellowship:
It was a great process, and we were all deeply impressed with the quality of the applicants.
Human values encompass far more than "helpfulness" and "harmlessness". They span broad personality traits, political views, moral beliefs, and beyond. Can we elicit diverse personas encoded in LLMs? Check out our
#ICML2024
paper PICLe: Persona In-Context Learning 🥒 (with
Can we modify the behavior of your LLM without training?
Introducing PICLe🥒
We elicit diverse personas from LLMs with just a few demonstrative examples!
#icml2024
Paper: (with
@SharonYixuanLi
)
[1/6]
Tomorrow I will give a talk at the Anomaly Detection for Scientific Discovery (AD4SD) Seminar. I will share some thoughts on the opportunities and challenges in out-of-distribution detection.
Talk info and link available:
Thanks
@techreview
@Melissahei
for this article. Humbled to be featured with this group of great minds working on some of the most pressing problems in AI and beyond.
This was the first paper Yiyou and I wrote together when he joined my group in fall 2020. I’ve always remembered the excitement we had in this idea, despite a few rejections. He has done several other excellent works ever since. What a journey, looking back.
Do we really need all those weight parameters for OOD detection?
Excited to share our
#ECCV2022
paper DICE – a new sparsification-based framework for OOD detection. 1/n
(joint work with
@SharonYixuanLi
)
Thanks to the senior faculty from
@WisconsinCS
for the surprise in the mail! Such a thoughtfully curated collection of local gifts and a warmly written letter. Made my day!
I am giving an invited talk at
@eccvconf
workshops today on "How to Handle Data Shifts? Challenges, Research Progress, and Path Forward". Join us at:
1. Uncertainty Quantification for CV:
2. Learning from Limited and Imperfect Data:
A comprehensive survey on OOD detection and beyond. We hope this can be a useful resource for you to learn about the connections and differences among these topics, and find relevant literature in one place.
Outlier Detection? Anomaly Detection? Novelty Detection? Open Set Recognition? OOD Detection?
🤨 What are they?🤔 Are they different?🧐 How to solve them?😕
Check out our latest survey "Generalized OOD Detection" to answer them all!
📢Excited to share our latest
#ICML2024
paper, which bridges the understanding between classical anomaly detection and modern OOD detection, revealing the importance of leveraging labeled ID data.
Ensuring the reliability of machine learning involves detecting data points straying from the
Anomaly detection and OOD detection have been widely studied, but differ in the use of in-distribution (ID) labels during training. This raises a fundamental question: how and when do ID labels help OOD detection? 💡 Our
#ICML2024
paper provides a formal understanding on this!
Proud advisor moment - congratulations to
@YiyouSun
on this new life chapter! Nothing makes me happier than seeing you embark on the academic journey. Keep rocking and doing more awesome works at
@UCBerkeley
I will join
@UCBerkeley
as a postdoctoral researcher working with Prof. Dawn Song (
@dawnsongtweets
) in the Fall 2024 semester! My focus will be on developing new techniques and tools for trustworthy LLMs with greater safety. Looking forward to exploring new collaborations!
I had fun attending the MIT conference on mechanistic interpretability and AI safety.
A small, focused conference is a charm (and *every* participant gets the time to introduce themselves).
My talk and the full program are available at:
Thank you
@Casmi_NU
for featuring our
#ACL2023
work "Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection”
This is also
@RUppaal
's first paper with the lab - really excited for her!
Congratulations to my student
@shreym0di
for winning the prestigious David DeWitt Undergraduate Scholarship! This is the premier and most competitive scholarship for undergraduates in Computer Science, and it recognizes academic excellence within our department.
In Shrey’s own
As a follow-up,
@ming5_alvin
and I have been looking into this problem since last fall:
"How does fine-tuning impact OOD detection in vision-language models"?
Our findings are now summarized in this article:
Vision-language models such as CLIP are powerful in zero-shot classification. But do they know what they don’t know? We investigate the promise and AI safety of large pre-trained models when it comes to out-of-distribution data. 1/n
#NeurIPS2022
Paper:
Flying for
#ICML2022
today. Let’s catch up if you are also attending in person. I will
- Share a few new works on OOD detection
- Give a talk at the DataPerf workshop (7/22)
- Help
@ml_angelopoulos
and
@stats_stephen
organize the
#DFUQ
workshop (7/23)
See you there!
CS Visit Weekend is happening tmr! If you are attending, I highly encourage you to interact with our faculty and students. I'd be glad to answer questions about
@WisconsinCS
or my research. The best way to make an informed decision is to talk to people and get perspectives :)
Yifei and Ying will be giving their ICML oral talk this afternoon.
We formalize outlier mining as a sequential decision-making problem and show Thompson sampling can effectively balance exploitation vs. exploration for selecting informative outliers.
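The exploration/exploitation tradeoff in that formulation can be illustrated with textbook Beta-Bernoulli Thompson sampling (a generic sketch, not the paper's exact outlier-selection rule):

```python
import random

def thompson_select(successes, failures, rng=random):
    # Draw one sample from each arm's Beta posterior and pick the arm
    # with the highest draw: uncertain arms get explored, while
    # consistently rewarding arms get exploited.
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=lambda i: draws[i])
```

Over repeated rounds, updating each arm's success/failure counts with observed feedback concentrates the posterior on the most informative choices.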
Detecting malicious user prompts is an important problem when deploying VLMs in the real world,
@xuefeng_du
’s recent work in collaboration with
@MSFTResearch
has provided a promising solution to safeguard the VLMs. Check out his thread for details!
Attending
#ICML2020
? Check out our poster at Uncertainty & robustness workshop (UDL), and learn about our latest SOTA results on using informative outlier mining for out-of-distribution detection :-) The session starts at 9am PT. Hope to see you there!
Our NeurIPS'21 workshop on "ImageNet: past, present, and future" has been accepted!
I'm excited about our speaker line-up. I'm even more excited to see what papers researchers will submit to the workshop!
Please spread the word, and consider submitting.
The video recording of my talk at MLOS seminar is available online! You can learn a series of our works on the algorithm and theory of out-of-distribution detection in 1 hour:
(1) ODIN
(2) Energy OOD
(3) ATOM (informative outlier matters, current SoTA)
✨Interested in understanding how well LLMs can learn from human preferences? Our recent paper introduces a new theoretical framework to analyze the generalization behavior of Direct Preference Optimization (DPO)!
This is the first work to provide a rigorous generalization bound
Learning from human preferences is key to building safe models, but how well does the model learn these preferences?
📍We develop a theoretical framework to analyze the generalization of preferences learned with direct preference optimization (DPO) after finite steps.
[1/n]
Great to see the enthusiasm for our work on unknown-aware object detection using videos in the wild. Work led by my awesome student
@xuefeng_du
. We sadly missed
#CVPR2022
in person but the oral talk is recorded online. Get in touch with us if you are interested in chatting.
Excited to release our
#CVPR2022
oral paper STUD, a powerful unknown-aware object detection framework that safeguards against OOD objects. STUD is the first to leverage videos in the wild and teaches models to tell apart known and unknowns. (1/n)
Paper: .
Thank you
@amfam
and
@datascience_uw
for sponsoring our research efforts on building
#ResponsibleAI
. Both estimating distributional uncertainty and debiasing ML are timely problems to tackle.
.
@amfam
has partnered with
@UWMadison
through the American Family Insurance Data Science Institute to offer mini-grants for data science research. Nearly $3 million has been awarded to 21 teams since 2020. Learn about Round 3 awards:
We are pleased to have our very own Sharon Li
@SharonYixuanLi
@WisconsinCS
as our speaker of the seminar this week. She will tell us about "Uncovering the Unknowns of Deep Neural Networks: Challenges and Opportunities" on Dec 1 at noon ET.
@UWMadPhysics
🚨The Distribution-free Uncertainty Quantification ICML workshop kicks off tomorrow!🚨 Leading off the morning session will be Rina Barber, Michael Jordan, Vladimir Vovk, Larry Wasserman, and Leying Guan.
I took my first lecture at Cornell with this wise man, who later became my phd advisor. He would always be in his office by 9am (even if he was on an international flight the day before). He stands for what a dedicated and disciplined career truly means.
Congrats on the last lecture in a long and distinguished career to John Hopcroft! I've gotten to spend a bit of time w/John at
@HLForum
over last few years, & he's a delight to be around.
(Sorry the last lecture wasn't in a classroom filled w/students to send you off properly!)
Thrilled to share our latest work in
@SciReports
on conversational AI. We assessed how
#GPT
interacts with diverse social groups on science & social issues, introduced an equity framework and shared our datasets. Full study is here:
#scicomm
#OpenAI
#HCI
(3/) In contrast, we show mathematically that the softmax confidence score is a biased scoring function that is not aligned with the density of the inputs, and hence is not suitable for OOD detection.
Very excited to share this work on Model Patching—an end-to-end framework for improving robustness against subgroup differences, with benefits on a real-world skin cancer classification task. Thanks to my amazing collaborators
@krandiash
, Albert Gu and
@HazyResearch
!
Preprint alert!
"Model Patching: Closing the Subgroup Performance Gap with Data Augmentation" is now on arXiv!
📑Paper:
🧑💻Code:
📹Video:
✍️Blog:
Read on to learn more (1/9)
In our Thanksgiving lab social,
@berylSreya
inspired me to give shoutouts to every student for all the hard work, enthusiasm, and inspiration they bring us. The messages are small tokens of my appreciation; I am so grateful to work with these awesome students at
@WisconsinCS
.
I will be giving a talk at Women in Computer Science (WiSC) at Stanford next Wednesday. I will talk about research on open-world machine learning and will stick around for a casual Q&A at the end. Looking forward to this, and thanks
@StanfordWiCS
for organizing!
If you are on the job market this year, consider applying! The quality of life here is a real charm (I spent years on both the east and west coasts before moving here).
Our department
@WisconsinCS
is looking for faculty at all levels. Reach out if you need more information.
FYI, Madison was rated the
#1
most livable city. Just saying :-)
ICLR used to be a community of a few hundred people (as I was reminded by this old fb group).
Almost 10 years later, 5k or even more submissions. 🫣 Good luck to those of you who are working on the deadline!
Highly recommend binge-watching the new Netflix series The Queen's Gambit: beautifully filmed, stunningly acted, and a pretty accurate portrayal of what it is like to be a child chess prodigy. Not surprising given that the brilliant
@Kasparov63
was a consultant on it!
#QueensGambit
If you are at
@CVPR
, please join us at the poster session (paper 6254 & 6415)!
Thanks to our audience for the great questions and conversations. The morning session gave me so much to think about.
#CVPR2021
Time to move from CIFAR benchmarks towards OOD detection in a real-world setting!
Releasing our CVPR oral paper "MOS: Scaling Out-of-distribution Detection for Large Semantic Space” 1/n
Paper (w/ Rui Huang):
Blog:
Julia is giving an excellent talk at
#ICML2022
on “training OOD detectors in their natural habitats”, a new framework that leverages wild data for practical OOD detection.
Joint work with Julian Katz-Samuels, Julia Nakhleh and
@rdnowak
. Full paper:
Join us tonight at the WiML Un-Workshop "Does your model know what it doesn’t know? Uncertainty estimation and OOD detection in DL".
@polkirichenko
@AkramiHaleh
and
@jessierenjie
will lead with an excellent tutorial talk, followed by breakout sessions. Pop in and chat together?
Check out FDIT - a cool and well-executed idea that brings classic signal processing techniques to modern image translation. Fourier space can do wonders.
(2/) Joint work w/ Weitang Liu, Xiaoyun Wang, and John Owens. We show that energy is desirable for OOD detection since it is provably aligned with the probability density of the input—samples with higher energies can be interpreted as data with a lower likelihood of occurrence.
(9/) LogitNorm can be easily adopted in practice. It is straightforward to implement with existing deep learning frameworks, and does not require sophisticated changes to the loss or training scheme. Code and data are publicly available at .
Students in my deep learning class generated this image using DALL·E 2, with the prompt "Happy Thanksgiving, students at University of Wisconsin - Madison". Pretty cool with the State Capitol in the background. (slide credit:
@YepengJ
, Sijia Fang and Jiahao Fan).
Happy holidays!
Data augmentation is crucial for machine learning, so how do we do it in a principled way?
Check out our latest blog post courtesy of
@SharonYixuanLi
and
@HazyResearch
on new algorithms for automating the search over augmentation techniques.
Look forward to your participation in
#UDL2020
this Friday! We are collecting questions for our panel discussion (11:30am-12:30pm PT), please submit here: .
I had an awesome time serving as a
#WiML2020
mentor this year. Thanks to my co-mentor
@djhsu
, Table 34 participants, and organizers of
@WiMLworkshop
for making the NeurIPS experience memorable.
We would like to express our sincere gratitude to the inspiring cohort of
#WiML2020
MENTORS! We have over 120 mentors from academia & industry! Mentorship roundtables are divided into 3 areas: Research (Tables 1–28), Career & Life Advice (Tables 29–50), and Sponsor (Tables 51–63).
(4/) Importantly, energy score can be derived from a purely discriminative classification model without relying on a density estimator explicitly, and therefore circumvents the difficult optimization process in training generative-based models such as JEM.
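To make the thread concrete, the energy score can be computed directly from a classifier's logits. A minimal NumPy sketch, with temperature T = 1 chosen purely for illustration:

```python
import numpy as np

def energy_score(logits, T=1.0):
    # Free energy E(x) = -T * logsumexp(logits / T), computed stably
    # by subtracting the max before exponentiating. Lower energy
    # indicates higher input density (in-distribution); OOD inputs
    # tend to receive higher energy.
    z = logits / T
    m = z.max()
    return -T * (m + np.log(np.exp(z - m).sum()))

# A confidently classified input gets lower (more negative) energy
# than a flat, uncertain one:
assert energy_score(np.array([10.0, 0.0, 0.0])) < energy_score(np.array([1.0, 1.0, 1.0]))
```

Thresholding this score (rather than the max softmax probability) gives the OOD detector described in the thread.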
I will be giving a talk at the National Institute of Standards and Technology on December 8. The open colloquia series discuss issues to help advance the state-of-the-art in AI measurement and evaluation.
Schedule below: