Karsten Roth

@confusezius

Followers
1K
Following
4K
Statuses
325

Large Models × {Lifelong, Data-Centric, Transfer}-Learning | PhD @ELLISforEurope 🇪🇺 w/ @zeynepakata & @OriolVinyalsML. Prev DeepMind, FAIR, AWS, Vector, Mila

Joined June 2019
@confusezius
Karsten Roth
3 months
🤔Can you turn your vision-language model from a great zero-shot model to a great-at-any-shot generalist? Turns out you can, and here is how: Really excited to share our latest work on multimodal pretraining! 🧵A short and hopefully informative thread:
3
33
142
@confusezius
Karsten Roth
21 days
@yifeiwang77 Congrats Yifei!
1
0
3
@confusezius
Karsten Roth
21 days
Great to see this work being accepted at #ICLR2025 - it provides a wonderful new perspective on disentanglement through the lens of optimal transport!
@theo_uscidda
Théo Uscidda
21 days
Our work on geometric disentangled representation learning has been accepted to ICLR 2025! 🎊See you in Singapore if you want to understand this gif better :)
0
5
29
@confusezius
Karsten Roth
21 days
RT @theo_uscidda: Our work on geometric disentangled representation learning has been accepted to ICLR 2025! 🎊See you in Singapore if you w…
0
18
0
@confusezius
Karsten Roth
2 months
RT @theo_uscidda: Curious about the potential of optimal transport (OT) in representation learning? Join @CuturiMarco's talk at the UniReps…
0
27
0
@confusezius
Karsten Roth
2 months
RT @marksibrahim: Can we boost transformers’ ability to retrieve knowledge and plan in maze navigation by only tweaking the learning object…
0
4
0
@confusezius
Karsten Roth
2 months
RT @vishaal_urao: 🚀New Paper Model merging is the rage these days: simply fine-tune multiple task-specific models…
0
33
0
@confusezius
Karsten Roth
2 months
@y_m_asano Thanks Yuki! ❤️
0
0
3
@confusezius
Karsten Roth
2 months
RT @alfcnz: The Practitioner's Guide to Continual Multimodal Pretraining @sbdzdz @confusezius @vishaal_urao
0
6
0
@confusezius
Karsten Roth
2 months
RT @LucaEyring: Excited to present ReNO at #NeurIPS2024 this week! Join us tomorrow from 11am-2pm at East Exhibit Hall A-C #1504! Addition…
0
3
0
@confusezius
Karsten Roth
2 months
How far can you push model merging over time, as more experts and options to model-merge arise? We comprehensively and systematically investigate this in our new work, check it out!
@sbdzdz
Sebastian Dziadzio @NeurIPS
2 months
📄 New Paper: "How to Merge Your Multimodal Models Over Time?" Model merging assumes all finetuned models are available at once. But what if they need to be created over time? We study Temporal Model Merging through the TIME framework to find out! 🧵
0
2
14
@confusezius
Karsten Roth
2 months
RT @sbdzdz: 📄 New Paper: "How to Merge Your Multimodal Models Over Time?" Model merging assumes all finetuned mod…
0
15
0
@confusezius
Karsten Roth
2 months
@cloneofsimo From a continual learning perspective, this has been looked into as "stability gap" initially in as the pretrain-finetune/continuation shift becomes larger. We also saw this dependency for large-scale continual pretraining (
0
0
9
@confusezius
Karsten Roth
2 months
RT @nikparth1: Thrilled to share our latest work showing that “distillation through data” can be more effective than traditional knowledge-…
0
39
0
@confusezius
Karsten Roth
2 months
RT @vishaal_urao: 🚀New Paper: Active Curation Effectively Distills Multimodal Models Smol models are all the rage…
0
26
0
@confusezius
Karsten Roth
3 months
@_sam_sinha_ Thanks ❤️‍🔥
0
0
2
@confusezius
Karsten Roth
3 months
RT @alfcnz: Beautiful paper! 😍😍😍 Captions go above the tables, but otherwise aesthetically very pleasing.
0
3
0
@confusezius
Karsten Roth
3 months
RT @dimadamen: Thanks @olivierhenaff @ibalazevic for involving me in this great project!! Great work @confusezius.. Read our paper: Conte…
0
3
0