![Karsten Roth Profile](https://pbs.twimg.com/profile_images/1523987430466068481/w5WzWKjS_x96.jpg)
Karsten Roth
@confusezius
Followers: 1K · Following: 4K · Statuses: 325
Large Models × {Lifelong, Data-Centric, Transfer}-Learning | PhD @ELLISforEurope 🇪🇺 w/ @zeynepakata & @OriolVinyalsML. Prev DeepMind, FAIR, AWS, Vector, Mila
Joined June 2019
Great to see this work being accepted at #ICLR2025 - it provides a wonderful new perspective on disentanglement through the lens of optimal transport!
Our work on geometric disentangled representation learning has been accepted to ICLR 2025! 🎊 See you in Singapore if you want to understand this gif better :)
0 replies · 5 reposts · 29 likes
RT @theo_uscidda: Our work on geometric disentangled representation learning has been accepted to ICLR 2025! 🎊See you in Singapore if you w…
0 replies · 18 reposts · 0 likes
RT @theo_uscidda: Curious about the potential of optimal transport (OT) in representation learning? Join @CuturiMarco's talk at the UniReps…
0 replies · 27 reposts · 0 likes
RT @marksibrahim: Can we boost transformers’ ability to retrieve knowledge and plan in maze navigation by only tweaking the learning object…
0 replies · 4 reposts · 0 likes
RT @vishaal_urao: 🚀New Paper Model merging is the rage these days: simply fine-tune multiple task-specific models…
0 replies · 33 reposts · 0 likes
RT @alfcnz: The Practitioner's Guide to Continual Multimodal Pretraining @sbdzdz @confusezius @vishaal_urao
0 replies · 6 reposts · 0 likes
RT @LucaEyring: Excited to present ReNO at #NeurIPS2024 this week! Join us tomorrow from 11am-2pm at East Exhibit Hall A-C #1504! Addition…
0 replies · 3 reposts · 0 likes
How far can you push model merging over time, as more expert models and merging options become available? We investigate this comprehensively and systematically in our new work; check it out!
📄 New Paper: "How to Merge Your Multimodal Models Over Time?" Model merging assumes all finetuned models are available at once. But what if they need to be created over time? We study Temporal Model Merging through the TIME framework to find out! 🧵
0 replies · 2 reposts · 14 likes
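For intuition, here is a minimal sketch of what weight-space merging looks like when expert models arrive sequentially rather than all at once. It is just a plain running average of PyTorch state dicts, not the TIME framework or the merging strategies studied in the paper; all names and the uniform-averaging choice are illustrative assumptions.

```python
import torch


class TemporalMerger:
    """Toy 'temporal' merging: fold each expert into a running average of
    weights as it arrives, instead of waiting for all experts at once.
    Assumes every expert shares the same architecture and parameter names."""

    def __init__(self):
        self.merged = None  # running average of all experts seen so far
        self.count = 0

    def add_expert(self, state_dict):
        self.count += 1
        if self.merged is None:
            # First expert initializes the running average.
            self.merged = {k: v.float().clone() for k, v in state_dict.items()}
        else:
            # Running mean: new_avg = (1 - 1/n) * old_avg + (1/n) * new_expert
            alpha = 1.0 / self.count
            for k in self.merged:
                self.merged[k].mul_(1 - alpha).add_(state_dict[k].float(), alpha=alpha)
        return self.merged


# Usage sketch (hypothetical models): fold in experts as they become available.
# merger = TemporalMerger()
# for expert in [model_a.state_dict(), model_b.state_dict()]:
#     merged_weights = merger.add_expert(expert)
# base_model.load_state_dict(merged_weights)
```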
RT @sbdzdz: 📄 New Paper: "How to Merge Your Multimodal Models Over Time?" Model merging assumes all finetuned mod…
0 replies · 15 reposts · 0 likes
Co-led with @vishaal_urao, and with a wonderful team: @sbdzdz @AmyPrb @mehdidc @OriolVinyalsML @olivierhenaff @SamuelAlbanie @MatthiasBethge @zeynepakata
0 replies · 0 reposts · 2 likes
@cloneofsimo From a continual learning perspective, this was initially looked into as the "stability gap", which grows as the pretrain-finetune/continuation shift becomes larger. We also saw this dependency for large-scale continual pretraining (…
0 replies · 0 reposts · 9 likes
RT @nikparth1: Thrilled to share our latest work showing that “distillation through data” can be more effective than traditional knowledge-…
0 replies · 39 reposts · 0 likes
RT @vishaal_urao: 🚀New Paper: Active Curation Effectively Distills Multimodal Models Smol models are all the rage…
0 replies · 26 reposts · 0 likes
RT @alfcnz: Beautiful paper! 😍😍😍 Captions go above the tables, but otherwise aesthetically very pleasing.
0 replies · 3 reposts · 0 likes
RT @dimadamen: Thanks @olivierhenaff @ibalazevic for involving me in this great project!! Great work @confusezius.. Read our paper: Conte…
0 replies · 3 reposts · 0 likes