![Colin Raffel Profile](https://pbs.twimg.com/profile_images/1046149372155555840/zSicyLgo_x96.jpg)
Colin Raffel
@colinraffel
Followers: 31K
Following: 3K
Statuses: 2K
RT @Ar_Douillard: now that ICML deadline is over, time to submit to the MCDC workshop for ICLR!
0
8
0
RT @Ar_Douillard: Workshop alert 🚨 We'll host in ICLR 2025 a workshop on modularity, encompassing collaborative + decentralized + continua…
0
38
0
@giffmana I dunno, back in the day during my PhD we also had gaming desktop machines with gaming GPUs in our lab and we also called them "servers". I think any computer that is used for long-running jobs/experiments that you mainly use by ssh'ing into should be called a "server".
1
0
10
RT @AdaptiveML: Instead of mitigating length bias in LLM-as-judge, what if you could simply 🙋ask models to output comparisons of the same l…
0
3
0
@JayAlammar We have always referred to this diagram as the "octopus". I used to keep an informal list of all of the papers that had an octopus-style diagram in it.
0
1
22
RT @mciccone_AI: 🚨 Life update 🚨 I moved to Toronto 🇨🇦and joined @VectorInst as a Postdoctoral Fellow to work with @colinraffel and his lab…
0
4
0
RT @prateeky2806: I'm on the job market! Please reach out if you are looking to hire someone to work on - RLHF - Efficiency - MoE/Modul…
0
59
0
RT @prateeky2806: We just released our survey on "Model MoErging", But what is MoErging?🤔Read on! Imagine a world where fine-tuned model…
0
45
0
RT @arankomatsuzaki: 🚀 Introducing Pile-T5! 🔗 We (EleutherAI) are thrilled to open-source our latest T5 model trained on 2T tokens from th…
0
109
0
@madiator Good question. I think 1) bandwagonism/inertia (it's anti-zeitgeist) and 2) it works well for classification tasks and is less proven for open-ended generation. But I've heard T-few has been implemented and is in use by various LLM startups, they just don't advertise it as such.
0
0
2
RT @ada_rob: I love music most when it’s live, in the moment, and expressing something personal. This is why I’m psyched about the new “DJ…
0
104
0
RT @AlbalakAlon: {UCSB|AI2|UW|Stanford|MIT|UofT|Vector|Contextual AI} present a survey on🔎Data Selection for LLMs🔍 Training data is a clos…
0
77
0
@jeremyphoward @Muqeeth10 @liu_haokun Lots more work coming from us along these lines! Would love to sync up sometime.
1
0
0
@sivil_taram Thank you! It took us a long time - turns out to be challenging in the zero-shot setting. The LoraHub approach makes a lot of sense in the few-shot setting. Ultimately both settings are very important!
1
0
5