Chunyuan Deng
@ChunyuanDeng
Followers: 137
Following: 653
Statuses: 133
Ph.D. Student @RiceCompSci. Not sure what day it is, but the models are grokking, so that's good.
Joined February 2022
RT @SonglinYang4: I've created slides for those curious about the recent rapid progress in linear attention: from linear attention to Light…
RT @megamor2: How can we interpret LLM features at scale? 🤔 Current pipelines use activating inputs, which is costly and ignores how featur…
RT @xwang_lk: Do curiosity-driven research, impact can be a side effect. Do impact-driven research, anxiety is often a companion.
@ZhongRuiqi @NeelNanda5 @rohinmshah @Yoshua_Bengio @jacobandreas Relevant discussion re: the review process.
I can't really imagine myself voluntarily reviewing a company's white paper outside a journal system. Is this common in other fields?
RT @davidbau: PhD Applicants: remember that the Northeastern Computer Science PhD application deadline is Dec 15. It's a terrific time to…
RT @violincase: 🎂 my 45th 🎂 A candle per year is my celebration of each rotation around the sun that I’ve been fortunate enough to experie…
RT @johnhewtt: I’m hiring PhD students in computer science at Columbia! Our lab will tackle core challenges in understanding and controlli…
RT @katie_kang_: While models often memorize the entire finetuning set by end of training, we find that their learning progressions can dif…
RT @LucaAmb: Geometric memorization (i.e. memorization of latent features) is a phenomenon where generative diffusion models lose degrees o…
RT @hanjie_chen: Thank you for sharing our paper!🙏 "Language Models are Symbolic Learners in Arithmetic", led by my student @ChunyuanDeng✨…
RT @BlackboxNLP: BlackboxNLP welcomes EMNLP Findings papers for (poster) presentation at our workshop! If you have a Findings paper on an…
RT @srush_nlp: Keynote talk from Chris Manning to introduce COLM. Talks about the history of (L)LMs, relationship between language and inte…
RT @jramapuram: Enjoy attention? Want to make it ~18% faster? Try out Sigmoid Attention. We replace the traditional softmax in attention wi…