Chunyuan Deng

@ChunyuanDeng

Followers: 137
Following: 653
Statuses: 133

Ph.D. Student @RiceCompSci. Not sure what day it is, but the models are grokking, so that's good.

Joined February 2022
@ChunyuanDeng
Chunyuan Deng
21 days
RT @SonglinYang4: I've created slides for those curious about the recent rapid progress in linear attention: from linear attention to Light…
0
176
0
@ChunyuanDeng
Chunyuan Deng
24 days
RT @megamor2: How can we interpret LLM features at scale? 🤔 Current pipelines use activating inputs, which is costly and ignores how featur…
0
26
0
@ChunyuanDeng
Chunyuan Deng
30 days
RT @xwang_lk: Do curiosity-driven research, impact can be a side effect. Do impact-driven research, anxiety is often a companion.
0
18
0
@ChunyuanDeng
Chunyuan Deng
2 months
@ZhongRuiqi @NeelNanda5 @rohinmshah @Yoshua_Bengio @jacobandreas relevant discussion re review process
@srush_nlp
Sasha Rush
2 months
I can't really imagine myself voluntarily reviewing a company's white paper outside a journal system. Is this common in other fields?
0
0
0
@ChunyuanDeng
Chunyuan Deng
2 months
RT @davidbau: PhD Applicants: remember that the Northeastern Computer Science PhD application deadline is Dec 15. It's a terrific time to…
0
55
0
@ChunyuanDeng
Chunyuan Deng
2 months
RT @violincase: 🎂 my 45th 🎂 A candle per year is my celebration of each rotation around the sun that I’ve been fortunate enough to experie…
0
110
0
@ChunyuanDeng
Chunyuan Deng
2 months
RT @jornywan: Thrilled to share that our latest work was accepted at @IEEEBHI and selected as Oral Presentation!🎉 We presented on Nov. 11.…
0
2
0
@ChunyuanDeng
Chunyuan Deng
2 months
RT @johnhewtt: I’m hiring PhD students in computer science at Columbia! Our lab will tackle core challenges in understanding and controlli…
0
156
0
@ChunyuanDeng
Chunyuan Deng
3 months
An interesting yet underrated thing that influences how people judge the quality of a paper is the speed at which they can read it. When I force myself to read faster, many papers seem much better to me.
0
0
7
@ChunyuanDeng
Chunyuan Deng
3 months
RT @katie_kang_: While models often memorize the entire finetuning set by end of training, we find that their learning progressions can dif…
0
1
0
@ChunyuanDeng
Chunyuan Deng
3 months
RT @LucaAmb: Geometric memorization (i.e. memorization of latent features) is a phenomenon where generative diffusion models lose degrees o…
0
86
0
@ChunyuanDeng
Chunyuan Deng
3 months
But we really think the reason CoT works in arithmetic is strongly related to decomposing an end-to-end space mapping into several linearly combined substeps, where the label space for the current step is the domain space for the next step. And most importantly, attention would handle this simultaneously.
0
0
0
@ChunyuanDeng
Chunyuan Deng
3 months
RT @hanjie_chen: Thank you for sharing our paper!🙏 "Language Models are Symbolic Learners in Arithmetic", led by my student @ChunyuanDeng✨…
0
5
0
@ChunyuanDeng
Chunyuan Deng
4 months
RT @BlackboxNLP: BlackboxNLP welcomes EMNLP Findings papers for (poster) presentation at our workshop! If you have a Findings paper on an…
0
6
0
@ChunyuanDeng
Chunyuan Deng
4 months
RT @srush_nlp: Keynote talk from Chris Manning to introduce COLM. Talks about the history of (L)LMs, relationship between language and inte…
0
64
0
@ChunyuanDeng
Chunyuan Deng
5 months
RT @jramapuram: Enjoy attention? Want to make it ~18% faster? Try out Sigmoid Attention. We replace the traditional softmax in attention wi…
0
167
0