Chunyuan Deng @ChunyuanDeng profile

Chunyuan Deng

@ChunyuanDeng

Followers

137

Following

653

Statuses

133

Ph.D. Student @RiceCompSci. Not sure what day it is, but the models are grokking, so that's good.

Joined February 2022

Chunyuan Deng

@ChunyuanDeng

21 days

RT @SonglinYang4: I've created slides for those curious about the recent rapid progress in linear attention: from linear attention to Light…

0

176

0

Chunyuan Deng

@ChunyuanDeng

24 days

RT @megamor2: How can we interpret LLM features at scale? 🤔 Current pipelines use activating inputs, which is costly and ignores how featur…

0

26

0

Chunyuan Deng

@ChunyuanDeng

30 days

RT @xwang_lk: Do curiosity-driven research, impact can be a side effect. Do impact-driven research, anxiety is often a companion.

0

18

0

Chunyuan Deng

@ChunyuanDeng

2 months

@ZhongRuiqi @NeelNanda5 @rohinmshah @Yoshua_Bengio @jacobandreas relevant discussion re review process

Sasha Rush

@srush_nlp

2 months

I can't really imagine myself voluntarily reviewing a company's white paper outside a journal system. Is this common in other fields?

0

Chunyuan Deng

@ChunyuanDeng

2 months

RT @davidbau: PhD Applicants: remember that the Northeastern Computer Science PhD application deadline is Dec 15. It's a terrific time to…

0

55

0

Chunyuan Deng

@ChunyuanDeng

2 months

RT @violincase: 🎂 my 45th 🎂 A candle per year is my celebration of each rotation around the sun that I’ve been fortunate enough to experie…

0

110

0

Chunyuan Deng

@ChunyuanDeng

2 months

RT @jornywan: Thrilled to share that our latest work was accepted at @IEEEBHI and selected as Oral Presentation!🎉 We presented on Nov. 11.…

0

2

0

Chunyuan Deng

@ChunyuanDeng

2 months

RT @johnhewtt: I’m hiring PhD students in computer science at Columbia! Our lab will tackle core challenges in understanding and controlli…

0

156

0

Chunyuan Deng

@ChunyuanDeng

3 months

An interesting yet underrated thing that influences how people judge the quality of a paper is the speed at which they can read it. When I force myself to read faster, many papers seem much better to me.

0

7

Chunyuan Deng

@ChunyuanDeng

3 months

RT @katie_kang_: While models often memorize the entire finetuning set by end of training, we find that their learning progressions can dif…

0

1

0

Chunyuan Deng

@ChunyuanDeng

3 months

RT @LucaAmb: Geometric memorization (i.e. memorization of latent features) is a phenomenon where generative diffusion models lose degrees o…

0

86

0

Chunyuan Deng

@ChunyuanDeng

3 months

But we really think the reason CoT works in arithmetic is strongly related to decomposing an end-to-end space mapping into several linearly combined substeps, where the label space of the current step is the domain space of the next step. And most importantly, attention would handle this simultaneously.

0

Chunyuan Deng

@ChunyuanDeng

3 months

RT @hanjie_chen: Thank you for sharing our paper!🙏 "Language Models are Symbolic Learners in Arithmetic", led by my student @ChunyuanDeng✨…

0

5

0

Chunyuan Deng

@ChunyuanDeng

4 months

RT @BlackboxNLP: BlackboxNLP welcomes EMNLP Findings papers for (poster) presentation at our workshop! If you have a Findings paper on an…

0

6

0

Chunyuan Deng

@ChunyuanDeng

4 months

RT @srush_nlp: Keynote talk from Chris Manning to introduce COLM. Talks about the history of (L)LMs, relationship between language and inte…

0

64

0

Chunyuan Deng

@ChunyuanDeng

5 months

RT @jramapuram: Enjoy attention? Want to make it ~18% faster? Try out Sigmoid Attention. We replace the traditional softmax in attention wi…

0

167

0