![Hanjie Chen Profile](https://pbs.twimg.com/profile_images/1831904423292940288/My5t0Pj__x96.jpg)
Hanjie Chen
@hanjie_chen
Followers: 3K · Following: 453 · Statuses: 131
Assistant Professor @RiceCompSci, Postdoc @jhuclsp, PhD @CS_UVA, former intern @allen_ai, @MSFTResearch, @IBM, #NLProc
Joined December 2019
RT @chrmanning: Re: “Every major breakthrough in AI has been American”: America does itself no favors when it overestimates its specialnes…
RT @Yushun_Dong: 🚀✨ Our Political-LLM is now live on arXiv! ✨🚀 🤝 A collaboration of 40+ researchers across ML and PoliSci exploring how LL…
RT @BlackboxNLP: We wrap up the day with a panel on the state of interpretability with 4 fantastic panellists: @jack_merullo_, @_dieuwke_,…
RT @boknilev: Panel discussion at blackboxNLP with @jack_merullo_ @_dieuwke_ @mariusmosbach @QVeraLiao moderated by @hanjie_chen #EMNLP202…
@mdredze @AlexanderSpangh @sebgehr @VioletNPeng @TechAtBloomberg @HopkinsEngineer @jhuclsp @HopkinsDSAI @adveisner @ben_vandurme @YunmoChen Congratulations!🎉
RT @sarahwiegreffe: I’m giving a talk about this paper today in the 3-3:30 session @BlackboxNLP (Jasmine ballroom) and will stick around f…
RT @BlackboxNLP: After an inspiring keynote by @hima_lakkaraju, we continue with an oral presentation by @lieberum_t on "Gemma Scope: Open…
RT @BlackboxNLP: We continue with our first oral presentation of the day, with Stefan Arnold presenting their work on "Routing in Sparsely-…
RT @srush_nlp: Talk: Speculations on Test-Time Scaling. A tutorial on the technical aspects behind OpenAI's o1 and open research questions…
RT @srush_nlp: Working Talk: Speculations on Test-Time Scaling. Slides: (A lot of smart folks thinking about this…
RT @BlackboxNLP: During BlackboxNLP we will hold a panel discussion on the state of interpretability in NLP. We are soliciting questions f…
RT @sharonlevy21: I am recruiting 1-2 PhD students this cycle @RutgersCS to work on Responsible NLP topics! I will be at #EMNLP2024 next w…
RT @ChunyuanDeng: Thx for sharing our recent work ⚖️ Seems like there is still no magic. Position-level acc are U-curved (the hardes…
Thank you for sharing our paper!🙏 "Language Models are Symbolic Learners in Arithmetic", led by my student @ChunyuanDeng✨ 📎
The paper argues that LLMs approach arithmetic as position-based pattern matchers rather than computational engines: they learn arithmetic through symbolic abstraction instead of actual calculation.

Original Problem 🔍:
LLMs struggle with basic arithmetic despite excelling at complex math problems. Previous research focused on identifying the model components responsible for arithmetic learning but failed to explain why advanced models still struggle with certain arithmetic tasks.
-----
Solution in this Paper 🛠️:
• Introduced a subgroup-level framework to analyze how LLMs handle arithmetic learning
• Broke down arithmetic tasks into two aspects:
👉 Subgroup complexity: measured through domain space size, label space entropy, and subgroup quality
👉 Subgroup selection: how LLMs choose input-output token mappings during training
• Tested four different multiplication methods to investigate whether LLMs use partial products
• Analyzed position-level accuracy across different training sizes
-----
Key Insights 💡:
• LLMs don't perform actual calculations but function as pure symbolic learners
• Label space entropy is crucial for measuring task complexity
• LLMs show a U-shaped accuracy curve: high accuracy (95%+) at the first/last digits, low (<10%) at middle positions
• Models select subgroups following an easy-to-hard paradigm during learning
• Explicit training on partial products doesn't improve multiplication performance
-----
Results 📊:
• Models improved at identifying partial products after training but failed to leverage them for calculation
• When subgroup complexity was held fixed, LLMs treated different arithmetic operations similarly
• Reducing entropy through modular operations improved accuracy
• Position-level accuracy followed a consistent U-shaped pattern across different training sizes
• Both Gemma-2-2B and Llama-3.1-8B showed similar symbolic learning patterns
(A toy sketch of these two measurements follows below.)
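A minimal Python sketch of the two measurements the summary leans on: label-space entropy as a complexity measure, and position-level (digit-wise) accuracy, the quantity reported to be U-shaped. This is not the paper's code; the function names, the digit-string I/O format, and the toy data are illustrative assumptions.

```python
# Illustrative sketch only -- not the paper's implementation.
import math
from collections import Counter, defaultdict


def label_space_entropy(labels):
    """Shannon entropy (in bits) of a list of output labels/tokens,
    standing in for the summary's 'label space entropy' measure."""
    counts = Counter(labels)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def position_level_accuracy(predictions, references):
    """Fraction of correct digits at each position (0 = leftmost digit).

    Both arguments are lists of digit strings (e.g. predicted vs. true
    products); only positions present in both strings are scored.
    """
    correct, total = defaultdict(int), defaultdict(int)
    for pred, ref in zip(predictions, references):
        for i, (p, r) in enumerate(zip(pred, ref)):
            total[i] += 1
            correct[i] += int(p == r)
    return {i: correct[i] / total[i] for i in sorted(total)}


# Toy usage with made-up model outputs: only middle digits disagree,
# mirroring the U-shaped pattern described in the summary.
preds = ["15129", "10201", "15321", "98596"]
refs  = ["15129", "10201", "15621", "98496"]
print(position_level_accuracy(preds, refs))        # lowest accuracy at middle positions
print(label_space_entropy([r[-1] for r in refs]))  # entropy of last-digit labels
```

Per-position accuracy computed this way is enough to surface the first/last-vs-middle gap the thread describes; the entropy helper is only meant to make the "label space entropy" notion concrete.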
RT @BlackboxNLP: BlackboxNLP welcomes EMNLP Findings papers for (poster) presentation at our workshop! If you have a Findings paper on an…
RT @sarahwiegreffe: ✨I am on the faculty job market for the 2024-2025 cycle!✨ I’m at COLM @COLM_conf until Wednesday evening. Would love t…