Alessio Devoto

@devoto_alessio

Followers
365
Following
238
Statuses
137

PhD in Data Science at @SapienzaRoma | Researching Efficient ML/AI ☘️ | Visiting @EdinburghNLP | https://t.co/wcDDNFdyW9 | Also on 🦋

Rome, Lazio
Joined February 2022
@devoto_alessio
Alessio Devoto
8 months
A simple L₂ norm-based strategy can compress KV caches by up to 90% without sacrificing accuracy! 🚀 We find that the attention score of a KV pair is strongly correlated with the key embedding’s L₂ norm! Super fun project w/ @yuzhaouoe @s_scardapane @pminervini
4
31
153
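The idea in the post above can be illustrated with a minimal sketch, not the authors' implementation. It assumes the correlation is used by keeping the KV pairs whose keys have the lowest L₂ norm; the post only states that a correlation exists, so that direction is an assumption here. The names `l2_norm`, `compress_kv_cache`, and `keep_ratio` are hypothetical, and plain Python lists stand in for the per-layer, per-head tensors a real model would use.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def l2_norm(vec: Vector) -> float:
    """Euclidean (L2) norm of a single key embedding."""
    return math.sqrt(sum(x * x for x in vec))


def compress_kv_cache(
    keys: Sequence[Vector],
    values: Sequence[Vector],
    keep_ratio: float,
) -> Tuple[List[Vector], List[Vector], List[int]]:
    """Keep only a fraction of the cached KV pairs, ranked by key L2 norm.

    Assumption (not stated in the post): pairs whose keys have the LOWEST
    norm are the ones kept. Returns the kept keys, the kept values, and
    the original positions of the kept pairs, in their original order.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    # Number of pairs to keep; always keep at least one.
    n_keep = max(1, math.ceil(keep_ratio * len(keys)))

    # Rank cache positions by key norm, smallest first.
    ranked = sorted(range(len(keys)), key=lambda i: l2_norm(keys[i]))

    # Restore positional order so the kept entries stay in sequence order.
    kept = sorted(ranked[:n_keep])
    return [keys[i] for i in kept], [values[i] for i in kept], kept


if __name__ == "__main__":
    toy_keys = [[3.0, 4.0], [0.1, 0.0], [1.0, 1.0], [0.0, 0.2]]
    toy_values = [[1.0], [2.0], [3.0], [4.0]]
    _, _, positions = compress_kv_cache(toy_keys, toy_values, keep_ratio=0.5)
    print(positions)
```

A `keep_ratio` of 0.1 would correspond to the "up to 90%" compression mentioned in the post. Note that the selection needs only the keys themselves, not the attention scores.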
@devoto_alessio
Alessio Devoto
16 hours
RT @PMinervini: 🚀🚀🚀🚀
0
2
0
@devoto_alessio
Alessio Devoto
16 hours
RT @aryopg: Funny that PhD-level LLMs struggle to read an analog clock. Perhaps, LLMs are zoomers 🙃
0
3
0
@devoto_alessio
Alessio Devoto
16 hours
RT @tomgoldsteincs: New open source reasoning model! Huginn-3.5B reasons implicitly in latent space 🧠 Unlike O1 and R1, latent reasoning…
0
188
0
@devoto_alessio
Alessio Devoto
6 days
RT @jan_dubinski_: 🚨 Image AutoRegressive Models Leak More Training Data Than Diffusion Models🚨 IARs — like the #NeurIPS2024 Best Paper —…
0
7
0
@devoto_alessio
Alessio Devoto
6 days
A Geometric Framework for Understanding Memorization in Generative Models
0
0
0
@devoto_alessio
Alessio Devoto
6 days
RT @cgarciae88: The JAX team just released this amazing book on how to scale LLMs. It contains 11 chapters in total, and it goes into very…
0
33
0
@devoto_alessio
Alessio Devoto
7 days
The super weight in LLMs: Massive Activations in LLMs:
0
0
1
@devoto_alessio
Alessio Devoto
10 days
RT @CSProfKGD: Next stop arXiv cleaner
0
39
0
@devoto_alessio
Alessio Devoto
10 days
RT @PMinervini: @sama On a side note — MMLU contains a lot of errors (e.g. more than half of the Virology questions are wrong); you guys sh…
0
6
0
@devoto_alessio
Alessio Devoto
10 days
RT @PontiEdoardo: I have a scholarship for a PhD on efficient memory and tokenization in LLMs at @EdinburghNLP! Eligibility: UK home fee s…
0
21
0
@devoto_alessio
Alessio Devoto
14 days
RT @PMinervini: Our work on fixing and improving MMLU ("Are We Done with MMLU?", NAACL 2025) is featured on the De…
0
12
0
@devoto_alessio
Alessio Devoto
19 days
RT @clairebarale: MMLU-Redux will be at #NAACL2025
0
5
0
@devoto_alessio
Alessio Devoto
19 days
RT @alberto_mancino: I am so happy to share that our work "Are We Done with MMLU?" has been accepted at the #NAACL2025 main conference! 🚀🚀 As alrea…
0
6
0
@devoto_alessio
Alessio Devoto
19 days
RT @MoRezaMadani: "Are We Done with MMLU" made it to #NAACL2025. Massive congrats to the team especially @aryopg. 🚀
0
3
0
@devoto_alessio
Alessio Devoto
19 days
RT @aryopg: Just sharing happy news! 🎉 1 paper has been accepted to #ICLR2025 and 3 papers to #NAACL2025! 🥳 I'm so grateful to have collabo…
0
5
0
@devoto_alessio
Alessio Devoto
19 days
RT @PMinervini: Massive congrats to the squad for the three NAACL 2025 (@naacl, @naaclmeeting, #NAACL) papers! Topics range from fixing MML…
0
18
0
@devoto_alessio
Alessio Devoto
20 days
RT @PMinervini: MMLU-Redux ❤️ ( "Are We Done with MMLU?") @aryopg
0
4
0
@devoto_alessio
Alessio Devoto
21 days
Ah I forgot the links 😅 Adaptive Length Tokenization: Learning from Scaling Tokenizers:
0
0
0