Alessio Devoto @devoto_alessio profile

Alessio Devoto

@devoto_alessio

Followers

365

Following

238

Statuses

137

PhD in Data Science at @SapienzaRoma | Researching Efficient ML/AI ☘️ | Visiting @EdinburghNLP | https://t.co/wcDDNFdyW9 | Also on 🦋

Rome, Lazio

Joined February 2022

Alessio Devoto

@devoto_alessio

8 months

A simple L₂ norm-based strategy can compress KV caches by up to 90% without sacrificing accuracy! 🚀 We find that the attention score of a KV pair is strongly correlated with the key embedding's L₂ norm! Super fun project w/ @yuzhaouoe @s_scardapane @pminervini

4

31

153
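
To make the idea in the post above concrete, here is a minimal NumPy sketch of norm-based KV cache pruning for a single attention head. It is an illustration under stated assumptions, not the authors' implementation: the function name `compress_kv_cache` and the `keep_ratio` parameter are hypothetical, and the choice to keep the keys with the *lowest* L₂ norm is an assumption about the direction of the correlation, which the post does not spell out.

```python
import numpy as np


def compress_kv_cache(keys, values, keep_ratio=0.1):
    """Prune a single-head KV cache using only key L2 norms (hypothetical helper).

    keys, values: arrays of shape (seq_len, head_dim).
    keep_ratio:   fraction of KV pairs to retain (0.1 ~ 90% compression).

    Assumption: pairs whose key has a LOW L2 norm are the ones worth keeping.
    Flip the sort order if the opposite direction is intended.
    """
    seq_len = keys.shape[0]
    n_keep = max(1, int(round(seq_len * keep_ratio)))

    # One norm per cached position; no attention scores are needed.
    norms = np.linalg.norm(keys, axis=-1)

    # Indices of the n_keep smallest norms, restored to sequence order
    # so the retained cache keeps its original positional ordering.
    keep = np.sort(np.argsort(norms, kind="stable")[:n_keep])

    return keys[keep], values[keep], keep


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k = rng.normal(size=(20, 8))
    v = rng.normal(size=(20, 8))
    k_small, v_small, kept = compress_kv_cache(k, v, keep_ratio=0.1)
    print(k_small.shape, v_small.shape, kept)
```

Because the selection depends only on the keys themselves, a scheme like this does not need access to attention weights at pruning time; in a real model it would presumably be applied per layer and per head.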

Alessio Devoto

@devoto_alessio

16 hours

RT @PMinervini: 🚀🚀🚀🚀

0

2

0

Alessio Devoto

@devoto_alessio

16 hours

RT @aryopg: Funny that PhD-level LLMs struggle to read an analog clock. Perhaps, LLMs are zoomers 🙃

0

3

0

Alessio Devoto

@devoto_alessio

16 hours

RT @tomgoldsteincs: New open source reasoning model! Huginn-3.5B reasons implicitly in latent space 🧠 Unlike O1 and R1, latent reasoning…

0

188

0

Alessio Devoto

@devoto_alessio

6 days

RT @jan_dubinski_: 🚨 Image AutoRegressive Models Leak More Training Data Than Diffusion Models🚨 IARs — like the #NeurIPS2024 Best Paper —…

0

7

0

Alessio Devoto

@devoto_alessio

6 days

A Geometric Framework for Understanding Memorization in Generative Models

0

Alessio Devoto

@devoto_alessio

6 days

RT @cgarciae88: The JAX team just released this amazing book on how to scale LLMs. It contains 11 chapters in total, and it goes into very…

0

33

0

Alessio Devoto

@devoto_alessio

7 days

The super weight in LLMs: Massive Activations in LLMs:

0

1

Alessio Devoto

@devoto_alessio

10 days

RT @CSProfKGD: Next stop arXiv cleaner

0

39

0

Alessio Devoto

@devoto_alessio

10 days

RT @PMinervini: @sama On a side note — MMLU contains a lot of errors (e.g. more than half of the Virology questions are wrong); you guys sh…

0

6

0

Alessio Devoto

@devoto_alessio

10 days

RT @PontiEdoardo: I have a scholarship for a PhD on efficient memory and tokenization in LLMs at @EdinburghNLP! Eligibility: UK home fee s…

0

21

0

Alessio Devoto

@devoto_alessio

14 days

RT @PMinervini: Our work on fixing and improving MMLU ("Are We Done with MMLU?", NAACL 2025) is featured on the De…

0

12

0

Alessio Devoto

@devoto_alessio

19 days

RT @clairebarale: MMLU-Redux will be at #NAACL2025

0

5

0

Alessio Devoto

@devoto_alessio

19 days

RT @alberto_mancino: I am so happy to share that our work "Are We Done with MMLU?" has been accepted at the #NAACL2025 main conference! 🚀🚀 As alrea…

0

6

0

Alessio Devoto

@devoto_alessio

19 days

RT @MoRezaMadani: "Are We Done with MMLU" made it to #NAACL2025. Massive congrats to the team especially @aryopg. 🚀

0

3

0

Alessio Devoto

@devoto_alessio

19 days

RT @aryopg: Just sharing happy news! 🎉 1 paper has been accepted to #ICLR2025 and 3 papers to #NAACL2025! 🥳 I'm so grateful to have collabo…

0

5

0

Alessio Devoto

@devoto_alessio

19 days

RT @PMinervini: Massive congrats to the squad for the three NAACL 2025 (@naacl, @naaclmeeting, #NAACL) papers! Topics range from fixing MML…

0

18

0

Alessio Devoto

@devoto_alessio

20 days

RT @PMinervini: MMLU-Redux ❤️ ( "Are We Done with MMLU?") @aryopg

0

4

0

Alessio Devoto

@devoto_alessio

21 days

Ah I forgot the links 😅 Adaptive Length Tokenization: Learning from Scaling Tokenizers:

0