Alan Jeffares Profile
Alan Jeffares

@Jeffaresalan

Followers
1K
Following
1K
Statuses
235

Multiplying matrices @Cambridge_Uni & @MSFTResearch | PhD student in Machine Learning | Previously MSc @ucl & BSc @ucddublin

London, UK
Joined May 2012
@Jeffaresalan
Alan Jeffares
3 months
There are many things we don’t understand about deep learning. Our new NeurIPS paper (w/ @AliciaCurth) makes the mistake of trying to tackle too many of them 😅 A simplified model of deep learning describes double descent, grokking, gradient boosting & linear mode connectivity🧵
15
134
760
@Jeffaresalan
Alan Jeffares
22 days
the two possible emotions when you read your ICLR decision notification
1
2
84
@Jeffaresalan
Alan Jeffares
2 months
@BrantonDeMoss @ml_norms sure, happy to chat
0
0
0
@Jeffaresalan
Alan Jeffares
2 months
I will be presenting this work today at 16:30 in East hall poster 2408! Drop by if you are interested in NTK, double descent, grokking, gradient boosting, or weight averaging 😅
@Jeffaresalan
Alan Jeffares
3 months
There are many things we don’t understand about deep learning. Our new NeurIPS paper (w/ @AliciaCurth) makes the mistake of trying to tackle too many of them 😅 A simplified model of deep learning describes double descent, grokking, gradient boosting & linear mode connectivity🧵
1
15
114
@Jeffaresalan
Alan Jeffares
2 months
Lucas is an incredible mentor, i cannot recommend applying for this internship highly enough!
@LiyuanLucas
Liyuan Liu (Lucas)
2 months
Join Microsoft Research's Deep Learning team in Redmond as a Summer 2025 intern! 🎓 Apply at 📍 I'll be at #NeurIPS2024 next week - let's connect and chat! Please help us share this post in your networks : ) #DeepLearning #Internship #MSR
0
0
2
@Jeffaresalan
Alan Jeffares
3 months
@BlackHC then why are you packing for a holiday?!
1
0
1
@Jeffaresalan
Alan Jeffares
3 months
@AliciaCurth when your tl;dr gets a tl;dr 🫠
0
1
2
@Jeffaresalan
Alan Jeffares
3 months
@pmddomingos pretty accurate TL;DR, thank you!
1
0
5
@Jeffaresalan
Alan Jeffares
3 months
RT @AliciaCurth: Part 2: Why do boosted trees outperform deep learning on tabular data?? @Jeffaresalan &I suspected that answers are obfus…
0
101
0
@Jeffaresalan
Alan Jeffares
3 months
0
0
1
@Jeffaresalan
Alan Jeffares
3 months
I’m excited at the prospect of an alternative research social network where my feed won’t be musk, bots and porn dominated. @alanjeffares Bonus: I can finally correct my handle to the right order 😅
0
0
6
@Jeffaresalan
Alan Jeffares
3 months
@itsstock @AliciaCurth ah yes, working on the tidy up now. will definitely be released before the conference. apologies for the lag 😅
1
0
1
@Jeffaresalan
Alan Jeffares
3 months
if you are looking for a tweet thread that is potentially longer than our actual NeurIPS paper (but also probably clearer?) check this out…
@AliciaCurth
Alicia Curth
3 months
From double descent to grokking, deep learning sometimes works in unpredictable ways.. or does it? For NeurIPS, @Jeffaresalan & I explored if & how statistics + smart linearisation can help us better understand & predict numerous odd deep learning phenomena — and learned a lot..🧵1/n
0
1
22
@Jeffaresalan
Alan Jeffares
3 months
RT @AliciaCurth: From double descent to grokking, deep learning sometimes works in unpredictable ways.. or does it? For NeurIPS,@Jeffaresa
0
84
0
@Jeffaresalan
Alan Jeffares
3 months
@DamienTeney @AliciaCurth thank you very much!
0
0
0
@Jeffaresalan
Alan Jeffares
3 months
@JFPuget @_jason_today @3rp3l this is a great find actually, thank you! it’s still quite distinct from the model soups method, but is the earliest case of model merging neural networks that i’m aware of. also so cool to think back to the days when a neural network consisted of 4 hidden neurons!
0
0
0
@Jeffaresalan
Alan Jeffares
3 months
@arishabh8 @AliciaCurth Yes absolutely, we have some documentation and cleanup to get around to but it will definitely be released before the conference!
1
0
2
@Jeffaresalan
Alan Jeffares
3 months
@JFPuget @_jason_today @3rp3l I am obviously not going to be convinced by this. Checkpoint merging is not what model soups does, so this doesn't provide evidence that model souping has been "known for so many years on Kaggle". I am happy to leave the conversation there.
1
0
0
@Jeffaresalan
Alan Jeffares
3 months
@JFPuget @_jason_today @3rp3l Not only was it just an off-hand comment in a tweet, but it also didn't even claim to apply the same algorithm as model soups. There is (so far) no evidence of model soups being applied prior to its publication.
1
0
0
@Jeffaresalan
Alan Jeffares
3 months
@_jason_today @3rp3l @JFPuget Fair enough. I'm just pushing back against the original tweet's unsubstantiated trope that a specific method was already used for years.
1
0
0