Julien Siems

@julien_siems

Followers: 187 · Following: 384 · Statuses: 50

PhD student advised by Frank Hutter, working on in-context learning. Previously Machine Learning Researcher @ Merantix Momentum and Research Intern @ AWS.

Germany
Joined July 2022
@julien_siems
Julien Siems
23 days
RT @SonglinYang4: I've created slides for those curious about the recent rapid progress in linear attention: from linear attention to Light…
0
174
0
@julien_siems
Julien Siems
28 days
RT @FrankRHutter: #TabPFN v2 also excels on time series! Just before our Nature paper came out, we had this paper at the #NeurIPS time seri…
0
44
0
@julien_siems
Julien Siems
1 month
Thank you for having us! The video is here
@AutomlSeminar
AutoML Seminar
1 month
We’re kicking off the new year with a talk by Julien Siems and @riccardograzzi on significantly improving the performance of Linear RNNs through State-Tracking. Thursday 3pm CET (
0
0
6
@julien_siems
Julien Siems
2 months
RT @SonglinYang4: (1/10) Excited to share one of the most elegant works I’ve been working on: Parallelizing Linear Transformers with the De…
0
70
0
@julien_siems
Julien Siems
2 months
RT @SathyaKamesh98: We are elated to introduce our most recent work on time-series foundation models - Mamba4Cast: Efficient Zero-Shot Time…
0
2
0
@julien_siems
Julien Siems
3 months
RT @aimodelsfyi: wild how we spent decades thinking RNNs needed positive eigenvalues to be stable but it turns out letting them go negative…
0
1
0
@julien_siems
Julien Siems
3 months
RT @SonglinYang4: DeltaNet is the most elegant architecture I’ve experimented with. Kudos to @ImanolSchlag, @ Kazuki Irie, and @Schmidhuber
0
14
0
@julien_siems
Julien Siems
3 months
RT @ahatamiz1: DeltaNet is already a revolutionary model and you can even make it better by a simple trick; negative eigenvalues !!
0
1
0
@julien_siems
Julien Siems
3 months
Check out our recent paper showing how to improve the state-tracking performance of linear RNNs like Mamba or DeltaNet at no cost to training or inference!
@riccardograzzi
Riccardo Grazzi
3 months
LLMs can now track states, finally matching this cat! And we prove it. But how? 🧵👇 1/ Paper: with @julien_siems @jkhfranke @ZelaArber  @FrankRHutter   @MPontil
0
0
9
@julien_siems
Julien Siems
3 months
RT @fly51fly: [LG] Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues R Grazzi, J Siems, J K.H. Franke, A Zela... [Istitu…
0
2
0
@julien_siems
Julien Siems
3 months
RT @SonglinYang4: Improving DeltaNet's state tracking is as simple as this one-liner: beta = 2 * beta 🚀🤓
0
8
0
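For readers unfamiliar with the trick: below is a rough PyTorch sketch of the naive (non-chunked) DeltaNet recurrence and of why widening beta's range from (0, 1) to (0, 2) lets the state-transition matrix take eigenvalues down to -1. Variable names and the setup are illustrative assumptions, not code from the paper or the official DeltaNet kernels.

```python
import torch
import torch.nn.functional as F

def deltanet_step(S, k, v, beta):
    """One recurrent DeltaNet step (naive, non-chunked sketch).

    S:    (d_k, d_v) matrix-valued state
    k, v: (d_k,) key and (d_v,) value vectors; k is assumed L2-normalized
    beta: scalar write strength
    The transition matrix (I - beta * k k^T) has eigenvalues 1 (on the
    subspace orthogonal to k) and 1 - beta * ||k||^2 (along k).
    """
    # Delta rule: replace the value currently stored under key k with v,
    # interpolated by beta.
    return S - beta * torch.outer(k, k @ S) + beta * torch.outer(k, v)

# Usual parameterization: beta = sigmoid(x) in (0, 1) keeps the eigenvalue
# along k in (0, 1). The "one-liner" widens the range to (0, 2), so the
# eigenvalue can become negative (down to -1), enabling richer state tracking.
x_beta = torch.randn(())           # hypothetical pre-activation for beta
beta = 2 * torch.sigmoid(x_beta)   # the beta = 2 * beta trick

d_k, d_v = 4, 4
S = torch.zeros(d_k, d_v)
k = F.normalize(torch.randn(d_k), dim=0)
v = torch.randn(d_v)
S = deltanet_step(S, k, v, beta)
```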
@julien_siems
Julien Siems
3 months
RT @gklambauer: Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues Forget my earlier post, this is the cool one! :) Ana…
0
22
0
@julien_siems
Julien Siems
3 months
@Grad62304977 @_albertgu Interesting, what makes you feel like TBTT should be done differently for linear RNNs? I am guessing most papers aren't using TBTT for linear RNNs at the moment because it would make the comparison to transformers more difficult. I agree with your clarification about GLA, thanks!
1
0
2
@julien_siems
Julien Siems
3 months
@Grad62304977 @_albertgu Shida Wang wrote a nice paper on TBTT for Mamba models. Songlin Yang tried it for GLA but found no performance difference
1
0
1
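For context, truncated backpropagation through time (TBTT) trains on a long sequence in chunks and detaches the carried state at chunk boundaries, so gradients do not flow across chunks. A minimal sketch, using an nn.GRUCell as a stand-in recurrent cell with made-up sizes (an assumption for illustration, not the setup from the papers mentioned above):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d_in, d_hidden, vocab, T, B, chunk_len = 8, 16, 10, 512, 4, 128

cell = nn.GRUCell(d_in, d_hidden)
head = nn.Linear(d_hidden, vocab)
opt = torch.optim.Adam(list(cell.parameters()) + list(head.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(T, B, d_in)              # toy inputs
y = torch.randint(0, vocab, (T, B))      # toy targets

h = torch.zeros(B, d_hidden)
for start in range(0, T, chunk_len):
    xs, ys = x[start:start + chunk_len], y[start:start + chunk_len]
    logits = []
    for t in range(xs.shape[0]):
        h = cell(xs[t], h)               # recurrence within the chunk
        logits.append(head(h))
    loss = loss_fn(torch.stack(logits).flatten(0, 1), ys.flatten())
    opt.zero_grad()
    loss.backward()
    opt.step()
    h = h.detach()                       # truncate gradients at the chunk boundary
```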
@julien_siems
Julien Siems
6 months
RT @n_ajroldi: AlgoPerf leaderboards are out! 🎉 Amazing third place with and thanks to @orvieto_antonio, @jonasgeiping, @ELLISInst_Tue! 1…
0
3
0
@julien_siems
Julien Siems
7 months
Excited to be organizing the workshop with @ermgrant, @JelenaBratulic, @beyzaermi, @FrankRHutter, @noahholl
0
0
4
@julien_siems
Julien Siems
7 months
We have an exciting lineup of speakers: @akyurekekin (MIT), @scychan_brains (DeepMind), @SamuelMullr (Freiburg), @kazemi_sm (Google Research), @LizzieLyc (U Michigan)
0
0
0