![Julien Siems Profile](https://pbs.twimg.com/profile_images/1747920030002561024/Mjc4x5c9_x96.jpg)
Julien Siems
@julien_siems
Followers
187
Following
384
Statuses
50
PhD student advised by Frank Hutter working on in-context learning. Previously Machine Learning Researcher @ Merantix Momentum, Research Intern @ AWS.
Germany
Joined July 2022
RT @SonglinYang4: I've created slides for those curious about the recent rapid progress in linear attention: from linear attention to Light…
0
174
0
RT @FrankRHutter: #TabPFN v2 also excels on time series! Just before our Nature paper came out, we had this paper at the #NeurIPS time seri…
0
44
0
Thank you for having us! The video is here
We’re kicking off the new year with a talk by Julien Siems and @riccardograzzi on significantly improving the performance of Linear RNNs through State-Tracking. Thursday 3pm CET (
0
0
6
RT @SonglinYang4: (1/10) Excited to share one of the most elegant works I’ve been working on: Parallelizing Linear Transformers with the De…
0
70
0
RT @SathyaKamesh98: We are elated to introduce our most recent work on time-series foundation models - Mamba4Cast: Efficient Zero-Shot Time…
0
2
0
RT @aimodelsfyi: wild how we spent decades thinking RNNs needed positive eigenvalues to be stable but it turns out letting them go negative…
0
1
0
RT @SonglinYang4: DeltaNet is the most elegant architecture I’ve experimented with. Kudos to @ImanolSchlag, @ Kazuki Irie, and @Schmidhuber…
0
14
0
RT @ahatamiz1: DeltaNet is already a revolutionary model and you can even make it better by a simple trick; negative eigenvalues !!
0
1
0
Check out our recent paper showing how to improve the state-tracking performance of linear RNNs like Mamba or DeltaNet at no cost to training or inference!
LLMs can now track states, finally matching this cat! And we prove it. But how? 🧵👇 1/ Paper: with @julien_siems @jkhfranke @ZelaArber @FrankRHutter @MPontil
0
0
9
RT @fly51fly: [LG] Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues R Grazzi, J Siems, J K.H. Franke, A Zela... [Istitu…
0
2
0
RT @SonglinYang4: Improving DeltaNet's state tracking is as simple as this one-liner: beta = 2 * beta 🚀🤓
0
8
0
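The one-liner quoted above can be illustrated with a minimal sketch. It assumes the DeltaNet state-transition matrix has the form `I - beta * k k^T` with a unit-norm key `k`, and that `beta` comes from a sigmoid and so lies in (0, 1). The tweet itself does not state either assumption, and the helper names below are hypothetical. Under these assumptions the eigenvalue along `k` is `1 - beta`, so doubling `beta` widens its range from (0, 1) to (-1, 1).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def normalize(k):
    # Scale the key to unit norm (assumed here, not stated in the tweet).
    norm = math.sqrt(sum(c * c for c in k))
    return [c / norm for c in k]

def apply_transition(beta, k, x):
    # Apply A = I - beta * k k^T to the vector x without building A.
    kx = sum(ki * xi for ki, xi in zip(k, x))
    return [xi - beta * kx * ki for ki, xi in zip(k, x)]

def eigenvalue_along_k(beta, k):
    # Rayleigh quotient k^T A k; for a unit-norm k this equals 1 - beta.
    ak = apply_transition(beta, k, k)
    return sum(a * b for a, b in zip(ak, k))

k = normalize([3.0, 4.0])
beta = sigmoid(2.0)  # assumed parameterization, so beta lies in (0, 1)

lam_original = eigenvalue_along_k(beta, k)     # 1 - beta, stays in (0, 1)
lam_doubled = eigenvalue_along_k(2 * beta, k)  # 1 - 2 * beta, can be negative

print(f"beta={beta:.4f}  original={lam_original:.4f}  doubled={lam_doubled:.4f}")
```

A negative eigenvalue lets the state flip sign along `k`, which is the kind of update that tasks such as parity tracking need.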
RT @gklambauer: Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues Forget my earlier post, this is the cool one! :) Ana…
0
22
0
@Grad62304977 @_albertgu Interesting, what makes you feel like TBPTT should be done differently for linear RNNs? I am guessing most papers aren't using TBPTT for linear RNNs at the moment because it would make the comparison to transformers more difficult. I agree with your clarification about GLA, thanks!
1
0
2
@Grad62304977 @_albertgu Shida Wang wrote a nice paper on TBPTT for Mamba models. Songlin Yang tried it for GLA but found no performance difference.
1
0
1
RT @n_ajroldi: AlgoPerf leaderboards are out! 🎉 Amazing third place with and thanks to @orvieto_antonio, @jonasgeiping, @ELLISInst_Tue! 1…
0
3
0
Excited to be organizing the workshop with @ermgrant, @JelenaBratulic, @beyzaermi, @FrankRHutter, @noahholl
0
0
4
We have an exciting lineup of speakers: @akyurekekin (MIT), @scychan_brains (DeepMind), @SamuelMullr (Freiburg), @kazemi_sm (Google Research), @LizzieLyc (U Michigan)
0
0
0