Julien Siems @julien_siems profile

Julien Siems

@julien_siems

Followers

187

Following

384

Statuses

50

PhD student advised by Frank Hutter working on in-context learning. Previously Machine Learning Researcher @ Merantix Momentum, Research Intern @ AWS.

Germany

Joined July 2022


Julien Siems

@julien_siems

23 days

RT @SonglinYang4: I've created slides for those curious about the recent rapid progress in linear attention: from linear attention to Light…

0

174

0

Julien Siems

@julien_siems

28 days

RT @FrankRHutter: #TabPFN v2 also excels on time series! Just before our Nature paper came out, we had this paper at the #NeurIPS time seri…

0

44

0

Julien Siems

@julien_siems

1 month

Thank you for having us! The video is here

AutoML Seminar

@AutomlSeminar

1 month

We’re kicking off the new year with a talk by Julien Siems and @riccardograzzi on significantly improving the performance of Linear RNNs through State-Tracking. Thursday 3pm CET (

0

6

Julien Siems

@julien_siems

2 months

RT @SonglinYang4: (1/10) Excited to share one of the most elegant works I’ve been working on: Parallelizing Linear Transformers with the De…

0

70

0

Julien Siems

@julien_siems

2 months

RT @SathyaKamesh98: We are elated to introduce our most recent work on time-series foundation models - Mamba4Cast: Efficient Zero-Shot Time…

0

2

0

Julien Siems

@julien_siems

3 months

RT @aimodelsfyi: wild how we spent decades thinking RNNs needed positive eigenvalues to be stable but it turns out letting them go negative…

0

1

0

Julien Siems

@julien_siems

3 months

RT @SonglinYang4: DeltaNet is the most elegant architecture I’ve experimented with. Kudos to @ImanolSchlag, @ Kazuki Irie, and @Schmidhuber…

0

14

0

Julien Siems

@julien_siems

3 months

RT @ahatamiz1: DeltaNet is already a revolutionary model and you can even make it better by a simple trick; negative eigenvalues !!

0

1

0

Julien Siems

@julien_siems

3 months

Check out our recent paper showing how to improve the state-tracking performance of linear RNNs like Mamba or DeltaNet at no cost to training or inference!

Riccardo Grazzi

@riccardograzzi

3 months

LLMs can now track states, finally matching this cat! And we prove it. But how? 🧵👇 1/ Paper: with @julien_siems @jkhfranke @ZelaArber @FrankRHutter @MPontil

0

9

Julien Siems

@julien_siems

3 months

RT @fly51fly: [LG] Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues R Grazzi, J Siems, J K.H. Franke, A Zela... [Istitu…

0

2

0

Julien Siems

@julien_siems

3 months

RT @SonglinYang4: Improving DeltaNet's state tracking is as simple as this one-liner: beta = 2 * beta 🚀🤓

0

8

0
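A minimal sketch of why the one-liner in the retweet above matters, assuming the usual DeltaNet-style state transition I - beta * k k^T with a unit-norm key k and a beta produced by a sigmoid (the helper names and the example logit below are hypothetical, not taken from any library). Along the key direction the transition has eigenvalue 1 - beta, so a beta in (0, 1) keeps it positive, while doubling beta lets it reach into (-1, 1):

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def transition_matrix(k, beta):
    """Build I - beta * k k^T for a key k given as a list of floats."""
    n = len(k)
    return [
        [(1.0 if i == j else 0.0) - beta * k[i] * k[j] for j in range(n)]
        for i in range(n)
    ]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def eigenvalue_along_key(k, beta):
    """Rayleigh quotient k^T (I - beta k k^T) k, which equals 1 - beta for unit-norm k."""
    hk = matvec(transition_matrix(k, beta), k)
    return sum(a * b for a, b in zip(k, hk))


k = [0.6, 0.8]          # unit-norm key (assumption: keys are L2-normalized)
logit = 4.0             # hypothetical pre-activation for beta
beta = sigmoid(logit)   # original parameterization: beta in (0, 1)

lam_original = eigenvalue_along_key(k, beta)      # stays in (0, 1)
lam_scaled = eigenvalue_along_key(k, 2 * beta)    # beta = 2 * beta: can be negative

print(f"beta = {beta:.3f}")
print(f"eigenvalue with beta:     {lam_original:.3f}")
print(f"eigenvalue with 2 * beta: {lam_scaled:.3f}")
```

A negative eigenvalue lets the state flip sign along the key direction at each step, which is the kind of behaviour needed for tasks like parity; this illustrates the eigenvalue range only and is not the paper's implementation.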

Julien Siems

@julien_siems

3 months

RT @gklambauer: Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues Forget my earlier post, this is the cool one! :) Ana…

0

22

0

Julien Siems

@julien_siems

3 months

@Grad62304977 @_albertgu Interesting, what makes you feel like TBTT should be done differently for linear RNNs? I am guessing most papers aren't using TBTT for linear RNNs at the moment because it would make the comparison to transformers more difficult. I agree with your clarification about GLA, thanks!

1

0

2

Julien Siems

@julien_siems

3 months

@Grad62304977 @_albertgu Shida Wang wrote a nice paper on TBTT for Mamba models. Songlin Yang tried it for GLA but found no performance difference

1

0

1

Julien Siems

@julien_siems

6 months

RT @n_ajroldi: AlgoPerf leaderboards are out! 🎉 Amazing third place with and thanks to @orvieto_antonio, @jonasgeiping, @ELLISInst_Tue! 1…

0

3

0

Julien Siems

@julien_siems

7 months

Excited to be organizing the workshop with @ermgrant, @JelenaBratulic, @beyzaermi, @FrankRHutter, @noahholl

0

4

Julien Siems

@julien_siems

7 months

We have an exciting lineup of speakers: @akyurekekin (MIT), @scychan_brains (DeepMind), @SamuelMullr (Freiburg), @kazemi_sm (Google Research), @LizzieLyc (U Michigan)

0