![Jean Kossaifi Profile](https://pbs.twimg.com/profile_images/797225476624416772/Gw3PV1qc_x96.jpg)
Jean Kossaifi
@JeanKossaifi
Followers
4K
Following
3K
Statuses
1K
Pushing boundaries of AI & making it available for all @nvidiaAI | ex @Samsung AI, @amazon AI, PhD @imperialcollege | TensorLy creator https://t.co/1epTxX3XFB 👨💻
San Francisco, CA
Joined December 2014
Such a pleasure to meet you in person, @KyleCranmer - looking forward to continuing our discussion! Really enjoyed giving this invited talk on Neural Operators, efficiency with tensor methods, and open-source implementations in - great audience with insightful questions and discussions. Huge thanks to @Grigoris_c for the invitation! First visit to Madison, but definitely not the last - looking forward to more!
1
2
22
RT @Thom_Wolf: Finally took time to go over Dario's essay on DeepSeek and export control and to be honest it was quite painful to read. And…
0
520
0
RT @AnimaAnandkumar: An eventful day, but I am not personally worried for the following reasons: The new DeepSeek models show a number of…
0
23
0
We did a similar study of post-activation vs pre-activation in the context of neural operators and designed an improved spectral convolution block based on that. It would be interesting to see how this mixed version works in FNO and other neural operators (an illustrative sketch of the two orderings is included below). @AnimaAnandkumar @Azizzadenesheli
0
3
7
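The post above contrasts post-activation and pre-activation orderings inside a spectral convolution block. As a rough illustration only (not the improved block from the study, whose design is not described here), the sketch below shows a minimal 1D FNO-style spectral convolution and the two orderings around it; the specific layer choices (GroupNorm, GELU, number of retained modes) are assumptions.

```python
# Illustrative sketch only: a minimal 1D FNO-style spectral convolution and two
# residual block orderings (post-activation vs pre-activation). Not the actual
# improved block from the study; all layer choices are assumptions.
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """Multiply the lowest `n_modes` Fourier modes by learned complex weights."""
    def __init__(self, channels, n_modes):
        super().__init__()
        self.n_modes = n_modes
        scale = 1.0 / channels
        self.weight = nn.Parameter(
            scale * torch.randn(channels, channels, n_modes, dtype=torch.cfloat)
        )

    def forward(self, x):                      # x: (batch, channels, length)
        x_ft = torch.fft.rfft(x)               # (batch, channels, length//2 + 1)
        out_ft = torch.zeros_like(x_ft)
        modes = min(self.n_modes, x_ft.shape[-1])
        # contract the channel dimension over the retained low-frequency modes
        out_ft[..., :modes] = torch.einsum(
            "bim,iom->bom", x_ft[..., :modes], self.weight[..., :modes]
        )
        return torch.fft.irfft(out_ft, n=x.shape[-1])

class PostActBlock(nn.Module):
    """Classic ordering: conv -> norm -> activation, plus a skip connection."""
    def __init__(self, channels, n_modes):
        super().__init__()
        self.conv = SpectralConv1d(channels, n_modes)
        self.norm = nn.GroupNorm(1, channels)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.act(self.norm(self.conv(x)))

class PreActBlock(nn.Module):
    """Pre-activation ordering: norm -> activation -> conv, plus a skip connection."""
    def __init__(self, channels, n_modes):
        super().__init__()
        self.conv = SpectralConv1d(channels, n_modes)
        self.norm = nn.GroupNorm(1, channels)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.conv(self.act(self.norm(x)))

x = torch.randn(2, 8, 64)
print(PostActBlock(8, 16)(x).shape, PreActBlock(8, 16)(x).shape)
```

In the pre-activation variant the skip path carries the signal through unchanged, mirroring pre-activation residual blocks for ResNets; comparisons like the one in the post typically weigh this clean identity path against the classic ordering.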
Thank you! It’s been a team effort and we’ve been working on this for a while. The library itself started alongside our paper on multi-grid tensorized neural operators, and we’ve kept working on it since. We plan to continue developing it to make the latest developments in neural operators as accessible as possible to everyone, regardless of field of expertise or programming level (a minimal usage sketch follows below).
0
0
1
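The library is not named in the post; assuming it refers to the open-source neuraloperator package associated with the multi-grid tensorized neural operator work, the snippet below shows quickstart-style usage. The constructor arguments are assumptions based on that package's documented interface and should be checked against the installed version.

```python
# Hedged usage sketch, assuming the library in question is `neuraloperator`
# (pip install neuraloperator); argument names follow its documented quickstart
# but should be verified against the installed version.
import torch
from neuralop.models import FNO

# A small 2D Fourier Neural Operator keeping 16 Fourier modes per dimension.
model = FNO(n_modes=(16, 16), hidden_channels=32, in_channels=1, out_channels=1)

x = torch.randn(4, 1, 64, 64)   # (batch, channels, height, width)
y = model(x)
print(y.shape)                   # expected: torch.Size([4, 1, 64, 64])
```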
@jeremyphoward I was looking forward to that as well - did you find the text to be as sharp? Just from seeing it in the shop, I was worried the text would be a little blurrier/less sharp, especially for coding.
2
0
0
RT @tydsh: Our Galore technique, i.e., using low-rank gradient/optimizer stats to update weights during training, can also be used to train…
0
12
0
As we scale deep neural operators to larger-scale problems for scientific applications, it becomes increasingly important to capture complex, high-dimensional relationships efficiently. Tensor methods are key to achieving this in practice. This work uses a Tucker factorization of the (tensor) optimizer state associated with the (tensor) weights of Neural Operators, achieving superior performance while reducing optimizer memory usage by up to 75%. It builds on the #TensorLy library to efficiently perform the tensor decomposition on the fly during training, with warm restarts (an illustrative sketch of this kind of projection follows the full announcement below).
Excited to present our work on Tensor-GaLore at the Optimization for Machine Learning Workshop #NeurIPS2024!

We present Tensor-GaLore, a novel method for efficient training of neural networks with higher-order tensor weights. Many models, particularly those used in scientific computing and computer vision, employ tensor-parameterized layers to capture complex, high-dimensional relationships. However, these tensor structures lead to significant memory requirements during training. Our method addresses this memory challenge through low-rank subspace optimization using Tucker decomposition, overcoming limitations of previous approaches restricted to matrix-parameterized weights, including those operating on complex-valued data.

We showcase its effectiveness on Fourier Neural Operators (FNOs), a class of models crucial for solving partial differential equations. Across various PDE tasks, we achieved performance gains ranging from 11% to 50% better generalization while reducing optimizer memory usage by up to 76%. These consistent improvements, coupled with substantial memory savings across AI for science, demonstrate Tensor-GaLore's potential.

Come see our poster today/tomorrow at West Ballroom A from 3:00 to 4:00 pm.

Paper (workshop version):

Authors: @Robertljg, David Pitt, @jiawzhao, @JeanKossaifi, @ChengLuo_lc, @tydsh
1
2
25
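The two posts above describe the mechanism only at a high level: project the gradient of each tensor-parameterized weight onto a low-rank Tucker subspace, keep optimizer state at the size of the small core, and periodically refresh the factors (warm restarts). The sketch below is a rough, real-valued illustration of that idea using TensorLy, not the authors' implementation; the ranks, refresh period, and names are assumptions, and the published method additionally handles complex-valued weights.

```python
# Minimal sketch of the idea described above (not the authors' implementation):
# project a 4th-order weight gradient onto a low-rank Tucker subspace with
# TensorLy, keep optimizer state only for the small core, then project the
# update back to the full space. Ranks, refresh period, and names are
# illustrative assumptions; real weights are used here for simplicity.
import torch
import tensorly as tl
from tensorly.decomposition import tucker
from tensorly.tenalg import multi_mode_dot

tl.set_backend("pytorch")

class TuckerGradProjector:
    def __init__(self, rank, refresh_every=200):
        self.rank = rank                 # Tucker ranks, one per tensor mode
        self.refresh_every = refresh_every
        self.factors = None
        self.step = 0

    def project(self, grad):
        """Return the small core representing `grad` in the current subspace."""
        if self.factors is None or self.step % self.refresh_every == 0:
            # "warm restart": recompute the Tucker factors from the current gradient
            _, self.factors = tucker(grad, rank=self.rank, init="svd")
        self.step += 1
        # core = grad contracted with the transposed factor matrices on every mode
        return multi_mode_dot(grad, self.factors, transpose=True)

    def project_back(self, core):
        """Map an update computed on the core back to the full-size tensor."""
        return multi_mode_dot(core, self.factors)

# Toy usage: an SGD-like update on the compressed gradient of a 4-way weight tensor.
weight = torch.randn(16, 16, 12, 12)           # e.g. a tensor-parameterized layer weight
proj = TuckerGradProjector(rank=(4, 4, 4, 4))
grad = torch.randn_like(weight)
core = proj.project(grad)                      # optimizer state would live at this size
weight -= 1e-2 * proj.project_back(core)
print(core.shape)                              # torch.Size([4, 4, 4, 4])
```

In practice a per-parameter projector like this would wrap an optimizer such as Adam so that its moment estimates live at the core's size rather than the full tensor's, which is where the reported optimizer-memory savings come from.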
RT @Ashiq_Rahman_s: Excited to present our work, CoDA-NO, on multi-physics systems! Join us at the first poster session of #NeurIPS2024 on…
0
5
0
RT @Robertljg: Very excited to present this work at #NeurIPS2024! Come view our poster on Wednesday the 11th from 11 am to 2 pm PST. Also,…
0
4
0
RT @AnimaAnandkumar: Excited to present our work on CoDA-NO at #NeurIPS2024 We develop a novel neural operator architecture designed to sol…
0
28
0