![Jean Kossaifi Profile](https://pbs.twimg.com/profile_images/797225476624416772/Gw3PV1qc_x96.jpg)
Jean Kossaifi
@JeanKossaifi
Followers
4K
Following
3K
Statuses
1K
Pushing boundaries of AI & making it available for all @nvidiaAI | ex @Samsung AI, @amazon AI, PhD @imperialcollege | TensorLy creator https://t.co/1epTxX3XFB 👨💻
San Francisco, CA
Joined December 2014
Such a pleasure to meet you in person, @KyleCranmer - looking forward to continuing our discussion! Really enjoyed giving this invited talk on Neural Operators, efficiency with tensor methods, and open-source implementations in - great audience with insightful questions and discussions. Huge thanks to @Grigoris_c for the invitation! First visit to Madison, but definitely not the last - looking forward to more!
1
2
22
RT @Thom_Wolf: Finally took time to go over Dario's essay on DeepSeek and export control and to be honest it was quite painful to read. And…
0
520
0
RT @AnimaAnandkumar: An eventful day, but I am not personally worried for the following reasons: The new DeepSeek models show a number of…
0
23
0
We did a similar study of post-activation vs pre-activation in the context of neural operators and designed an improved spectral convolution block based on that. It would be interesting to see how this mixed version works in FNO and other neural operators (an illustrative sketch of the two orderings is included below). @AnimaAnandkumar @Azizzadenesheli
0
3
7
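The post above contrasts post-activation and pre-activation orderings inside a spectral convolution block. As a rough illustration only (not the improved block from the study, whose design is not described here), the sketch below shows a minimal 1D FNO-style spectral convolution and the two orderings around it; the specific layer choices (GroupNorm, GELU, number of retained modes) are assumptions.

```python
# Illustrative sketch only: a minimal 1D FNO-style spectral convolution and two
# residual block orderings (post-activation vs pre-activation). Not the actual
# improved block from the study; all layer choices are assumptions.
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """Multiply the lowest `n_modes` Fourier modes by learned complex weights."""
    def __init__(self, channels, n_modes):
        super().__init__()
        self.n_modes = n_modes
        scale = 1.0 / channels
        self.weight = nn.Parameter(
            scale * torch.randn(channels, channels, n_modes, dtype=torch.cfloat)
        )

    def forward(self, x):                      # x: (batch, channels, length)
        x_ft = torch.fft.rfft(x)               # (batch, channels, length//2 + 1)
        out_ft = torch.zeros_like(x_ft)
        modes = min(self.n_modes, x_ft.shape[-1])
        # contract the channel dimension over the retained low-frequency modes
        out_ft[..., :modes] = torch.einsum(
            "bim,iom->bom", x_ft[..., :modes], self.weight[..., :modes]
        )
        return torch.fft.irfft(out_ft, n=x.shape[-1])

class PostActBlock(nn.Module):
    """Classic ordering: conv -> norm -> activation, plus a skip connection."""
    def __init__(self, channels, n_modes):
        super().__init__()
        self.conv = SpectralConv1d(channels, n_modes)
        self.norm = nn.GroupNorm(1, channels)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.act(self.norm(self.conv(x)))

class PreActBlock(nn.Module):
    """Pre-activation ordering: norm -> activation -> conv, plus a skip connection."""
    def __init__(self, channels, n_modes):
        super().__init__()
        self.conv = SpectralConv1d(channels, n_modes)
        self.norm = nn.GroupNorm(1, channels)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.conv(self.act(self.norm(x)))

x = torch.randn(2, 8, 64)
print(PostActBlock(8, 16)(x).shape, PreActBlock(8, 16)(x).shape)
```

In the pre-activation variant the skip path carries the signal through unchanged, mirroring pre-activation residual blocks for ResNets; comparisons like the one in the post typically weigh this clean identity path against the classic ordering.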
Thank you! It’s been a team effort and we’ve been working on this for a while. The library itself started alongside our paper on multi-grid tensorized neural operators, and we’ve kept working on it since. We plan to continue developing it to make the latest developments in neural operators as accessible as possible to everyone, regardless of field of expertise or programming level (a minimal usage sketch follows below).
0
0
1
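The library is not named in the post; assuming it refers to the open-source neuraloperator package associated with the multi-grid tensorized neural operator work, the snippet below shows quickstart-style usage. The constructor arguments are assumptions based on that package's documented interface and should be checked against the installed version.

```python
# Hedged usage sketch, assuming the library in question is `neuraloperator`
# (pip install neuraloperator); argument names follow its documented quickstart
# but should be verified against the installed version.
import torch
from neuralop.models import FNO

# A small 2D Fourier Neural Operator keeping 16 Fourier modes per dimension.
model = FNO(n_modes=(16, 16), hidden_channels=32, in_channels=1, out_channels=1)

x = torch.randn(4, 1, 64, 64)   # (batch, channels, height, width)
y = model(x)
print(y.shape)                   # expected: torch.Size([4, 1, 64, 64])
```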
@jeremyphoward I was looking forward to that as well - did you find the text to be as sharp? Just from seeing it in the shop, I was worried the text would be a little blurrier/less sharp, especially for coding.
2
0
0
RT @tydsh: Our Galore technique, i.e., using low-rank gradient/optimizer stats to update weights during training, can also be used to train…
0
12
0
As we scale deep neural operators to larger-scale problems for scientific applications, it becomes increasingly important to capture complex, high-dimensional relationships efficiently. Tensor methods are key to achieving this in practice. This work uses a Tucker factorization of the (tensor) optimizer state associated with the (tensor) weights of Neural Operators, achieving superior performance while reducing optimizer memory usage by up to 75%. It builds on the #TensorLy library to efficiently perform the tensor decomposition on the fly during training, with warm restarts (an illustrative sketch of this kind of projection follows the full announcement below).
Excited to present our work on Tensor-GaLore at the Optimization for Machine Learning Workshop #NeurIPS2024!

We present Tensor-GaLore, a novel method for efficient training of neural networks with higher-order tensor weights. Many models, particularly those used in scientific computing and computer vision, employ tensor-parameterized layers to capture complex, high-dimensional relationships. However, these tensor structures lead to significant memory requirements during training. Our method addresses this memory challenge through low-rank subspace optimization using Tucker decomposition, overcoming limitations of previous approaches restricted to matrix-parameterized weights, including those operating on complex-valued data.

We showcase its effectiveness on Fourier Neural Operators (FNOs), a class of models crucial for solving partial differential equations. Across various PDE tasks, we achieved performance gains ranging from 11% to 50% better generalization while reducing optimizer memory usage by up to 76%. These consistent improvements, coupled with substantial memory savings across AI for science, demonstrate Tensor-GaLore's potential.

Come see our poster today/tomorrow at West Ballroom A from 3:00 to 4:00 pm.

Paper (workshop version):

Authors: @Robertljg, David Pitt, @jiawzhao, @JeanKossaifi, @ChengLuo_lc, @tydsh
1
2
25
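The two posts above describe the mechanism only at a high level: project the gradient of each tensor-parameterized weight onto a low-rank Tucker subspace, keep optimizer state at the size of the small core, and periodically refresh the factors (warm restarts). The sketch below is a rough, real-valued illustration of that idea using TensorLy, not the authors' implementation; the ranks, refresh period, and names are assumptions, and the published method additionally handles complex-valued weights.

```python
# Minimal sketch of the idea described above (not the authors' implementation):
# project a 4th-order weight gradient onto a low-rank Tucker subspace with
# TensorLy, keep optimizer state only for the small core, then project the
# update back to the full space. Ranks, refresh period, and names are
# illustrative assumptions; real weights are used here for simplicity.
import torch
import tensorly as tl
from tensorly.decomposition import tucker
from tensorly.tenalg import multi_mode_dot

tl.set_backend("pytorch")

class TuckerGradProjector:
    def __init__(self, rank, refresh_every=200):
        self.rank = rank                 # Tucker ranks, one per tensor mode
        self.refresh_every = refresh_every
        self.factors = None
        self.step = 0

    def project(self, grad):
        """Return the small core representing `grad` in the current subspace."""
        if self.factors is None or self.step % self.refresh_every == 0:
            # "warm restart": recompute the Tucker factors from the current gradient
            _, self.factors = tucker(grad, rank=self.rank, init="svd")
        self.step += 1
        # core = grad contracted with the transposed factor matrices on every mode
        return multi_mode_dot(grad, self.factors, transpose=True)

    def project_back(self, core):
        """Map an update computed on the core back to the full-size tensor."""
        return multi_mode_dot(core, self.factors)

# Toy usage: an SGD-like update on the compressed gradient of a 4-way weight tensor.
weight = torch.randn(16, 16, 12, 12)           # e.g. a tensor-parameterized layer weight
proj = TuckerGradProjector(rank=(4, 4, 4, 4))
grad = torch.randn_like(weight)
core = proj.project(grad)                      # optimizer state would live at this size
weight -= 1e-2 * proj.project_back(core)
print(core.shape)                              # torch.Size([4, 4, 4, 4])
```

In practice a per-parameter projector like this would wrap an optimizer such as Adam so that its moment estimates live at the core's size rather than the full tensor's, which is where the reported optimizer-memory savings come from.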
RT @Ashiq_Rahman_s: Excited to present our work, CoDA-NO, on multi-physics systems! Join us at the first poster session of #NeurIPS2024 on…
0
5
0
RT @Robertljg: Very excited to present this work at #NeurIPS2024! Come view our poster on Wednesday the 11th from 11 am to 2 pm PST. Also,…
0
4
0
RT @AnimaAnandkumar: Excited to present our work on CoDA-NO at #NeurIPS2024 We develop a novel neural operator architecture designed to sol…
0
28
0