Today marks 5 years since the public release of PyTorch! We didn't expect to come this far, but here we are 🙂 - 2K Contributors, 90K Projects, 3.9M lines of "import torch" on GitHub. More importantly, we're still receiving lots of love and having a great ride. Here's to the future!
We’re excited to announce support for GPU-accelerated PyTorch training on Mac! Now you can take advantage of Apple silicon GPUs to perform ML workflows like prototyping and fine-tuning. Learn more:
To help developers get started with PyTorch, we’re making the 'Deep Learning with PyTorch' book, written by Luca Antiga and Eli Stevens, available for free to the community:
The full version of the Deep Learning with PyTorch book from Luca Antiga, Eli Stevens, and Thomas Viehmann is now available! New chapters include in-depth real-world examples and production deployment. Grab a free digital copy on:
We’re excited to announce the release of PyTorch 2.0!
This version includes:
⚙️ 100% backward compatible
📦 Out of the box performance
📶 Significant speed improvements
Learn more 👇
Inside the Matrix: Use 3D to visualize matrix multiplication expressions, attention heads with real weights, and more. ⚡
Read our latest post on visualizing matrix multiplication, attention and beyond:
We just introduced PyTorch 2.0 at the
#PyTorchConference
, introducing torch.compile!
Available in the nightlies today, stable release Early March 2023.
Read the full post:
🧵below!
1/5
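The one-line workflow torch.compile promises can be sketched like this; the "eager" backend used here is just a quick correctness check that skips kernel codegen (the default "inductor" backend is what delivers the speedups):

```python
import torch

def f(x):
    return torch.sin(x) + torch.cos(x)

# backend="eager" runs the captured graph without generating fused kernels,
# so it works on any machine; drop the argument to use the default inductor backend.
compiled_f = torch.compile(f, backend="eager")

x = torch.randn(8)
same = torch.allclose(compiled_f(x), f(x))  # compiled output matches eager
```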
Want to make your inference code in PyTorch run faster? Here’s a quick thread on doing exactly that.
1. Replace torch.no_grad() with the ✨torch.inference_mode()✨ context manager.
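A minimal sketch of the tip above (the Linear layer is just a stand-in for a real model):

```python
import torch

model = torch.nn.Linear(128, 10)  # stand-in for a real model
x = torch.randn(4, 128)

# inference_mode() disables autograd tracking AND version-counter bookkeeping,
# making it lighter-weight than torch.no_grad() for inference-only code.
with torch.inference_mode():
    out = model(x)
```

Tensors created inside the context are flagged as inference tensors, so they can't accidentally leak into a later autograd graph.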
10x performance on LLaMa 7B, all in native PyTorch. No C++ needed.
Check out our second blog post in a series on Accelerating Generative AI using Native PyTorch. 🔥
We're excited to release the torch.fft module in PyTorch 1.8. This module implements the same functions as NumPy’s np.fft module, but with support for accelerators, like GPUs, and autograd.
Learn more 👉
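Because the module mirrors NumPy's API, porting code is mostly a matter of swapping the namespace; a quick sanity check:

```python
import numpy as np
import torch

x = torch.randn(64)

# Same function name and semantics as np.fft.fft, plus autograd and GPU support
freq = torch.fft.fft(x)
reference = np.fft.fft(x.numpy())
```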
Microsoft VSCode integrates deeply with PyTorch out of the box.
As
@aerinykim
highlights:
1. It shows values inside tensors. (Left panel)
2. By simply mousing over, you can see the variable's shape, weight, bias, dtype, device, etc.
Announcing the alpha release of torchtune!
torchtune is a PyTorch-native library for fine-tuning LLMs. It combines hackable memory-efficient fine-tuning recipes with integrations into your favorite tools.
Get started fine-tuning today!
Details:
Introducing torch.profiler! The new PyTorch Profiler collects both GPU and framework-related info, correlates them, automatically detects bottlenecks in the model, generates recommendations on how to resolve them, and visualizes the results.
Read 👉
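Basic usage looks like this (a minimal CPU-only sketch; add ProfilerActivity.CUDA to capture GPU kernels as well):

```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(64, 64)
x = torch.randn(32, 64)

# Record operator-level CPU timings for one forward pass
with profile(activities=[ProfilerActivity.CPU]) as prof:
    model(x)

# Summarize the most expensive operators
table = prof.key_averages().table(sort_by="cpu_time_total", row_limit=5)
```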
PyTorch 1.8 is here!
Highlights include updates for compiler, code optimization, frontend APIs for scientific computing, large scale training for pipeline and model parallelism, and Mobile tutorials.
Blog👇
Learn PyTorch on GPUs for free via Google Colaboratory
each PyTorch tutorial now has a link to open it on Colab, where you can interactively execute and play with code.
#thanksgoogle
Introducing VISSL () - a library for reproducible, SOTA self-supervised learning for computer vision! Over 10 methods implemented, 60 pre-trained models, 15 benchmarks, and counting.
Understanding GPU Memory 1: Visualizing All Allocations over Time 👀
In part 1 of this series, we show how to use Memory Snapshot, the Memory Profiler, and the Reference Cycle Detector to debug out of memory errors and improve memory usage.
Read more:
PyTorch 2.2 is here 🎉
Featuring:
- SDPA support of FlashAttention-2
- New ahead-of-time extension of TorchInductor
- device_mesh, a new abstraction for initializing and representing ProcessGroups
- A standardized, configurable logging mechanism called TORCH_LOGS
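SDPA is a single call that dispatches to the fastest available kernel (e.g. FlashAttention-2 on supported GPUs); shapes below are illustrative:

```python
import torch
import torch.nn.functional as F

# (batch, heads, sequence length, head dim)
q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# One call replaces the manual softmax(QK^T / sqrt(d)) V computation,
# and picks a fused backend (FlashAttention-2, memory-efficient, or math).
out = F.scaled_dot_product_attention(q, k, v)
```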
PyTorch 1.9 is here! Highlights include improvements for:
- torch.linalg, torch.special, and Complex Autograd
- Mobile Interpreter
- TorchElastic
- The PyTorch RPC framework
- APIs for model inference deployment
- PyTorch Profiler
See full details👇
[v1.1.0] Official TensorBoard Support, Attributes, Dicts, Lists and User-defined types in JIT / TorchScript, Improved Distributed
Read more about the changes at
As always, get the install commands on
Happy 3rd birthday TensorFlow! We've come a long way since the first release in 2015 & TensorFlow wouldn't be the framework it is today without you. As we work on
#TensorFlow20
, look at all the features we've added over the years to make TensorFlow easier to use.
#HappyBirthdayTF
torchvision 0.3.0: segmentation, detection models, new datasets, C++/CUDA operators
Blog with link to tutorial, release notes:
Install commands have changed, use the selector on
Stochastic Weight Averaging (SWA) is a simple procedure that improves generalization in deep learning over Stochastic Gradient Descent (SGD). PyTorch 1.6 now includes SWA natively. Learn more from
@Pavel_Izmailov
,
@andrewgwils
and Vincent:
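The native SWA utilities live in torch.optim.swa_utils; a minimal training-loop sketch (the toy model, loss, and step counts are made up for illustration):

```python
import torch
from torch.optim.swa_utils import AveragedModel, SWALR

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
swa_model = AveragedModel(model)              # keeps a running average of weights
swa_scheduler = SWALR(optimizer, swa_lr=0.05)

for step in range(100):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 10)).pow(2).mean()
    loss.backward()
    optimizer.step()
    if step >= 75:                            # average only over the tail of training
        swa_model.update_parameters(model)
        swa_scheduler.step()
```

In a real run you would finish by updating batch-norm statistics (torch.optim.swa_utils.update_bn) before evaluating swa_model.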
The full hands-on tutorials, "Building Recommender Systems with PyTorch", are now available. We show how to build a deep learning recommendation system and resolve the associated interpretability, integrity, and privacy challenges. See:
PyTorch Lightning 1.0.0 is now available. This is the final stable API to train and deploy models at scale, without the boilerplate. Read more about this release below:
Disney uses PyTorch for animated character recognition and to speed up its video processing pipeline.
@DTCITechnology
engineers also contributed new features to the Torchvision domain library.
If you installed PyTorch-nightly on Linux between Dec. 25 and Dec. 30, uninstall it and torchtriton immediately and use the latest nightly binaries.
Read the security advisory here:
v1.6: native mixed-precision support from NVIDIA (~2x perf improvement), distributed perf improvements, new profiling tool for memory consumption, Microsoft commits to developing and maintaining Windows PyTorch.
Release Notes:
Blog:
Today, we made usability and content improvements to PyTorch Tutorials including additional categories, a new recipe format for quickly referencing common topics, sorting using tags, and an updated homepage. Learn more:
The team
@WadhwaniAI
has built a multi-task network that detects pest infestations in cotton crops. This technology is being put directly in the hands of more than 18,000 farmers across India using
#PyTorch
Mobile, TorchServe, and Weights & Biases.
We're standardizing OpenAI's deep learning framework on PyTorch to increase our research productivity at scale on GPUs (and have just released a PyTorch version of Spinning Up in Deep RL):
DGL (Deep Graph Library) - Clean and efficient library to build graph neural networks including GCN, TreeLSTM and graph generative models. Includes auto-batching and other tricks for speed.
v1.7: CUDA 11 supported with binaries on , updated profiling/performance for RPC, TorchScript and Stack traces in the autograd profiler, support for NumPy compatible FFT via torch.fft.
Release Notes:
Blog: 👇
Do you want to adapt an LLM on your own data and domain? 🤔
Learn how to finetune a 7B parameter model on a typical consumer GPU (NVIDIA T4 16GB) with LoRA and tools from the PyTorch and Hugging Face ecosystem in our latest post.
Details 👉
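The core LoRA idea the post builds on can be sketched in plain PyTorch; LoRALinear, rank, and alpha here are hypothetical names for illustration, not the API from the post:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch: a frozen pretrained linear layer plus a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                 # freeze pretrained weights
        self.lora_a = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(rank, base.out_features))
        self.scale = alpha / rank

    def forward(self, x):
        # Base output plus the scaled low-rank correction
        return self.base(x) + (x @ self.lora_a @ self.lora_b) * self.scale

layer = LoRALinear(nn.Linear(512, 512), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
out = layer(torch.randn(2, 512))
```

Only the two small low-rank matrices are trained, which is why the approach fits on a 16GB consumer GPU.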
PyTorch 2.3 is here 😎🔥
PyTorch 2.3 offers support for user-defined Triton kernels in torch.compile, allowing users to migrate their own Triton kernels from eager mode without experiencing performance regressions or graph breaks.
Details:
PyTorch 1.10 is here!
Highlights include updates for:
- CUDA Graphs APIs updates
- Several frontend APIs moved to Stable
- Automatic fusion in JIT Compiler support for CPU/GPUs
- Android NNAPI now in beta
Blog:
Release:
v1.5: autograd API for Hessians/Jacobians, C++ frontend stable and 100% parity with Python, Better performance on GPU and CPU with Tensor Format ‘channels last’, distributed.rpc stable, Custom C++ class binding
Release notes:
Blog:
Part 2️⃣: Understanding GPU Memory 🤔
In our latest post, we will use the Memory Snapshot to visualize a GPU memory leak caused by reference cycles, and then locate and remove them in our code using the Reference Cycle Detector.
Read more:
Announcing Flash-Decoding 🚀
Flash-Decoding makes LLM decoding much faster, and in particular allows scaling to very long sequence lengths (64k+) without slowdown!
Read more on our blog:
Today we’d like to highlight features from functorch, a beta PyTorch library that provides JAX-inspired function transformations like vmap. ()
If you’re not sure what sort of cool new things vmap allows you to do, read on to learn more!
(1/n)
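As a taste of what vmap does — it "vectorizes" a per-example function over a batch dimension with no Python loop (sketched here with torch.vmap, the version that later landed in core PyTorch):

```python
import torch

def dot(a, b):
    return (a * b).sum()   # scalar dot product of two vectors

# vmap maps `dot` over the leading batch dimension of both inputs
batched_dot = torch.vmap(dot)

a, b = torch.randn(10, 3), torch.randn(10, 3)
out = batched_dot(a, b)    # one dot product per row
```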
Bolts is a new Deep Learning research and production toolbox from PyTorch Lightning. Iterate faster with pre-trained models, components, callbacks, and data sets, all modular, tested, and optimized for GPUs/TPUs.
Simply subclass, override, and train.
3x faster text-to-image diffusion models, all in pure PyTorch. No C++ needed.
Check out our third blog post in the series on Accelerating Generative AI using Native PyTorch. 🔥
MaskRCNN-Benchmark:
- A fast, modular reference of {Mask,Faster}RCNN
- by
@fvsmassa
(PyTorch), optimized by Nvidia
- reusable components, pre-trained models
- optimized inference, live demo
Hope to see mmdetection and other great projects reuse the code!
✨ Low Numerical Precision in PyTorch ✨
Most DL models are single-precision floats by default.
Lower numerical precision - while reasonably maintaining accuracy - reduces:
a) model size
b) memory required
c) power consumed
Thread about lower precision DL in PyTorch ->
1/11
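The model-size point (a) is easy to see directly: casting parameters from float32 to float16 halves the bytes they occupy.

```python
import torch

model = torch.nn.Linear(1024, 1024)
fp32_bytes = sum(p.numel() * p.element_size() for p in model.parameters())

model = model.half()   # cast all parameters to float16 in place
fp16_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
```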
Two creators of passion projects that transformed the landscape of how we code today — Linus Torvalds and
@soumithchintala
meet for the first time, sharing a smile and a love for the open source community.
#PyTorchFoundation
Together with
@msdev
, we’ve created a PyTorch “Learn the Basics” tutorial. Familiarize yourself with PyTorch concepts and modules. Learn how to load data, build deep neural networks, train and save your models in this quick-start guide.
Get started now:
@rstudio
introduces Torch for R, an R package that allows researchers to use PyTorch functionality natively from R. No Python installation is required since torch is built directly on top of libtorch. Learn more:
The Global PyTorch Summer Hackathon is back! This year, teams can compete in 3 categories:
1. Developer Tools
2. Web/Mobile applications
3. Responsible AI Development Tools
Read more at:
#PyTorchSummerHack
RoMa: an easy-to-use, stable and efficient library to deal with rotations and spatial transformations in PyTorch.
Read all about this PyTorch Ecosystem Tool in our latest Medium post ⚡
PyTorch BigGraph: a distributed system for learning large graph embeddings
- up to billions of entities and trillions of edges
- Sharding and Negative Sampling
- WikiData embeddings (78 mil entities, 4131 relations)
- Blog:
- Code:
Stochastic Weight Averaging: a simple procedure that improves generalization over SGD at no additional cost.
Can be used as a drop-in replacement for any other optimizer in PyTorch.
Read more:
guest blogpost by
@Pavel_Izmailov
and
@andrewgwils
Announcing PyTorch 2.1 Stable Release ⚡✨
PyTorch 2.1 offers automatic dynamic shape and NumPy compilation support in torch.compile, as well as torch.distributed.checkpoint for saving/loading distributed training jobs on multiple ranks.
Learn more 👇
NeMo,
@NVIDIA
’s open-source toolkit based on
#PyTorch
, allows you to quickly build, train, and fine-tune conversational AI models. See how speech recognition, natural language processing and speech synthesis can be improved in this tutorial:
Announcing ExecuTorch 🚀
ExecuTorch offers a compact runtime with a lightweight operator registry to cover the PyTorch ecosystem of models, and a streamlined path to execute PyTorch programs on edge devices.
Details:
Hydra provides an ability to compose and override configuration from the command line and config files, helping PyTorch developers to more easily manage complex ML projects.
[v0.4.1] Spectral Norm, Adaptive Softmax, faster CPU ops, anomaly detection (NaNs, etc.), Lots of bug fixes, Python 3.7 and CUDA 9.2 support and more. Full release notes at
As always, update via commands at or via PyPI
[v1.3.0] Named Tensors, iOS / Android support, Quantization, Type Promotion and more.
Also available: TPU device, Detectron2, Captum for model interpretability, CrypTen for PPML research etc.
Read more here: [Link to the PyTorch 1.3 blog]
TorchServe v0.1.1 is available with new features and support for HuggingFace BERT, Waveglow, Model Zoo, SnakeViz Profiler, AWS CloudFormation Template, and Automated Integration regression test suite. Read the release notes for details:
Torchmeta is a collection of extensions and data loaders for few-shot learning and meta-learning. It won first place at the Global PyTorch Summer Hackathon last year. Learn more in the blog post from Tristan Deleu, the project author:
PyTorch Hub: reducing the friction in reproducing and building upon research
- Pull models with 1 line of code, and a few more to use
- Curated models. Open with Google Colab and
@paperswithcode
- Publish your models by sending a PR
Blog:
#ICML2019
Announcing PyTorch 1.11, TorchData, and functorch!
Highlights:
- TorchData, a new library for common modular data loading primitives
- functorch adds composable function transforms
- DDP static graph optimizations in stable
Learn more👇
Catalyst is a PyTorch framework designed for reproducibility, faster experimentation, and code/idea reuse.
This blog walks through an MNIST classifier tutorial and compares PyTorch and Catalyst code side-by-side:
The PyTorch team is excited to share that our paper on PyTorch 2 has been accepted for presentation at ASPLOS 2024!
The paper delves into the implementation of torch.compile and highlights the key technologies driving it, read it here 👇📝
Always amazed by what people do when you open-source your code!
Here is pytorch-bert v0.4.0 in which
- NVIDIA used their winning MLPerf competition techniques to make the model 4 times faster,
-
@rodgzilla
added a multiple-choice model & how to fine-tune it on SWAG
+ many others!
Introducing
@PyTorchLive
, an easy-to-use library of tools for creating on-device ML demos on Android and iOS. With Live, you can build a working mobile app ML demo in minutes. Learn more at and post your demo with
#PyTorchLive
.
PyTorch/XLA, a package that lets PyTorch connect to
@GCPcloud
TPUs and use TPU cores as devices, is now generally available. Highlights:
- Support for Intra-Layer Model Parallelism
- Additional XLA ops
- Integrations with Colab and Kaggle notebooks
Torch-TensorRT is now an official part of the PyTorch ecosystem and is available on PyTorch GitHub and Documentation. Torch-TensorRT is a TensorRT integration for PyTorch that accelerates inference up to 4x on NVIDIA GPUs with just a single line of code.
PyTorch Lightning V0.9 is available now featuring the final API with better data decoupling, shorter logging syntax, synchronized batchnorm, and more. Learn more:
The Computer Vision Recipes repository by researchers at Microsoft provides examples and best practice guidelines for building CV systems. It supports classification, retrieval, tracking, action recognition, and more. Learn:
Captum is a library for model interpretability. Its algorithms include integrated gradients, conductance, SmoothGrad and VarGrad, and DeepLift. Learn more:
PyTorch 1.9 extends PyTorch’s support for linear algebra operations with the torch.linalg module. The module has 26 operators, including every function from NumPy’s linear algebra module extended with accelerator and autograd support, and more. Read more:
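Like the rest of the module, torch.linalg.solve matches its NumPy counterpart while adding autograd and GPU support; a small illustrative system:

```python
import torch

A = torch.randn(3, 3)
A = A @ A.T + 3.0 * torch.eye(3)   # make the system symmetric positive-definite
b = torch.randn(3)

# Same name and semantics as np.linalg.solve
x = torch.linalg.solve(A, b)
```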
Github Engineering experiments with Semantic Code Search: search for code snippets using natural language.
Built on top of fastai + PyTorch, it's fully open-source
Read about their approach here:
Online Demo:
This tutorial introduces compute- and data-efficient transformers and provides a step-by-step guide to creating your own Vision Transformers. Through this guide, you'll be able to train models to state-of-the-art results for classification in both computer vision and NLP.
Introducing “Learn the Basics" - a guide to a complete ML workflow with detailed explanations on concepts like Tensors, DataLoaders, Transforms, and more. Thanks
@sethjuarez
,
@subramen
,
@Cassieview
,
@shwars
,
@pythiccoder
for your contributions! Click👇
FX-based feature extraction is a new TorchVision utility that allows access to intermediate transformations of an input during the forward pass of a PyTorch Module.
Read more below on what makes TorchVision's utility more versatile than existing methods.
Introducing FlexAttention: a new API that lets you implement diverse attention variants in just a few lines of idiomatic PyTorch code. 🔥
Check out the blog post for more details:
Lyft Level 5 has adopted
#PyTorch
to build an internal framework for their ML efforts as they build self-driving technology. They reduced the median job training time for heavy production jobs such as 2D and 3D detectors and segmenters to just 1 hour. Read:
Our community member from Japan, Yutaro Ogawa, shares the PyTorch Tutorials in Japanese.
Community member Ogawa-san (Dentsu Kokusai Joho Service ISID, AI Transformation Center)
@ISID_AI_team
has published the Japanese edition of the PyTorch tutorials.
Tensor Comprehensions: an Einstein-notation-like language that transpiles to CUDA and is autotuned via evolutionary search to maximize performance.
Know nothing about GPU programming? Still write high-performance deep learning code.
@PyTorch
integration coming in <3 weeks.
Try out
@NvidiaAI
's Vid2Vid project for photorealistic video-to-video translation, synthesizing label maps to realistic videos, people talking from edge maps, or generating human motions from poses
Now supports the latest PyTorch v0.4.1:
FairScale, a PyTorch extension for efficient large scale training, is releasing FullyShardedDataParallel, which shards model params across GPUs (+offload to CPU). Details: . Inspired by DeepSpeed/
@MSFTResearch
, and made by
@myleott
@m1nxu
@sam_shleifer