Python, NumPy, Pandas, Matplotlib, Scikit-learn, and TensorFlow are FUNDAMENTAL skills for anyone learning data science and machine learning.
Master them with these FREE resources.
Python, NumPy, Pandas, Matplotlib, and Scikit-learn are FUNDAMENTAL skills for anyone learning data science and machine learning 🤖
Master them with these FREE resources 🔥
A Thread 🧵👇
I started my career in data science and machine learning back in 2017 ⏳
I wrote a book on how to build a career as a technical writer in DS/ML✨
Today is my birthday, and I am giving the book for FREE 🎂
To receive:
• Like, comment "Send"
• Must be following so I can DM
4 amazing books for data science and machine learning 🤖
1️⃣ Building Machine Learning Powered Applications
2️⃣ Practical Machine Learning for Computer Vision
3️⃣ Building Machine Learning Pipelines
4️⃣ Designing Machine Learning Systems
Check them out 👇🏻
Fine-tuning a Llama 65B parameter model requires 780 GB of GPU memory.
This kind of compute is beyond the reach of most individuals.
Thanks to parameter-efficient fine-tuning strategies, it is now possible to fine-tune a 7B parameter model on a single GPU, like the one
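The tweet doesn't name a specific method, but LoRA is the best-known PEFT technique: freeze the pre-trained weights and train only a small low-rank update. A NumPy sketch of the idea (shapes and names are illustrative, not a real fine-tuning loop):

```python
import numpy as np

# Illustrative LoRA-style low-rank update (hypothetical shapes).
# The frozen weight W stays untouched; only A and B are trained.
d, r = 8, 2                      # hidden size, low rank (r << d)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))      # frozen pre-trained weight: d*d params
A = rng.normal(size=(d, r))      # trainable down-projection
B = np.zeros((r, d))             # trainable up-projection, init to zero

x = rng.normal(size=(d,))
# Effective forward pass: the adapted weight is W + A @ B.
y = x @ (W + A @ B)

trainable = A.size + B.size      # 2*d*r params instead of d*d
```

Because B starts at zero, the model's output is unchanged at step 0, and only 2·d·r parameters (32 here, vs. 64 in W) need gradients and optimizer state.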
Detecting small objects with computer vision models is challenging.
This is because they occupy only a few pixels of the entire image.
However, the SAHI technique makes it possible to detect small objects using various models such as YOLOv5.
SAHI is a generic slicing-aided
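In outline, SAHI cuts the image into overlapping tiles, runs the detector on each tile, and merges the per-tile boxes back into full-image coordinates. A NumPy sketch of the slicing step only (tile size and overlap are illustrative; the real library also handles the coordinate shifting and NMS merging):

```python
import numpy as np

def tile_starts(size, tile, step):
    """Start offsets so tiles of `tile` pixels cover `size` pixels."""
    starts = list(range(0, size - tile + 1, step))
    if starts[-1] + tile < size:        # add a final tile flush with the edge
        starts.append(size - tile)
    return starts

def slice_image(image, tile=256, overlap=0.2):
    """Cut an H x W image into overlapping tiles (SAHI-style slicing step).

    A detector would run on each tile, and each tile's boxes would then be
    shifted by (x0, y0) and merged with NMS -- that part is omitted here.
    """
    step = int(tile * (1 - overlap))
    h, w = image.shape[:2]
    return [(x0, y0, image[y0:y0 + tile, x0:x0 + tile])
            for y0 in tile_starts(h, tile, step)
            for x0 in tile_starts(w, tile, step)]

img = np.zeros((512, 512, 3), dtype=np.uint8)
tiles = slice_image(img)                # 256-px tiles with ~20% overlap
```

A small object that is a handful of pixels in the full frame becomes a much larger fraction of a 256-px tile, which is what makes it detectable.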
Kaggle is an amazing resource, particularly for someone learning data science and machine learning 🛠
Here's how I have used Kaggle to advance my career 🔥
Google recently released an alternative to NumPy that's way faster for numerical computation: JAX 👾
They also shipped a deep learning library for JAX called Flax 🤖
Master JAX and Flax with these 12 FREE resources 🔥
--A Thread--
🧵
Kaggle is an amazing resource, particularly for someone learning data science and machine learning.
Here's how you can use Kaggle to advance your career:
Training machine learning models locally is limited by the computational power of your computer.
Here are 6 alternatives for training ML models that will give you more computational resources, including GPUs for FREE:
Training NLP and CV models from scratch is a waste of resources.
Instead, apply transfer learning using pre-trained models.
Here's how transfer learning works in 6 steps.
--A Thread--
🧵
I have been writing data science and ML content for ~8 years now.
Technical writing can accelerate your career as an ML practitioner.
Here are 15 principles to follow when writing technical content.
I started my career in machine learning back in 2017 ⌛️
Started by learning TensorFlow 🤖
Master TensorFlow with these 11 FREE resources 🔥
--A Thread--
🧵
Training an ML model inside a Jupyter notebook is something every data scientist knows.
But do you know how to debug your code using IDEs?
If the answer is NO, this post is for you:
Tensors are a FUNDAMENTAL concept in machine learning.
Master them with this FREE resource, covering:
• What is a Tensor?
• How to create tensors
• Functions to create various Tensor objects
• How to create tensors with custom values
• How to initialize tensors with
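The bullet points above boil down to a few lines of code. A sketch using NumPy (the resource itself may use TensorFlow or PyTorch, but the ideas carry over directly):

```python
import numpy as np

# A tensor is an n-dimensional array: scalar (rank 0), vector (rank 1),
# matrix (rank 2), and so on.
scalar = np.array(3.0)                  # rank 0
vector = np.array([1.0, 2.0, 3.0])      # rank 1
matrix = np.ones((2, 3))                # rank 2, filled with ones

# Tensors with custom values:
custom = np.full((2, 2), 7)             # every element is 7

# Common initializers:
zeros = np.zeros((3, 3))
identity = np.eye(3)
random = np.random.default_rng(0).normal(size=(2, 2))
```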
Run a medical chatbot on CPUs, no GPUs required.
Here's how:
This video walks you through running a medical chatbot using open-source LLMs such as Llama on your computer without GPUs.
The application uses DeepSparse by
@neuralmagic
for accelerated inference on CPUs, LangChain,
How to deploy NLP and computer vision models on Google Cloud.
Pre-trained CV and NLP models offer good accuracy because they are trained on massive datasets.
Unfortunately, these models are very large, making them difficult and expensive to deploy. Due to their size, they
The most common large language model evaluation metrics:
• Perplexity
• BLEU
• ROUGE
• BERTScore
• COMET
• METEOR
• BLEURT
• GPTScore
• PRISM
• BARTScore
• G-Eval
• Human Evaluation
Details below:
Perplexity measures how good a model is at predicting the
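Concretely, perplexity is the exponential of the average negative log-likelihood the model assigns to the true tokens. A minimal sketch with made-up per-token probabilities:

```python
import math

def perplexity(token_probs):
    """exp of the average negative log-likelihood over the tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Hypothetical probabilities a model assigned to the true next tokens.
confident = perplexity([0.9, 0.8, 0.95])  # low perplexity = good
uniform = perplexity([0.25] * 4)          # exactly 4.0: as lost as a 4-way guess
```

Intuitively, a perplexity of k means the model is, on average, as uncertain as if it were choosing uniformly among k tokens.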
Google JAX is faster than NumPy for numerical computation.
Flax is the deep learning library built on top of JAX.
Master JAX and Flax with these 12 FREE resources.
You are still deploying ML models on GPUs?
Well, you got some money to throw away.
Use CPUs for a 66% reduction in inference cost without loss of performance, in fact, with 2X better performance than a T4 GPU on a 4-core CPU laptop 🔥.
Here's how in 3 steps:
--A Thread--
🧵
I removed 50% of the weights from a top leaderboard LLM without negatively impacting the evals.
Using SparseML from
@neuralmagic
I was able to zero out 50% of the SOLAR-10.7B-Instruct-v1.0 weights.
I then quantized the remaining weights to INT8.
The results are amazing!
I started my career in data science and machine learning back in 2017 🕰
Learning multiple concepts and technologies is critical when starting your journey 🤖
To help you get started, I have curated various resources as PDFs 📑
Download them for free from the next tweet 🔥
Knowing how to DEPLOY models is almost mandatory to land a job in machine learning 🏗
Master how to deploy models on AWS with these 3 FREE resources 🔥
--A Thread--
🧵
Context length is one of the biggest problems with LLMs such as ChatGPT.
There is a limitation on the number of words in your prompt because the models can only accept a certain number of tokens.
The solution? Embeddings.
Word embedding is a technique used to represent
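Once text is embedded as vectors, similarity between pieces of text is usually measured with cosine similarity. A sketch with hand-made toy vectors (real embeddings come from a model and have hundreds of dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-D embeddings, hand-made for illustration only.
emb = {
    "king":  np.array([0.90, 0.80, 0.10, 0.00]),
    "queen": np.array([0.85, 0.75, 0.20, 0.10]),
    "apple": np.array([0.00, 0.10, 0.90, 0.80]),
}

royal = cosine_similarity(emb["king"], emb["queen"])  # high: related words
fruit = cosine_similarity(emb["king"], emb["apple"])  # low: unrelated words
```

This is also how embeddings sidestep the context-length limit: you embed your documents once, then retrieve only the most similar chunks to put in the prompt.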
I wrote my first data science article in 2018.
I have now written over 300 data science and ML articles.
I think you, too, should document your learnings.
If that sounds like something you’d like to pursue, I’d like to offer an ULTIMATE guide for doing so.
👇🏻
You use CPUs and GPUs to deploy ML models every day.
But have you ever considered how they work in machine learning?
Here's how they differ and why you should choose a CPU over a GPU for your next deployment.
--A Thread--
🧵
I started my career in data science and machine learning back in 2017 ⏳
Still amazes me that a $9 data science course can change your life ✨
Here are the 3 courses I took when getting started🔥
--A Thread--
🧵
Porting YOLO to PyTorch from DarkNet was a game changer in the computer vision world.
YOLOv5 is arguably the most popular object detection model today, as a result.
However, deploying it for real-time inference will still require reducing its size.
There are 2 strategies for
Are you still deploying uncompressed ML models in 2023?
STOP.
Apply Gradual Magnitude Pruning (GMP), the current KING in ML model pruning, to reduce the size of large models by 20X without loss of accuracy.
Here's how to apply GMP in 5 steps.
--A Thread--
🧵
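The core of GMP is simple: repeatedly zero out the smallest-magnitude weights, raising the sparsity target a little at each step and fine-tuning in between. A NumPy sketch of the masking step (the schedule values are illustrative, and the fine-tuning between steps is omitted):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of smallest-magnitude weights."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    return weights * (np.abs(weights) > threshold)

rng = np.random.default_rng(0)
w = rng.normal(size=(10, 10))

# Gradual schedule: raise sparsity a bit each pruning step.
# In real GMP you fine-tune between steps so accuracy can recover.
for sparsity in (0.2, 0.4, 0.6, 0.8):
    w = magnitude_prune(w, sparsity)

final_sparsity = float((w == 0).mean())   # ~80% of weights are now zero
```

The gradual ramp-up is the whole trick: pruning 80% in one shot destroys accuracy, but small increments with recovery training in between preserve it.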
Smaller machine learning models are vital for deployment, particularly for real-time inference and on-edge devices.
Reducing the size of the model also leads to lower deployment costs.
There are 3 main techniques that you can use to reduce the size of an ML model. They are:
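The tweet is cut off before naming them, but pruning, quantization, and knowledge distillation are the standard trio. Here's a sketch of one of them, symmetric int8 quantization, which stores each weight in 1 byte instead of 4 plus a single scale factor:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric int8 quantization: float32 -> int8 plus one scale factor."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# 4x smaller (1 byte vs 4 per weight) with small reconstruction error.
error = float(np.abs(w - w_hat).max())
```

The rounding error per weight is at most half the scale, which is why int8 usually costs little to no accuracy while cutting model size 4x.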
Training NLP and CV models from scratch is a waste of resources ❌
Instead, apply transfer learning using pre-trained models 🤖
Here's how transfer learning works in 6 steps 🪜
--A Thread--
🧵
The model you have on your laptop doesn't help anyone.
You have to deploy it for people to get value out of it.
Enter MLOps.
Here's everything you need to know about MLOps.
GPUs are becoming scarce.
Everyone is now training and deploying ML models 🦾
Deploy your ML models on CPUs with the same performance as a T4 GPU.
Here's how 🔥
--A Thread--
🧵
1/6
Machine learning projects can fail due to a myriad of reasons.
Here are the top 5 mistakes that lead to ML project failure and how to avoid them, according to Dr. Michael Lones.
There are various steps taken when building machine learning models.
For instance, data pre-processing, model development, and fine-tuning the model, to mention a few.
These steps generate some information such as:
• Model parameters
• Model versions
• Data set versions
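A tool like MLflow captures these automatically, but the idea fits in a dict serialized to JSON. A minimal sketch (field names are illustrative):

```python
import json
import time

# Minimal run record capturing the items listed above.
# Experiment-tracking tools record this (and more) for you.
run = {
    "run_id": "exp-001",
    "date": time.strftime("%Y-%m-%d"),
    "model_version": "v3",
    "dataset_version": "2023-06-snapshot",
    "params": {"learning_rate": 0.001, "batch_size": 32},
    "metrics": {"val_accuracy": 0.91},
}

record = json.dumps(run, indent=2)   # write this to a file per run
```

Saving one such record per run is enough to answer "which parameters produced the best model?" weeks later.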
1️⃣ Python for Data Science and Machine Learning Bootcamp
Learn how to use NumPy, Pandas, Seaborn, Matplotlib, Plotly, Scikit-Learn, Machine Learning, TensorFlow, and more!
Can one data science article change your life?
On the 18th of Feb 2018, I published an article that changed the trajectory of my career.
Here is the story:
What are logits in deep learning?
Logits are the outputs of a neural network before the activation function is applied: the raw, unnormalized scores for each class.
Logits are often used in classification tasks, where the goal is to predict
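To turn logits into actual probabilities, you pass them through a softmax. A minimal sketch:

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])          # largest logit -> largest probability
```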
Technical Writing 101 for machine learning professionals:
Over the past 5 years, I have written over 200 ML blog posts for machine learning companies.
Want to know the secret?
Here's the ultimate guide:
Accelerate your NLP pipelines with sparse transformers 🤖
You can get a 3X CPU performance increase by optimizing your models with only a few lines of code 🔥
--A Thread--
🧵
1/3
Object detection on a CPU using Supervision from
@roboflow
and DeepSparse from
@neuralmagic
.
Supervision provides the tools to reduce repetitive work in building computer vision applications, such as creating zones, annotating, and tracking objects.
DeepSparse provides
Building ML models is cool🤖.
But you are set to fail if you rush to algorithms too quickly.
Here are 5 things you should do before you start to build models🔥:
--A Thread--
🧵
O’Reilly has the best books in AI/machine learning.
Here are 4 books to include on your shelf if you are interested in building machine learning powered applications.
Tom is one of 263 Kaggle Competition Grandmasters.
He has participated in over 40 competitions on Kaggle.
He reveals how he got started and the process he uses when competing on Kaggle.
Tom, as he is popularly known on Kaggle, also reveals what you need to do to earn the
Deploying large language models on CPU is now a viable option.
This is a result of the work of researchers from
@neuralmagic
and
@ISTAustria
.
For example, the demo in the video below is running on CPU on
@huggingface
Space.
In their latest paper, they successfully applied
We sparse fine-tuned Llama 2 7B to run on CPUs only, no GPU.
Here are the technical details (and demo):
In sparse fine-tuning Llama 2, we focused on the GSM8k dataset like in the MPT setup.
Llama 2 achieves 0% zero-shot accuracy on this task without any fine-tuning.
Machine learning is one of the hottest topics of the past decade.
Needless to say, it is being applied across all industries.
Follow me
@themwiti
for daily ML content:
• Technical deep dives
• Resources
• MLOps
• Tools
• NLP
• CV
Here is some of my best work:
Building machine learning models is an experimental process that requires several iterations.
You change different model parameters or data preprocessing steps at each iteration to obtain an optimal model.
It is vital to keep track of the processing steps and the parameters at
Everything you use today is powered by machine learning🚀.
Even this tweet was recommended to you by an ML algorithm🔥.
But do you know where ML came from?
Here's the story of how ML came into the world 🌎.
--A Thread--
🧵
Optimizing ML models for deployment is not an option ❌
Particularly for enterprises that want to lower their computing costs while improving production performance 💸
Consider these three model optimization techniques before deploying your next model 🦾
--A Thread--
🧵
1/3
Technical writing is a massive asset for any machine learning professional.
I have written over 200 articles in the last 6 years.
Here's how to leverage writing to land jobs, even if you have never written a single blog post before: