SkalskiP

@skalskip92

Followers
30K
Following
15K
Media
2K
Statuses
8K

Open-source Lead @roboflow. VLMs. GPU poor. Dog person. Coffee addict. Dyslexic. | GH: https://t.co/dEmzMDGq5H | HF: https://t.co/4Lx1Yw34W7

Kraków, Polska
Joined February 2014
@skalskip92
SkalskiP
7 months
football AI code is finally open-source. - player detection and tracking.- team clustering.- camera calibration. I still need to work on README; don't judge me on that. code:
147
984
8K
@skalskip92
SkalskiP
8 months
polish TV is using computer vision to enhance the viewer experience for sports broadcasts: - FIFA-like radar overlays - player recognition - pass distance measurement - ball speed and trajectory tracking during shots
195
1K
14K
@skalskip92
SkalskiP
5 months
supervision, the open-source library I created a year ago, has crossed 20,000 stars on GitHub this weekend!. thank you to everyone who helped me build this project!. it took us 3,500+ commits, 850+ PRs and 80+ contributors to do it. repository:
63
738
7K
@skalskip92
SkalskiP
8 months
supervision - a computer vision library I created - just crossed 15,000 stars on GitHub!. BBBRRRRRR!. link:
48
596
6K
@skalskip92
SkalskiP
4 months
I used to hate working on projects like this. crazy how the world has changed over the past year; labeling is dead!
48
362
5K
@skalskip92
SkalskiP
1 year
RIP image annotation companies. Fully automated image labeling with GroundingDINO + SAM + OpenAI Vision API. code:
64
390
3K
@skalskip92
SkalskiP
1 year
supervision-0.13.0 is out! Now you can effortlessly build advanced video analytics. Trackers, Zones, Annotators, and much more. GitHub repository:
35
532
3K
@skalskip92
SkalskiP
1 year
here is the final version of my vehicle speed estimation demo. read the thread below to learn how I built it. I will cover: .- detection .- tracking .- perspective transformation .- speed calculation .- some bonus ideas. ↓
70
279
3K
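The last two steps listed in the speed estimation thread above (perspective transformation and speed calculation) can be sketched in plain Python. This is a minimal illustration, not the code from the demo: the homography `H`, the helper names `to_road_plane` and `estimate_speed_kmh`, and the 0.05 m/px scale are all assumptions made up for the example, and detection and tracking are assumed to have already produced one pixel position per frame.

```python
def to_road_plane(H, point):
    """Map an image point (pixels) to road-plane coordinates (metres).

    H is a 3x3 homography given as nested lists; it is a hypothetical
    calibration result, not a value taken from the original demo.
    """
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def estimate_speed_kmh(positions_m, fps):
    """Average speed over a track of per-frame positions given in metres."""
    if len(positions_m) < 2:
        return 0.0
    (x0, y0), (x1, y1) = positions_m[0], positions_m[-1]
    distance_m = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    elapsed_s = (len(positions_m) - 1) / fps
    return distance_m / elapsed_s * 3.6  # m/s -> km/h


# Toy calibration: a pure scaling of 0.05 metres per pixel (assumption).
H = [[0.05, 0.0, 0.0],
     [0.0, 0.05, 0.0],
     [0.0, 0.0, 1.0]]

# One tracked vehicle moving 20 px per frame for 26 frames at 25 FPS.
pixel_track = [(100.0, 20.0 * i) for i in range(26)]
metre_track = [to_road_plane(H, p) for p in pixel_track]
speed = estimate_speed_kmh(metre_track, fps=25)  # about 90 km/h
```

Measuring distance after the transformation is what makes the result independent of how far the vehicle is from the camera; in pixel space the same speed would look slower near the horizon.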
@skalskip92
SkalskiP
1 year
REAL-TIME object detection WITHOUT TRAINING. YOLO-World is a new SOTA open-vocabulary object detector that outperforms previous models in terms of both accuracy and speed. 35.4 AP with 52.0 FPS on V100. ↓ read more
33
377
3K
@skalskip92
SkalskiP
5 months
this might be the coolest-looking football AI visualization I ever created
@skalskip92
SkalskiP
7 months
football AI code is finally open-source. - player detection and tracking.- team clustering.- camera calibration. I still need to work on README; don't judge me on that. code:
48
204
2K
@skalskip92
SkalskiP
11 months
supervision, the open-source library I created a year ago, has crossed 10,000 stars on GitHub this weekend! . thank you to everyone who helped me build this project!. it took us 2,000+ commits, 500+ PRs and 50+ contributors to do it. repository:
21
298
2K
@skalskip92
SkalskiP
8 months
Apple released 4M-21 last week - an any-to-any vision-language model (it almost flew under my radar because of CVPR). Apache-2.0!!! - image captioning - depth estimation - object detection - instance segmentation - image generation - and much more, all in one model. ↓ read more
17
328
2K
@skalskip92
SkalskiP
9 months
almost fully functional version of my football AI project. today, I added player tracking using ByteTrack and projection of players onto the map. code coming soon:
50
206
2K
@skalskip92
SkalskiP
7 months
supervision-0.22.0 is coming out today. one of the things we release is Mediapipe integration along with default visualizers for face and body pose keypoints. link:
23
262
2K
@skalskip92
SkalskiP
2 years
supervision-0.13.0 is out! We added ByteTrack support! Now you can easily plug in any object detector and use it for tracking. GitHub repository:
44
304
2K
@skalskip92
SkalskiP
11 days
Alibaba dropped QWEN2.5VL yesterday; I spent all night working on a fine-tuning tutorial. this notebook covers: - dataset preparation - tokenization - training with LoRA/QLoRA (for max performance on low-power devices) - fine-tuned model evaluation. link:
24
259
2K
@skalskip92
SkalskiP
3 months
this guy took my Football AI project and used it to build a 3D representation of the game; CRAAAAAZY!
@skalskip92
SkalskiP
7 months
football AI code is finally open-source. - player detection and tracking.- team clustering.- camera calibration. I still need to work on README; don't judge me on that. code:
29
161
2K
@skalskip92
SkalskiP
7 months
lots of you asked if soccer AI can detect the ball; I just added this functionality. code:
@skalskip92
SkalskiP
7 months
football AI code is finally open-source. - player detection and tracking.- team clustering.- camera calibration. I still need to work on README; don't judge me on that. code:
47
164
2K
@skalskip92
SkalskiP
6 months
FLUX inpainting. I know I'm late to the FLUX party, but I hope it's still okay if I drop something cool.
44
194
2K
@skalskip92
SkalskiP
1 year
I'm starting to get more and more serious with YOLO-World; trying to solve real-life problems. I wanted to see if YOLO-World could recognize that the holes had been filled out. It was pretty tricky, but I learned a little about prompting. ↓ read more
16
164
1K
@skalskip92
SkalskiP
1 year
The traffic analysis project is growing! The YouTube tutorial will be out this week. Progress: I can now identify that the car is in a specified zone. Next: Match entrance and exit zones for every tracker ID to analyze the traffic flow. GitHub repo:
20
287
1K
@skalskip92
SkalskiP
1 year
Chat with the webcam using @OpenAI vision API
45
176
1K
@skalskip92
SkalskiP
7 months
using GPT-4o to clean up my dataset automatically
27
50
1K
@skalskip92
SkalskiP
9 months
I'm taking my football/soccer project to the next level. today, I worked on detecting players, referees, and the ball and mapping their positions from video frames to positions on the field. ↓ read more
58
115
1K
@skalskip92
SkalskiP
7 months
no more new VLMs? . I'm finally working on a YouTube tutorial for my football AI project; the tutorial should be out next week. stay tuned:
29
133
1K
@skalskip92
SkalskiP
6 months
over 200 hours of work compressed into a 90-minute video. the football AI tutorial is finally out!. link to video: ↓ key takeaways
32
159
1K
@skalskip92
SkalskiP
5 months
analyzing customer behavior in retail space. great project done by Abdelmouhaimen Sarhane - taking my supervision tutorials to the next level. supervision:
26
181
1K
@skalskip92
SkalskiP
14 days
.@Arsenal is looking for people to help them build football AI
@skalskip92
SkalskiP
7 months
football AI code is finally open-source. - player detection and tracking.- team clustering.- camera calibration. I still need to work on README; don't judge me on that. code:
30
141
1K
@skalskip92
SkalskiP
3 months
SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory. check out this SAM2 vs SAMURAI comparison! - paper: - code: - license: Apache-2.0
17
183
1K
@skalskip92
SkalskiP
9 months
I fine-tuned my first vision-language model. PaliGemma is an open-source VLM released by @GoogleAI last week. I fine-tuned it to detect bone fractures in X-ray images. thanks to @mervenoyann and @__kolesnikov__ for all the help!. ↓ read more
30
193
1K
@skalskip92
SkalskiP
3 months
tomorrow, I'll be hosting a talk for @MIT; I'll be speaking about open-source computer vision tools. 12:00 PM PST / 03:00 PM EST / 09:00 PM CET. we'll be streaming on X:
15
126
1K
@skalskip92
SkalskiP
1 year
ball and player 3d pose estimation - easily one of the coolest computer vision projects I have ever made. repository:
23
183
1K
@skalskip92
SkalskiP
11 months
detecting AI-generated text. researchers studied the impact of ChatGPT on AI conference peer reviews, confirming what we all knew. paper: ↓ read more
32
114
1K
@skalskip92
SkalskiP
1 year
Nov 6th, 2023: We love you guys! Nov 17th, 2023: Sam is fired!
40
157
1K
@skalskip92
SkalskiP
11 months
manual data labeling is (almost) dead. 1,500,000 images auto-annotated within 2 weeks of release. now, we also support automatic segmentation labeling. ↓ read more about open-source models that power this feature
51
136
1K
@skalskip92
SkalskiP
4 months
supervision-0.24.0 is out! you can finally count per-class line crossings. many of you have been asking for this, now we have it! . it took me barely 30 minutes to make this demo using supervision!. link:
11
141
1K
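Per-class line-crossing counts, as announced in the tweet above, come down to watching which side of a line each tracked object is on from frame to frame. The sketch below shows that idea in plain Python; it is not the supervision API, the names `side_of_line` and `count_crossings` are hypothetical, and it simplifies by treating the line as infinite rather than as a finite segment.

```python
from collections import Counter


def side_of_line(a, b, p):
    """Sign of the cross product (b - a) x (p - a).

    Positive on one side of the line through a and b, negative on the other.
    """
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def count_crossings(line_start, line_end, tracks):
    """Count crossings per class.

    tracks maps tracker_id -> (class_name, [anchor point per frame]).
    A move from the negative to the positive side counts as "in",
    the opposite direction counts as "out".
    """
    in_counts, out_counts = Counter(), Counter()
    for class_name, points in tracks.values():
        previous = None
        for point in points:
            side = side_of_line(line_start, line_end, point)
            if side == 0:
                continue  # exactly on the line: decide on a later frame
            if previous is not None and (side > 0) != (previous > 0):
                if side > 0:
                    in_counts[class_name] += 1
                else:
                    out_counts[class_name] += 1
            previous = side
    return in_counts, out_counts


# Toy tracks (made-up data): two cars cross one way, one truck the other.
tracks = {
    1: ("car", [(5, -1), (5, 1)]),
    2: ("truck", [(5, 1), (5, -1)]),
    3: ("car", [(3, -2), (3, -1), (3, 2)]),
}
in_counts, out_counts = count_crossings((0, 0), (10, 0), tracks)
```

Keying the state on the tracker ID is what prevents the same object from being counted again on every frame after it has crossed.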
@skalskip92
SkalskiP
1 year
YOLOv9 is out. looks like a new SOTA real-time object detector. I'm already working on a custom training tutorial
@_akhaliq
AK
1 year
YOLOv9. Learning What You Want to Learn Using Programmable Gradient Information. Today's deep learning methods focus on how to design the most appropriate objective functions so that the prediction results of the model can be closest to the ground truth. Meanwhile, an appropriate
24
159
1K
@skalskip92
SkalskiP
9 months
I need to take a break from football AI for a while. I plan to experiment with PaliGemma, Google's new open-source VLM, over the next few days. but don't worry, I'll be back. In the meantime, the football AI code is slowly making its way to this repo.
37
119
1K
@skalskip92
SkalskiP
1 year
train YOLOv9 on your dataset tutorial. - run inference with a pre-trained COCO model.- fine-tune model on custom dataset .- evaluate the trained model.- run inference with a fine-tuned model. blogpost: ↓ read more
14
155
1K
@skalskip92
SkalskiP
1 year
supervision-0.13.0 is out! Now you can effortlessly count crops in the fields with a single drone flyby. GitHub repository:
10
150
993
@skalskip92
SkalskiP
2 months
in January 2025, I'm launching a new series on my YouTube channel - VLMs zero-to-hero. honestly, I know shit about VLMs, and I want this series to change that. I've selected (for now) 19 papers that I plan to tell you about. link:
21
138
995
@skalskip92
SkalskiP
9 months
taking my football/soccer AI to the next level. - image embeddings.- dimension reduction.- player clustering.- awesome visualizations. code: (code migration in progress. ). ↓ read more
31
90
956
@skalskip92
SkalskiP
1 year
what stops you from using supervision today? . link:
23
105
919
@skalskip92
SkalskiP
1 year
looking for OpenAI-4V alternatives? - LLaVA - BakLLaVA - CogVLM - Fuyu-8B - Qwen-VL. I am working on a short blog post discussing some GPT-4V alternatives. It will probably come out today. links to all resources:
@skalskip92
SkalskiP
1 year
What OpenAI-4V alternatives would you recommend? - LLaVA - BakLLaVA
42
148
912
@skalskip92
SkalskiP
6 months
Florence-2 + SAM-2. SAM-2 doesn't understand language on its own, but Florence-2 does. I'm having a lot of fun with this combo! The first version of my @huggingface space is already online. link:
16
139
917
@skalskip92
SkalskiP
4 months
awesome job by Eric Fenaux building on my football AI project. using signal processing, he managed to drastically improve the quality of ball tracking (yellow dot). I learned so much from his notebook: ↓ read more
11
104
925
@skalskip92
SkalskiP
7 months
Florence-2 fine-tuning YouTube tutorial is finally out! (sorry it took me so long). - running the pre-trained model with different vision tasks.- configuring LoRA.- training and benchmarking.- Florence-2 vs. top vision model. link: ↓ key takeaways
16
127
911
@skalskip92
SkalskiP
4 months
would you watch a stream where I show how to build an analytics system like this with supervision? - detection filtering with polygon zones - object tracking - customizable annotators - line zones with in/out counters - per-class counts. supervision repo:
@skalskip92
SkalskiP
4 months
supervision-0.24.0 is out! you can finally count per-class line crossings. many of you have been asking for this, now we have it! . it took me barely 30 minutes to make this demo using supervision!. link:
47
72
913
@skalskip92
SkalskiP
1 year
Automated @NBA match commentary using @OpenAI vision and TTS (with code!). Everyone is bragging about projects that generate automatic video commentary, but no one is showing the code. I did it while waiting for the plane. code:
42
136
903
@skalskip92
SkalskiP
11 months
manual data labeling is almost dead . define prompts, tweak the confidence threshold, and make manual adjustments if necessary. this feature is now available to all users, even on free accounts. read more:
14
123
906
@skalskip92
SkalskiP
6 months
SAM2 can be used for ReID (reidentification) across multiple camera views. top video - reference video; bottom two videos - new previously unseen camera angles. I only annotated 3 frames from the reference video
22
109
895
@skalskip92
SkalskiP
8 months
I spent most of today preparing for CVPR 2024. "Matching Anything by Segmenting Anything" particularly caught my attention. Here are the fast open-vocabulary tracking examples (MASA + YOLO-World). link: ↓ read more
8
127
887
@skalskip92
SkalskiP
8 months
Florence-2 is finally out! 1 model; 10+ computer vision tasks!. ↓ key takeaways are listed below. see my blog post for details. link:
22
122
869
@skalskip92
SkalskiP
6 months
Florence2 + SAM2 + FLUX.1 - prototype. - Florence2 - open vocabulary detection.- SAM2 - box to mask.- FLUX.1 - inpainting. I have tried to build it after work for the past 3 days. getting closer
21
123
857
@skalskip92
SkalskiP
9 days
2+ years of making computer vision tutorials. YOLOv11, RT-DETR, SAM 2, Florence-2, PaliGemma 2, Qwen2.5-VL, and dozens of other models. "resource-intense place for people who want to get more in-depth into computer vision". link:
15
143
1K
@skalskip92
SkalskiP
1 year
how to calculate the TIME objects spend IN THE ZONE? - that's the topic of my next tutorial. here's a short (and a bit creepy) demo I built a few months ago. do you have ideas for a less creepy use case for this tech?. github repository:
52
125
842
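For the time-in-zone question raised above, one simple approach for a recorded video is to count the frames in which each tracked object's anchor point falls inside a polygon and divide by the frame rate. The sketch below assumes exactly that; the helper names `point_in_polygon` and `dwell_times` are hypothetical and this is not the supervision implementation. For a live stream, wall-clock timestamps would be a safer basis than frame counts, since frames can be dropped.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: is point inside the polygon (list of (x, y))?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal ray through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dwell_times(frames, polygon, fps):
    """Seconds each tracker ID spent inside the zone.

    frames is a list with one dict per video frame,
    mapping tracker_id -> (x, y) anchor point.
    """
    frames_in_zone = {}
    for detections in frames:
        for tracker_id, anchor in detections.items():
            if point_in_polygon(anchor, polygon):
                frames_in_zone[tracker_id] = frames_in_zone.get(tracker_id, 0) + 1
    return {tid: count / fps for tid, count in frames_in_zone.items()}


# Toy data (made up): a 10x10 zone and 30 frames at 10 FPS.
zone = [(0, 0), (10, 0), (10, 10), (0, 10)]
frames = [{1: (5, 5), 2: (5, 5) if i < 10 else (20, 20)} for i in range(30)]
times = dwell_times(frames, zone, fps=10)  # ID 1 stays 3 s, ID 2 stays 1 s
```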
@skalskip92
SkalskiP
8 months
supervision 0.21.0 is launching tomorrow. this update includes VertexLabelAnnotator, allowing you to annotate skeleton vertices with custom text and color. link:
15
123
837
@skalskip92
SkalskiP
11 months
taking traffic analysis to the next level with supervision-0.19.0. speed estimation + 3D road visualization. link: ↓ read more
12
107
846
@skalskip92
SkalskiP
22 days
latest transformers release added support for pose estimation with ViTPose and ViTPose++ (Apache 2.0 license). upcoming supervision release will add full support for ViTPose and ViTPose++, along with useful utilities for keypoint detection models.
19
108
850
@skalskip92
SkalskiP
6 months
ball tracking tutorial. after a short break due to the SAM2 release, I'm back working on football/soccer AI. I just published a very simple ball tracking tutorial; I'm curious what you think. link: ↓ key takeaways
21
81
843
@skalskip92
SkalskiP
5 months
my new open-source project. anyone who has played around with PaliGemma, Florence-2 or Qwen-VL knows that the barrier to entry is HIGH. for the past 2-3 weeks, I've been working on a new open-source project that aims to close that gap. link:
18
143
834
@skalskip92
SkalskiP
1 year
analyzing store traffic to find the most frequently visited areas. super demo created by @Hine__Po - member of Supervision community. link to repo if you want to build something over the weekend:
13
145
801
@skalskip92
SkalskiP
1 year
The YOLO-World YouTube tutorial is out! . please, let us know what you think!. - model architecture .- processing images and video in Colab .- prompt engineering and detection refinement .- pros and cons of the model. watch here: ↓ more resources
11
132
799
@skalskip92
SkalskiP
6 months
new tutorial: how to use SAM2 for video segmentation. - load SAM2 for video processing.- data preprocessing.- segment and track one object.- refine predictions.- propagate prompts across video.- segment and track multiple objects.- limitations. link: ↓
18
117
797
@skalskip92
SkalskiP
4 months
I managed to fine-tune @OpenAI GPT-4o for object-detection task!!!. here's a veeeery dirty colab:
21
68
801
@skalskip92
SkalskiP
11 months
now you can run real-time object detection on multiple streams with 10 lines of code. link: ↓ code snippet
13
137
780
@skalskip92
SkalskiP
5 months
#1 on GitHub for the first time!. so close to 20k stars! BRRRRRRRRRRRR!. link:
21
45
754
@skalskip92
SkalskiP
6 months
perspective transformation tutorial. I know many of you have been waiting for this tutorial for a long time, and it's finally here!. link: ↓ key takeaways
10
93
758
@skalskip92
SkalskiP
10 months
it took us a while, but the supervision-0.20.0 release will finally add support for key points. what are your thoughts on annotators? so far, we only have EdgeAnnotator and VertexAnnotator. supervision repo:
21
93
729
@skalskip92
SkalskiP
11 months
YOLOv9 tutorial: train model on custom dataset. - running inference with pre-trained COCO weights .- fine-tuning the model on a custom dataset .- model evaluation .- model deployment. sorry it took me so long; hope you like it.
15
97
743
@skalskip92
SkalskiP
2 months
PaliGemma2 for image to JSON data extraction. - used google/paligemma2-3b-pt-336 checkpoint; I tried to make it happen with 224, but 336 performed a lot better.- trained on A100 with 40GB VRAM.- trained with LoRA. colab with complete fine-tuning code:
20
93
749
@skalskip92
SkalskiP
1 year
supervision-0.15.0 is out! This time, we bring highly customizable annotators. We added eight annotators - box, mask, ellipse, label, circle, corner, trace, and blur. But the best part is. you can freely mix them!. GitHub repository:
9
122
720
@skalskip92
SkalskiP
6 months
some ball trajectory analysis for my upcoming YouTube football AI tutorial. stay tuned; I'm dropping the tutorial this week:
11
67
737
@skalskip92
SkalskiP
9 months
YOLO is the craziest model family. Each version is created by a different organization. "Compared with YOLOv9-C, YOLOv10-B has 46% less latency and 25% fewer parameters for the same performance.". I'll try to test it today. ↓ links
11
81
728
@skalskip92
SkalskiP
1 year
improving object counting logic. today I solved an interesting bug that has existed in my library for a loooooong time. repository: ↓ WARNING: lots of math in the thread below
7
78
731
@skalskip92
SkalskiP
1 year
Easily one of the most exciting projects built with Supervision!. Our community member Vriza Wahyu Saputra built this fantastic ball juggling counting demo using the moving LineZone available in our API.
12
90
698
@skalskip92
SkalskiP
1 year
Am I the last person who didn't know about OpenAI Cookbook?. link:
23
85
695
@skalskip92
SkalskiP
1 year
parking occupancy analysis. calculation of percentage occupancy in individual parking zones. all this was done with supervision: btw, @UenoLeo is cooking a blog post covering this project, so stay tuned!. ↓ read more
13
94
692
@skalskip92
SkalskiP
10 months
support for pose estimation and key point detection is coming soon to supervision. you can expect connectors for the most popular models and the first annotators in the next supervision release. can't wait to build demos like this with supervision
13
80
693
@skalskip92
SkalskiP
11 months
I love watching other people build cool demos with the supervision library; traffic analysis examples built by Anant Jaiswal. - object tracking.- zone counting.- heat-map analysis. link:
4
90
692
@skalskip92
SkalskiP
11 months
smart self-service checkout powered by YOLOv9. the value of the basket is updated live based on its changing content; what else should I add? demo built with supervision:
14
84
694
@skalskip92
SkalskiP
1 year
What papers should I read to expand my knowledge of Transformers?. Please send links in the comments and write why this paper is worth reading. Thanks for your help!
32
99
670
@skalskip92
SkalskiP
3 months
can't wait to spend some of this money on open-source! .
43
31
685
@skalskip92
SkalskiP
1 year
speed estimation tutorial is finally out! - object detection - multi-object tracking - filtering detections with polygon zone - perspective transformation and speed estimation. link: below are some interesting visualizations I created for this video. ↓
13
111
666
@skalskip92
SkalskiP
10 months
new YouTube tutorial: compute dwell time using computer vision in live streams. (seems easy, yet tricky). - static file vs stream processing.- preventing growing latency and frame buffer overflow.- efficient stream processing. full tutorial: ↓ read more
6
73
662
@skalskip92
SkalskiP
1 year
Qwen-VL-Plus is SCARY good! (better than GPT-4V). here it is casually solving reCAPTCHA! - You don't have to give any additional instructions other than 'Solve it.' - It can even mark the exact position of the objects it is looking for. ↓ it can do so much more
22
100
671
@skalskip92
SkalskiP
1 year
Sports Analytics with GPT-4 Vision. I wondered whether GPT-4V had the capability to automatically separate players into teams based on the color of their uniforms. It took me a ridiculously long time to create this image, but in the meantime, I learned a lot about GPT-4V.
@skalskip92
SkalskiP
2 years
supervision-0.13.0 is out! We added ByteTrack support! Now you can easily plug in any object detector and use it for tracking. GitHub repository:
19
86
656
@skalskip92
SkalskiP
2 months
I missed VLMs. I worked on other stuff for the past few months, but I'm back! I'm fine-tuning Florence2 to extract data from documents in JSON format.
20
43
665
@skalskip92
SkalskiP
1 year
- Object detection over HTTP? - Easy! We just open-sourced our inference server under Apache 2.0. Left terminal: @roboflow inference. Right terminal: video client
6
76
652
@skalskip92
SkalskiP
1 year
The traffic analysis project is done! The YouTube tutorial will be out tomorrow. Stay tuned!. Wait till flow counters appear around 0:06. Github repo:
17
102
642
@skalskip92
SkalskiP
1 year
SAM + MetaCLIP + ProPainter. produce masks: remove object: I'm working on a combo space!
7
99
599
@skalskip92
SkalskiP
9 months
I'm experimenting with PaliGemma tonight. a single open-source model allowing you to:.- detect car (detection).- answer questions about its color and brand (VQA).- read license plate number (OCR). all that on a single consumer-grade GPU. is there any other model that can do it?
25
76
620
@skalskip92
SkalskiP
1 year
it blows my mind to see things that are created using my code
13
26
610
@skalskip92
SkalskiP
1 year
It took me ONE HOUR to craft this demo using supervision-0.18.0. - Three new annotators: PercentageBar, RoundedBox, and OrientedBox.- Enhanced LineZone feature for improved counting.- OBB (oriented bounding boxes) integration. ↓ read more . repo:
13
92
600
@skalskip92
SkalskiP
1 year
YOLO-World + EfficientSAM + StableDiffusion for language-guided inpainting. I was inspired yesterday by the work of @MrDravcan (see attached), and I decided to try to replicate it. SPOILER ALERT: it didn't quite work out for me. ↓ read more
16
95
596
@skalskip92
SkalskiP
3 months
working on a new demo - automated parking lot management. - keep track of how many cars go in and out - done.- read plates - done.- calculate the time spent in the parking lot - in progress. what do you think?
40
30
608
@skalskip92
SkalskiP
6 months
segment anything 2 (SAM2) is out; I have been waiting for this for a long time!. I spent most of my morning playing with the model. here's the initial version of my tutorial notebook. I'll be updating it to include all the cool stuff.
@AIatMeta
AI at Meta
6 months
Introducing Meta Segment Anything Model 2 (SAM 2) — the first unified model for real-time, promptable object segmentation in images & videos. SAM 2 is available today under Apache 2.0 so that anyone can use it to build their own experiences. Details ➡️
12
61
600
@skalskip92
SkalskiP
11 months
time-in-zone (dwell time) tutorial is coming. this is the third time I'm trying to make this video; hopefully, the last one. I finally have a good use case - waiting time for service. here is the first iteration. what do you think?. link:
11
57
589
@skalskip92
SkalskiP
11 months
detecting small objects is hard. I spent some time today writing a short how-to guide on using supervision (in combination with the most popular CV libraries) to detect small objects. btw is that a good idea for a video tutorial?. link: ↓ read more
18
58
587
@skalskip92
SkalskiP
7 months
player clustering component of my Football AI project is pushed to GitHub. - feature extraction with SigLIP.- dimensionality reduction with UMAP.- clustering with KMeans . code:
@skalskip92
SkalskiP
7 months
no more new VLMs? . I'm finally working on a YouTube tutorial for my football AI project; the tutorial should be out next week. stay tuned:
12
59
595
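The clustering step named in the player clustering tweet above (KMeans on reduced features) can be illustrated with a tiny two-cluster k-means in plain Python. This is only a sketch under loud assumptions: the 2-D vectors below are made-up stand-ins for what SigLIP embeddings reduced with UMAP might look like, neither SigLIP nor UMAP is actually run, the function name `kmeans_two_teams` is hypothetical, and the deterministic initialisation is a simplification rather than what a library KMeans does.

```python
def kmeans_two_teams(features, iterations=10):
    """Split feature vectors into two clusters with a minimal k-means.

    Initialisation is deterministic (an assumption for this sketch):
    the first vector and the vector farthest from it seed the clusters.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c0 = features[0]
    c1 = max(features, key=lambda f: dist2(f, c0))
    centroids = [list(c0), list(c1)]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: nearest centroid wins.
        labels = [
            0 if dist2(f, centroids[0]) <= dist2(f, centroids[1]) else 1
            for f in features
        ]
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [f for f, label in zip(features, labels) if label == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up 2-D features for four player crops, two per team.
player_features = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
team_labels = kmeans_two_teams(player_features)  # [0, 1, 0, 1]
```

The cluster indices carry no meaning on their own; which index corresponds to which team has to be decided afterwards, for example by inspecting a few crops from each cluster.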