SkalskiP @skalskip92 profile

SkalskiP

@skalskip92

Followers

30K

Following

15K

Media

2K

Statuses

8K

Open-source Lead @roboflow. VLMs. GPU poor. Dog person. Coffee addict. Dyslexic. | GH: https://t.co/dEmzMDGq5H | HF: https://t.co/4Lx1Yw34W7

Kraków, Polska

Joined February 2014

SkalskiP

@skalskip92

7 months

football AI code is finally open-source.
- player detection and tracking
- team clustering
- camera calibration
I still need to work on the README; don't judge me on that. code:

147

984

8K

SkalskiP

@skalskip92

8 months

Polish TV is using computer vision to enhance the viewer experience for sports broadcasts:
- FIFA-like radar overlays
- player recognition
- pass distance measurement
- ball speed and trajectory tracking during shots

195

1K

14K

SkalskiP

@skalskip92

5 months

supervision, the open-source library I created a year ago, has crossed 20,000 stars on GitHub this weekend! thank you to everyone who helped me build this project! it took us 3,500+ commits, 850+ PRs and 80+ contributors to do it. repository:

63

738

7K

SkalskiP

@skalskip92

8 months

supervision - a computer vision library I created - just crossed 15,000 stars on GitHub! BBBRRRRRR! link:

48

596

6K

SkalskiP

@skalskip92

4 months

I used to hate working on projects like this. crazy how the world has changed over the past year; labeling is dead!

48

362

5K

SkalskiP

@skalskip92

1 year

RIP image annotation companies. Fully automated image labeling with GroundingDINO + SAM + OpenAI Vision API. code:

64

390

3K

SkalskiP

@skalskip92

1 year

supervision-0.13.0 is out! Now you can effortlessly build advanced video analytics. Trackers, Zones, Annotators, and much more. GitHub repository:

35

532

3K

SkalskiP

@skalskip92

1 year

here is the final version of my vehicle speed estimation demo. read the thread below to learn how I built it. I will cover:
- detection
- tracking
- perspective transformation
- speed calculation
- some bonus ideas
↓

70

279

3K

SkalskiP

@skalskip92

1 year

REAL-TIME object detection WITHOUT TRAINING. YOLO-World is a new SOTA open-vocabulary object detector that outperforms previous models in terms of both accuracy and speed. 35.4 AP with 52.0 FPS on V100. ↓ read more

33

377

3K

SkalskiP

@skalskip92

5 months

this might be the coolest-looking football AI visualization I ever created

48

204

2K

SkalskiP

@skalskip92

11 months

supervision, the open-source library I created a year ago, has crossed 10,000 stars on GitHub this weekend! thank you to everyone who helped me build this project! it took us 2,000+ commits, 500+ PRs and 50+ contributors to do it. repository:

21

298

2K

SkalskiP

@skalskip92

8 months

Apple released 4M-21 last week, an any-to-any vision-language model (it almost flew under my radar because of CVPR). Apache-2.0!!!
- image captioning
- depth estimation
- object detection
- instance segmentation
- image generation
- and much more, all in one model
↓ read more

17

328

2K

SkalskiP

@skalskip92

9 months

almost fully functional version of my football AI project. today, I added player tracking using ByteTrack and projection of players onto the map. code coming soon:

50

206

2K

SkalskiP

@skalskip92

7 months

supervision-0.22.0 is coming out today. one of the things we release is Mediapipe integration along with default visualizers for face and body pose keypoints. link:

23

262

2K

SkalskiP

@skalskip92

2 years

supervision-0.13.0 is out! We added ByteTrack support! Now you can easily plug in any object detector and use it for tracking. GitHub repository:

44

304

2K

SkalskiP

@skalskip92

11 days

Alibaba dropped QWEN2.5VL yesterday; I spent all night working on a fine-tuning tutorial. this notebook covers:
- dataset preparation
- tokenization
- training with LoRA/QLoRA (for max performance on low-power devices)
- fine-tuned model evaluation
link:

24

259

2K

SkalskiP

@skalskip92

3 months

this guy took my Football AI project and used it to build a 3D representation of the game; CRAAAAAZY!

29

161

2K

SkalskiP

@skalskip92

7 months

lots of you asked if soccer AI can detect the ball; I just added this functionality. code:

47

164

2K

SkalskiP

@skalskip92

6 months

FLUX inpainting. I know I'm late to the FLUX party, but I hope it's still okay if I drop something cool.

44

194

2K

SkalskiP

@skalskip92

1 year

I'm starting to get more and more serious with YOLO-World; trying to solve real-life problems. I wanted to see if YOLO-World could recognize that the holes had been filled out. It was pretty tricky, but I learned a little about prompting. ↓ read more

16

164

1K

SkalskiP

@skalskip92

1 year

The traffic analysis project is growing! The YouTube tutorial will be out this week. Progress: I can now identify that the car is in a specified zone. Next: Match entrance and exit zones for every tracker ID to analyze the traffic flow. GitHub repo:

20

287

1K

SkalskiP

@skalskip92

1 year

Chat with the webcam using @OpenAI vision API

45

176

1K

SkalskiP

@skalskip92

7 months

using GPT-4o to clean up my dataset automatically

27

50

1K

SkalskiP

@skalskip92

9 months

I'm taking my football/soccer project to the next level. today, I worked on detecting players, referees, and the ball and mapping their positions from video frames to positions on the field. ↓ read more

58

115

1K

SkalskiP

@skalskip92

7 months

no more new VLMs? I'm finally working on a YouTube tutorial for my football AI project; the tutorial should be out next week. stay tuned:

29

133

1K

SkalskiP

@skalskip92

6 months

over 200 hours of work compressed into a 90-minute video. the football AI tutorial is finally out! link to video: ↓ key takeaways

32

159

1K

SkalskiP

@skalskip92

5 months

analyzing customer behavior in retail space. great project done by Abdelmouhaimen Sarhane - taking my supervision tutorials to the next level. supervision:

26

181

1K

SkalskiP

@skalskip92

14 days

.@Arsenal is looking for people to help them build football AI

30

141

1K

SkalskiP

@skalskip92

3 months

SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory. check out this SAM2 vs SAMURAI comparison!
- paper:
- code:
- license: Apache-2.0

17

183

1K

SkalskiP

@skalskip92

9 months

I fine-tuned my first vision-language model. PaliGemma is an open-source VLM released by @GoogleAI last week. I fine-tuned it to detect bone fractures in X-ray images. thanks to @mervenoyann and @__kolesnikov__ for all the help! ↓ read more

30

193

1K

SkalskiP

@skalskip92

3 months

tomorrow, I'll be hosting a talk for @MIT; I'll be speaking about open-source computer vision tools. 12:00 PM PST / 03:00 PM EST / 09:00 PM CET. we'll be streaming on X:

15

126

1K

SkalskiP

@skalskip92

1 year

ball and player 3d pose estimation - easily one of the coolest computer vision projects I have ever made. repository:

23

183

1K

SkalskiP

@skalskip92

11 months

detecting AI-generated text. researchers studied the impact of ChatGPT on AI conference peer reviews, confirming what we all knew. paper: ↓ read more

32

114

1K

SkalskiP

@skalskip92

1 year

Nov 6th, 2023: We love you guys!
Nov 17th, 2023: Sam is fired!

40

157

1K

SkalskiP

@skalskip92

11 months

manual data labeling is (almost) dead. 1,500,000 images auto-annotated within 2 weeks of release. now, we also support automatic segmentation labeling. ↓ read more about open-source models that power this feature

51

136

1K

SkalskiP

@skalskip92

4 months

supervision-0.24.0 is out! you can finally count per-class line crossings. many of you have been asking for this, now we have it! it took me barely 30 minutes to make this demo using supervision! link:

11

141

1K

SkalskiP

@skalskip92

1 year

YOLOv9 is out. looks like a new SOTA real-time object detector. I'm already working on a custom training tutorial

AK

@_akhaliq

1 year

YOLOv9. Learning What You Want to Learn Using Programmable Gradient Information. Today's deep learning methods focus on how to design the most appropriate objective functions so that the prediction results of the model can be closest to the ground truth. Meanwhile, an appropriate

24

159

1K

SkalskiP

@skalskip92

9 months

I need to take a break from football AI for a while. I plan to experiment with PaliGemma, Google's new open-source VLM, over the next few days. but don't worry, I'll be back. In the meantime, the football AI code is slowly making its way to this repo.

37

119

1K

SkalskiP

@skalskip92

1 year

train YOLOv9 on your dataset tutorial.
- run inference with a pre-trained COCO model
- fine-tune model on custom dataset
- evaluate the trained model
- run inference with a fine-tuned model
blogpost: ↓ read more

14

155

1K

SkalskiP

@skalskip92

1 year

supervision-0.13.0 is out! Now you can effortlessly count crops in the fields with a single drone flyby. GitHub repository:

10

150

993

SkalskiP

@skalskip92

2 months

in January 2025, I'm launching a new series on my YouTube channel - VLMs zero-to-hero. honestly, I know shit about VLMs, and I want this series to change that. I've selected (for now) 19 papers that I plan to tell you about. link:

21

138

995

SkalskiP

@skalskip92

9 months

taking my football/soccer AI to the next level.
- image embeddings
- dimension reduction
- player clustering
- awesome visualizations
code: (code migration in progress)
↓ read more

31

90

956

SkalskiP

@skalskip92

1 year

what stops you from using supervision today? link:

23

105

919

SkalskiP

@skalskip92

1 year

looking for OpenAI-4V alternatives?
- LLaVA
- BakLLaVA
- CogVLM
- Fuyu-8B
- Qwen-VL
I am working on a short blog post discussing some GPT-4V alternatives. It will probably come out today. links to all resources:

SkalskiP

@skalskip92

1 year

What OpenAI-4V alternatives would you recommend?
- LLaVA
- BakLLaVA

42

148

912

SkalskiP

@skalskip92

6 months

Florence-2 + SAM-2. SAM-2 doesn't understand language on its own, but Florence-2 does. I'm having a lot of fun with this combo! The first version of my @huggingface space is already online. link:

16

139

917

SkalskiP

@skalskip92

4 months

awesome job by Eric Fenaux building on my football AI project. using signal processing, he managed to drastically improve the quality of ball tracking (yellow dot). I learned so much from his notebook: ↓ read more

11

104

925

SkalskiP

@skalskip92

7 months

Florence-2 fine-tuning YouTube tutorial is finally out! (sorry it took me so long)
- running the pre-trained model with different vision tasks
- configuring LoRA
- training and benchmarking
- Florence-2 vs. top vision model
link: ↓ key takeaways

16

127

911

SkalskiP

@skalskip92

4 months

would you watch a stream where I show how to build an analytics system like this with supervision?
- detection filtering with polygon zones
- object tracking
- customizable annotators
- line zones with in/out counters
- per-class counts
supervision repo:

47

72

913
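The in/out line counters and per-class counts mentioned in the post above can be illustrated with a small sketch. This is not the supervision implementation: it is a plain-Python approximation that assumes each tracked object is reduced to one anchor point per frame, treats the line as infinite rather than as a segment, and uses a hypothetical `LineCounter` class invented for this example.

```python
from collections import Counter


def side_of_line(start, end, point):
    """Sign of the 2D cross product: positive on one side of the line, negative on the other."""
    (x1, y1), (x2, y2), (px, py) = start, end, point
    return (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)


class LineCounter:
    """Hypothetical helper: counts crossings of an infinite line, split by class."""

    def __init__(self, start, end):
        self.start, self.end = start, end
        self.last_side = {}          # tracker_id -> last non-zero side value
        self.in_counts = Counter()   # class_name -> crossings towards the positive side
        self.out_counts = Counter()  # class_name -> crossings towards the negative side

    def update(self, tracker_id, class_name, anchor):
        side = side_of_line(self.start, self.end, anchor)
        if side == 0:
            return  # exactly on the line: wait for the next frame
        previous = self.last_side.get(tracker_id)
        self.last_side[tracker_id] = side
        if previous is None or (previous > 0) == (side > 0):
            return  # first sighting, or still on the same side
        if side > 0:
            self.in_counts[class_name] += 1
        else:
            self.out_counts[class_name] += 1


# Made-up tracks: (tracker_id, class_name, anchor point) for two consecutive frames.
counter = LineCounter(start=(0, 100), end=(200, 100))
frames = [
    [(1, "car", (50, 50)), (2, "truck", (80, 150)), (3, "car", (120, 40))],
    [(1, "car", (50, 150)), (2, "truck", (80, 50)), (3, "car", (120, 160))],
]
for frame in frames:
    for tracker_id, class_name, anchor in frame:
        counter.update(tracker_id, class_name, anchor)
```

Keying the state on the tracker ID is what makes the count robust: an object is counted only when the same ID is seen on both sides of the line, so a new detection appearing next to the line does not register as a crossing.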

SkalskiP

@skalskip92

1 year

Automated @NBA match commentary using @OpenAI vision and TTS (with code!). Everyone is bragging about projects that generate automatic video commentary, but no one is showing the code. I did it while waiting for the plane. code:

42

136

903

SkalskiP

@skalskip92

11 months

manual data labeling is almost dead. define prompts, tweak the confidence threshold, and make manual adjustments if necessary. this feature is now available to all users, even on free accounts. read more:

14

123

906

SkalskiP

@skalskip92

6 months

SAM2 can be used for ReID (reidentification) across multiple camera views. top video - reference video; bottom two videos - new previously unseen camera angles. I only annotated 3 frames from the reference video

22

109

895

SkalskiP

@skalskip92

8 months

I spent most of today preparing for CVPR 2024. "Matching Anything by Segmenting Anything" particularly caught my attention. Here are the fast open-vocabulary tracking examples (MASA + YOLO-World). link: ↓ read more

8

127

887

SkalskiP

@skalskip92

8 months

Florence-2 is finally out! 1 model; 10+ computer vision tasks! ↓ key takeaways are listed below. see my blog post for details. link:

22

122

869

SkalskiP

@skalskip92

6 months

Florence2 + SAM2 + FLUX.1 - prototype.
- Florence2 - open vocabulary detection
- SAM2 - box to mask
- FLUX.1 - inpainting
I have tried to build it after work for the past 3 days. getting closer

21

123

857

SkalskiP

@skalskip92

9 days

2+ years of making computer vision tutorials. YOLOv11, RT-DETR, SAM 2, Florence-2, PaliGemma 2, Qwen2.5-VL, and dozens of other models. "resource-intense place for people who want to get more in-depth into computer vision". link:

15

143

1K

SkalskiP

@skalskip92

1 year

how to calculate the TIME objects spend IN THE ZONE? - that's the topic of my next tutorial. here's a short (and a bit creepy) demo I built a few months ago. do you have ideas for a less creepy use case for this tech? github repository:

52

125

842
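One way to answer the question in the post above, sketched in plain Python: test each tracked object's anchor point against the zone polygon on every frame, count the frames it spends inside, and divide by the frame rate. The function names and sample data are hypothetical, and dividing by FPS assumes a video file with a constant frame rate; a live stream would need wall-clock timestamps instead.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside the polygon (a list of (x, y) vertices)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal ray at height y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dwell_times(frames, polygon, fps):
    """frames: one dict per video frame, mapping tracker_id -> (x, y) anchor point.

    Returns the cumulative seconds each tracker ID spent inside the polygon.
    """
    frames_in_zone = {}
    for detections in frames:
        for tracker_id, anchor in detections.items():
            if point_in_polygon(anchor, polygon):
                frames_in_zone[tracker_id] = frames_in_zone.get(tracker_id, 0) + 1
    return {tracker_id: count / fps for tracker_id, count in frames_in_zone.items()}


# Made-up example: a square zone, 30 frames at 10 FPS.
zone = [(0, 0), (100, 0), (100, 100), (0, 100)]
frames = [
    {1: (50, 50), 2: (150, 50) if i < 20 else (90, 50)}
    for i in range(30)
]
result = dwell_times(frames, zone, fps=10)
```

In this example tracker 1 stays in the zone for all 30 frames and tracker 2 enters only for the last 10, so at 10 FPS the dwell times come out as 3 seconds and 1 second.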

SkalskiP

@skalskip92

8 months

supervision 0.21.0 is launching tomorrow. this update includes VertexLabelAnnotator, allowing you to annotate skeleton vertices with custom text and color. link:

15

123

837

SkalskiP

@skalskip92

11 months

taking traffic analysis to the next level with supervision-0.19.0. speed estimation + 3D road visualization. link: ↓ read more

12

107

846

SkalskiP

@skalskip92

22 days

latest transformers release added support for pose estimation with ViTPose and ViTPose++ (Apache 2.0 license). upcoming supervision release will add full support for ViTPose and ViTPose++, along with useful utilities for keypoint detection models.

19

108

850

SkalskiP

@skalskip92

6 months

ball tracking tutorial. after a short break due to the SAM2 release, I'm back working on football/soccer AI. I just published a very simple ball tracking tutorial; I'm curious what you think. link: ↓ key takeaways

21

81

843

SkalskiP

@skalskip92

5 months

my new open-source project. anyone who has played around with PaliGemma, Florence-2 or Qwen-VL knows that the barrier to entry is HIGH. for the past 2-3 weeks, I've been working on a new open-source project that aims to close that gap. link:

18

143

834

SkalskiP

@skalskip92

1 year

analyzing store traffic to find the most frequently visited areas. super demo created by @Hine__Po - member of Supervision community. link to repo if you want to build something over the weekend:

13

145

801

SkalskiP

@skalskip92

1 year

The YOLO-World YouTube tutorial is out! please, let us know what you think!
- model architecture
- processing images and video in Colab
- prompt engineering and detection refinement
- pros and cons of the model
watch here: ↓ more resources

11

132

799

SkalskiP

@skalskip92

6 months

new tutorial: how to use SAM2 for video segmentation.
- load SAM2 for video processing
- data preprocessing
- segment and track one object
- refine predictions
- propagate prompts across video
- segment and track multiple objects
- limitations
link: ↓

18

117

797

SkalskiP

@skalskip92

4 months

I managed to fine-tune @OpenAI GPT-4o for object-detection task!!! here's a veeeery dirty colab:

21

68

801

SkalskiP

@skalskip92

11 months

now you can run real-time object detection on multiple streams with 10 lines of code. link: ↓ code snippet

13

137

780

SkalskiP

@skalskip92

5 months

#1 on GitHub for the first time! so close to 20k stars! BRRRRRRRRRRRR! link:

21

45

754

SkalskiP

@skalskip92

6 months

perspective transformation tutorial. I know many of you have been waiting for this tutorial for a long time, and it's finally here! link: ↓ key takeaways

10

93

758

SkalskiP

@skalskip92

10 months

it took us a while, but the supervision-0.20.0 release will finally add support for key points. what are your thoughts on annotators? so far, we only have EdgeAnnotator and VertexAnnotator. supervision repo:

21

93

729

SkalskiP

@skalskip92

11 months

YOLOv9 tutorial: train model on custom dataset.
- running inference with pre-trained COCO weights
- fine-tuning the model on a custom dataset
- model evaluation
- model deployment
sorry it took me so long; hope you like it.

15

97

743

SkalskiP

@skalskip92

2 months

PaliGemma2 for image to JSON data extraction.
- used google/paligemma2-3b-pt-336 checkpoint; I tried to make it happen with 224, but 336 performed a lot better
- trained on A100 with 40GB VRAM
- trained with LoRA
colab with complete fine-tuning code:

20

93

749

SkalskiP

@skalskip92

1 year

supervision-0.15.0 is out! This time, we bring highly customizable annotators. We added eight annotators - box, mask, ellipse, label, circle, corner, trace, and blur. But the best part is you can freely mix them! GitHub repository:

9

122

720

SkalskiP

@skalskip92

6 months

some ball trajectory analysis for my upcoming YouTube football AI tutorial. stay tuned; I'm dropping the tutorial this week:

11

67

737

SkalskiP

@skalskip92

9 months

YOLO is the craziest model family. Each version is created by a different organization. "Compared with YOLOv9-C, YOLOv10-B has 46% less latency and 25% fewer parameters for the same performance." I'll try to test it today. ↓ links

11

81

728

SkalskiP

@skalskip92

1 year

improving object counting logic. today I solved an interesting bug that has existed in my library for a loooooong time. repository: ↓ WARNING: lots of math in the thread below

7

78

731

SkalskiP

@skalskip92

1 year

Easily one of the most exciting projects built with Supervision! Our community member Vriza Wahyu Saputra built this fantastic ball juggling counting demo using the moving LineZone available in our API.

12

90

698

SkalskiP

@skalskip92

1 year

Am I the last person who didn't know about OpenAI Cookbook? link:

23

85

695

SkalskiP

@skalskip92

1 year

parking occupancy analysis. calculation of percentage occupancy in individual parking zones. all this was done with supervision:
btw, @UenoLeo is cooking a blog post covering this project, so stay tuned! ↓ read more

13

94

692

SkalskiP

@skalskip92

10 months

support for pose estimation and key point detection is coming soon to supervision. you can expect connectors for the most popular models and the first annotators in the next supervision release. can't wait to build demos like this with supervision

13

80

693

SkalskiP

@skalskip92

11 months

I love watching other people build cool demos with the supervision library; traffic analysis examples built by Anant Jaiswal.
- object tracking
- zone counting
- heat-map analysis
link:

4

90

692

SkalskiP

@skalskip92

11 months

smart self-service checkout powered by YOLOv9. the value of the basket is updated live based on its changing content; what else should I add? demo built with supervision:

14

84

694

SkalskiP

@skalskip92

1 year

What papers should I read to expand my knowledge of Transformers? Please send links in the comments and write why this paper is worth reading. Thanks for your help!

32

99

670

SkalskiP

@skalskip92

3 months

can't wait to spend some of this money on open-source!

43

31

685

SkalskiP

@skalskip92

1 year

speed estimation tutorial is finally out!
- object detection
- multi-object tracking
- filtering detections with polygon zone
- perspective transformation and speed estimation
link:
below are some interesting visualizations I created for this video. ↓

13

111

666
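The last step in the post above, perspective transformation followed by speed estimation, can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the tutorial's code: the 3x3 homography `H` is a made-up pure-scale matrix (a real one would be computed from four image-to-road point correspondences), and `estimate_speed_kmh` is a hypothetical helper that uses only the first and last position of one track.

```python
from math import hypot


def apply_homography(H, point):
    """Map an image-space point (x, y) to road-plane coordinates with a 3x3 matrix H."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def estimate_speed_kmh(track, H, fps):
    """Estimate speed from per-frame image positions of a single tracker ID.

    track: list of (x, y) pixel positions, one per consecutive frame.
    """
    if len(track) < 2:
        return 0.0
    start = apply_homography(H, track[0])
    end = apply_homography(H, track[-1])
    distance_m = hypot(end[0] - start[0], end[1] - start[1])
    elapsed_s = (len(track) - 1) / fps
    return distance_m / elapsed_s * 3.6  # m/s -> km/h


# Assumed homography: 0.05 m per pixel in both axes, no perspective distortion.
H = [[0.05, 0.0, 0.0],
     [0.0, 0.05, 0.0],
     [0.0, 0.0, 1.0]]

# A vehicle moving 10 px per frame towards the top of the image for 25 frame intervals.
track = [(100.0, 400.0 - 10.0 * i) for i in range(26)]
speed = estimate_speed_kmh(track, H, fps=25.0)
```

With these numbers the vehicle covers 250 px, or 12.5 m on the road plane, in one second, which works out to about 45 km/h. Measuring distance after the transform matters because equal pixel distances near and far from the camera correspond to very different distances on the road.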

SkalskiP

@skalskip92

10 months

new YouTube tutorial: compute dwell time using computer vision in live streams. (seems easy, yet tricky)
- static file vs stream processing
- preventing growing latency and frame buffer overflow
- efficient stream processing
full tutorial: ↓ read more

6

73

662

SkalskiP

@skalskip92

1 year

Qwen-VL-Plus is SCARY good! (better than GPT-4V). here it is casually solving Recaptcha!
- You don't have to give any additional instructions other than 'Solve it.'
- It can even mark the exact position of the objects it is looking for.
↓ it can do so much more

22

100

671

SkalskiP

@skalskip92

1 year

Sports Analytics with GPT-4 Vision. I wondered whether GPT-4V had the capability to automatically separate players into teams based on the color of their uniforms. It took me a ridiculously long time to create this image, but in the meantime, I learned a lot about GPT-4V.

19

86

656

SkalskiP

@skalskip92

2 months

I missed VLMs. I worked on other stuff for the past few months, but I'm back! I'm fine-tuning Florence2 to extract data from documents in JSON format.

20

43

665

SkalskiP

@skalskip92

1 year

- Object detection over HTTP?
- Easy!
We just open-sourced our inference server under Apache 2.0. Left terminal: @roboflow inference. Right terminal: video client

6

76

652

SkalskiP

@skalskip92

1 year

The traffic analysis project is done! The YouTube tutorial will be out tomorrow. Stay tuned! Wait till flow counters appear around 0:06. Github repo:

17

102

642

SkalskiP

@skalskip92

1 year

SAM + MetaCLIP + ProPainter. produce masks: remove object: I'm working on a combo space!

7

99

599

SkalskiP

@skalskip92

9 months

I'm experimenting with PaliGemma tonight. a single open-source model allowing you to:
- detect car (detection)
- answer questions about its color and brand (VQA)
- read license plate number (OCR)
all that on a single consumer-grade GPU. is there any other model that can do it?

25

76

620

SkalskiP

@skalskip92

1 year

it blows my mind to see things that are created using my code

13

26

610

SkalskiP

@skalskip92

1 year

It took me ONE HOUR to craft this demo using supervision-0.18.0.
- Three new annotators: PercentageBar, RoundedBox, and OrientedBox
- Enhanced LineZone feature for improved counting
- OBB (oriented bounding boxes) integration
↓ read more
repo:

13

92

600

SkalskiP

@skalskip92

1 year

YOLO-World + EfficientSAM + StableDiffusion for language-guided inpainting. I was inspired yesterday by the work of @MrDravcan (see attached), and I decided to try to replicate it. SPOILER ALERT: it didn't quite work out for me. ↓ read more

16

95

596

SkalskiP

@skalskip92

3 months

working on a new demo - automated parking lot management.
- keep track of how many cars go in and out - done
- read plates - done
- calculate the time spent in the parking lot - in progress
what do you think?

40

30

608

SkalskiP

@skalskip92

6 months

segment anything 2 (SAM2) is out; I have been waiting for this for a long time! I spent most of my morning playing with the model. here's the initial version of my tutorial notebook. I'll be updating it to include all the cool stuff.

AI at Meta

@AIatMeta

6 months

Introducing Meta Segment Anything Model 2 (SAM 2) — the first unified model for real-time, promptable object segmentation in images & videos. SAM 2 is available today under Apache 2.0 so that anyone can use it to build their own experiences. Details ➡️

12

61

600

SkalskiP

@skalskip92

11 months

time-in-zone (dwell time) tutorial is coming. this is the third time I'm trying to make this video; hopefully, the last one. I finally have a good use case - waiting time for service. here is the first iteration. what do you think? link:

11

57

589

SkalskiP

@skalskip92

11 months

detecting small objects is hard. I spent some time today writing a short how-to guide on using supervision (in combination with the most popular CV libraries) to detect small objects. btw is that a good idea for a video tutorial? link: ↓ read more

18

58

587

SkalskiP

@skalskip92

7 months

player clustering component of my Football AI project is pushed to GitHub.
- feature extraction with SigLIP
- dimensionality reduction with UMAP
- clustering with KMeans
code:

12

59

595