![Carl Doersch Profile](https://pbs.twimg.com/profile_images/1807068507441438720/WGOtShBM_x96.jpg)
Carl Doersch
@CarlDoersch
2K Followers · 121 Following · 56 Statuses
RT @dangengdg: What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompti…
Want a robot to solve a task, specified in language? Generate a video of a person doing it, and then retarget the action to the robot with the help of point tracking! Cool collab with @mangahomanga during his student researcher stint at Google.
Gen2Act: Casting language-conditioned manipulation as *human video generation* followed by *closed-loop policy execution conditioned on the generated video* enables solving diverse real-world tasks unseen in the robot dataset! 1/n
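The Gen2Act recipe above (generate a human video, track points in it, then run a closed-loop policy that follows those tracks) can be sketched end-to-end. Everything below is an illustrative assumption, not the actual Gen2Act code: `generate_human_video`, `track_points`, and `policy_step` are hypothetical stubs standing in for a video generator, a point tracker, and a track-conditioned policy.

```python
import numpy as np

# Hypothetical sketch of a Gen2Act-style pipeline. All function names,
# shapes, and behaviors here are stand-ins, not the real system.

def generate_human_video(task: str, scene: np.ndarray) -> np.ndarray:
    """Stub video generator: repeats the scene for T frames (placeholder)."""
    return np.repeat(scene[None], 8, axis=0)  # (T, H, W)

def track_points(video: np.ndarray, queries: np.ndarray) -> np.ndarray:
    """Stub point tracker: returns (T, N, 2) tracks; here, static points."""
    return np.repeat(queries[None], video.shape[0], axis=0)

def policy_step(obs: np.ndarray, tracks: np.ndarray, t: int) -> np.ndarray:
    """Stub closed-loop policy: steer toward the mean tracked point at t."""
    return tracks[min(t, len(tracks) - 1)].mean(axis=0)  # target (x, y)

scene = np.zeros((64, 64))
video = generate_human_video("pick up the cup", scene)
queries = np.array([[10.0, 20.0], [30.0, 40.0]])  # query pixels in frame 0
tracks = track_points(video, queries)
action = policy_step(scene, tracks, t=0)
print(action)  # mean of the two query points -> [20. 30.]
```

The key design point the tweet highlights: the robot never needs a demonstration in its own embodiment; the generated human video plus point tracks carry the task specification.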
RT @skandakoppula: We're excited to release TAPVid-3D: an evaluation benchmark of 4,000+ real world videos and 2.1 million metric 3D point…
RT @dimadamen: Can you win 2nd Perception Test Challenge? @eccvconf workshop: Diagnose Audio-visual MLM on ability…
Joint work with @paulineluc_, @yangyi02, @dilaragoekay, @skandakoppula, @ankshgpta, Joe Heyward, Ignacio Rocco, @RGoroshin, @joaocarreira, Andrew Zisserman. Video credit to GDM’s robot soccer project:
@notnotrishi Check out for a better motivation. That project would have been impossible using only SOTA optical flow or box-level tracking.
@notnotrishi 1) Recovers after occlusions 2) Does not give bogus correspondences when points become occluded 3) Does not 'drift' as errors accumulate across long sequences 4) Queries don't have to come from the same video 5) Sintel and KITTI are saturated and don't cover the real world well
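The query-point semantics behind properties 1 and 2 can be shown with a toy example. This is not TAP-Net/TAPIR; it is a minimal stand-in "tracker" on a synthetic video of a single bright dot, built only to illustrate the interface: per-frame (x, y) estimates plus a visibility flag, so an occluded point yields no bogus correspondence and tracking resumes after the occlusion.

```python
import numpy as np

# Toy illustration of point-tracking semantics (an assumption, not the
# actual TAP models): track one point through a synthetic video, emitting
# an explicit visibility flag when the point is occluded.

T, H, W = 6, 32, 32
video = np.zeros((T, H, W))
true_xy = [(5 + 4 * t, 10) for t in range(T)]  # dot moves right 4 px/frame
for t, (x, y) in enumerate(true_xy):
    if t != 3:                                  # frame 3: dot is occluded
        video[t, y, x] = 1.0

tracks, visible = [], []
for t in range(T):
    frame = video[t]
    if frame.max() > 0.5:                       # point detected this frame
        y, x = np.unravel_index(frame.argmax(), frame.shape)
        tracks.append((int(x), int(y)))
        visible.append(True)
    else:                                       # occluded: flag, don't fake it
        tracks.append(tracks[-1] if tracks else (0, 0))
        visible.append(False)

print(tracks[4], visible[3])  # recovers after occlusion -> (21, 10) False
```

A flow-based tracker chained frame-to-frame would instead drift through the occlusion (property 3), which is exactly the failure mode listed above.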
Joint work with @yangyi02, Mel Vecerik, @joaocarreira @tdavchev, @JonathanScholz2, Andrew Zisserman, @yusufaytar, Stannis Zhou, @dilaragoekay, Ankush Gupta, @LourdesAgapito, @RaiaHadsell
RT @dimadamen: 📢 Perception Test @ICCVConference now w/ Test Set. We invite submissions to 1st Perception Test- winners announced #ICCV2023…
It might seem artificial, but that's the point: the Perception Test is full of unusual, out-of-distribution situations that humans understand easily but that fool large vision and language models. Can't wait to see how challenge participants tackle it!
Our preprint on the Perception Test is now on arXiv: We will be benchmarking the cream of the crop of multimodal video models in the Perception Test Challenge, happening at ICCV 2023. (1 / 2)