![Carl Doersch Profile](https://pbs.twimg.com/profile_images/1807068507441438720/WGOtShBM_x96.jpg)
Carl Doersch
@CarlDoersch
2K Followers · 121 Following · 56 Statuses
RT @dangengdg: What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompti…
Want a robot to solve a task, specified in language? Generate a video of a person doing it, and then retarget the action to the robot with the help of point tracking! Cool collab with @mangahomanga during his student researcher stint at Google.
Gen2Act: Casting language-conditioned manipulation as *human video generation* followed by *closed-loop policy execution conditioned on the generated video* enables solving diverse real-world tasks unseen in the robot dataset! 1/n
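The Gen2Act recipe above (generate a human video, track points in it, then run a closed-loop policy that follows those tracks) can be sketched end-to-end. Everything below is an illustrative assumption, not the actual Gen2Act code: `generate_human_video`, `track_points`, and `policy_step` are hypothetical stubs standing in for a video generator, a point tracker, and a track-conditioned policy.

```python
import numpy as np

# Hypothetical sketch of a Gen2Act-style pipeline. All function names,
# shapes, and behaviors here are stand-ins, not the real system.

def generate_human_video(task: str, scene: np.ndarray) -> np.ndarray:
    """Stub video generator: repeats the scene for T frames (placeholder)."""
    return np.repeat(scene[None], 8, axis=0)  # (T, H, W)

def track_points(video: np.ndarray, queries: np.ndarray) -> np.ndarray:
    """Stub point tracker: returns (T, N, 2) tracks; here, static points."""
    return np.repeat(queries[None], video.shape[0], axis=0)

def policy_step(obs: np.ndarray, tracks: np.ndarray, t: int) -> np.ndarray:
    """Stub closed-loop policy: steer toward the mean tracked point at t."""
    return tracks[min(t, len(tracks) - 1)].mean(axis=0)  # target (x, y)

scene = np.zeros((64, 64))
video = generate_human_video("pick up the cup", scene)
queries = np.array([[10.0, 20.0], [30.0, 40.0]])  # query pixels in frame 0
tracks = track_points(video, queries)
action = policy_step(scene, tracks, t=0)
print(action)  # mean of the two query points -> [20. 30.]
```

The key design point the tweet highlights: the robot never needs a demonstration in its own embodiment; the generated human video plus point tracks carry the task specification.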
RT @skandakoppula: We're excited to release TAPVid-3D: an evaluation benchmark of 4,000+ real world videos and 2.1 million metric 3D point…
RT @dimadamen: Can you win 2nd Perception Test Challenge? @eccvconf workshop: Diagnose Audio-visual MLM on ability…
Joint work with @paulineluc_, @yangyi02, @dilaragoekay, @skandakoppula, @ankshgpta, Joe Heyward, Ignacio Rocco, @RGoroshin, @joaocarreira, Andrew Zisserman. Video credit to GDM’s robot soccer project:
@notnotrishi Check out for a better motivation. That project would have been impossible using only SOTA optical flow or box-level tracking.
@notnotrishi 1) Recovers after occlusions 2) Does not give bogus correspondences when points become occluded 3) Does not 'drift' as errors accumulate across long sequences 4) Queries don't have to come from the same video 5) Sintel and KITTI are saturated and don't cover the real world well
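The query-point semantics behind properties 1 and 2 can be shown with a toy example. This is not TAP-Net/TAPIR; it is a minimal stand-in "tracker" on a synthetic video of a single bright dot, built only to illustrate the interface: per-frame (x, y) estimates plus a visibility flag, so an occluded point yields no bogus correspondence and tracking resumes after the occlusion.

```python
import numpy as np

# Toy illustration of point-tracking semantics (an assumption, not the
# actual TAP models): track one point through a synthetic video, emitting
# an explicit visibility flag when the point is occluded.

T, H, W = 6, 32, 32
video = np.zeros((T, H, W))
true_xy = [(5 + 4 * t, 10) for t in range(T)]  # dot moves right 4 px/frame
for t, (x, y) in enumerate(true_xy):
    if t != 3:                                  # frame 3: dot is occluded
        video[t, y, x] = 1.0

tracks, visible = [], []
for t in range(T):
    frame = video[t]
    if frame.max() > 0.5:                       # point detected this frame
        y, x = np.unravel_index(frame.argmax(), frame.shape)
        tracks.append((int(x), int(y)))
        visible.append(True)
    else:                                       # occluded: flag, don't fake it
        tracks.append(tracks[-1] if tracks else (0, 0))
        visible.append(False)

print(tracks[4], visible[3])  # recovers after occlusion -> (21, 10) False
```

A flow-based tracker chained frame-to-frame would instead drift through the occlusion (property 3), which is exactly the failure mode listed above.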
Joint work with @yangyi02, Mel Vecerik, @joaocarreira @tdavchev, @JonathanScholz2, Andrew Zisserman, @yusufaytar, Stannis Zhou, @dilaragoekay, Ankush Gupta, @LourdesAgapito, @RaiaHadsell
RT @dimadamen: 📢 Perception Test @ICCVConference now w/ Test Set. We invite submissions to 1st Perception Test- winners announced #ICCV2023…
It might seem artificial, but that's the point: the Perception Test is full of unusual, out-of-distribution situations that humans understand easily but that fool large vision and language models. Can't wait to see how challenge participants tackle it!
Our preprint on the Perception Test is now on arXiv: We will be benchmarking the cream of the crop of multimodal video models in the Perception Test Challenge, happening at ICCV 2023. (1 / 2)