Thrilled to share that:
🎓I defended my PhD thesis last week!
👨‍🏫I’ll be joining
@Cornell
@CornellCIS
as an Assistant Professor in Fall 2024!
A huge thank you to my amazing advisors Antonio Torralba and
@RaquelUrtasun
, my committee members
@HoiemDerek
and
@vincesitzmann
, ..
1/2
Can you match images with little or no overlaps?
Humans can🧠but most existing methods fail😰
Our
#CVPR2022
paper shoots camera rays through the scene to form “virtual correspondences” & uses epipolar geometry.
w/
@ajyang99
@ShenlongWang
@RaquelUrtasun
Antonio Torralba
1/n
I cannot imagine what my PhD life would be like without “Friends”. Those characters have been with me through all the ups and downs. RIP Chandler Bing.
We are devastated to learn of Matthew Perry’s passing. He was a true gift to us all. Our heart goes out to his family, loved ones, and all of his fans.
Tired of hand-crafting energy functions for your inverse problems? Bored of deep nets whose predictions are inconsistent with the observations?
Deep feedback inverse problem solver to the rescue! It’s fast, robust, and generic!
#ECCV2020
#Spotlight
Do you find using COLMAP to pre-compute camera poses before training NeRF burdensome? Check out BARF 🤮
Curious how Bundle Adjustment can be combined with NeRF? Check out BARF 🤮
Interested in some sick results? Check out BARF 🤮
BARF allows you to train NeRF w/o camera poses!
Can you train NeRF without knowing the camera poses? YES! Unfortunately it’s not as simple as pose.requires_grad_(True).
Brace yourself, because we’re introducing BARF — our results are sick! 🤮
arXiv:
project page:
🧵⬇️ (1/4)
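The reason that one-liner isn't enough, as I understand it, is BARF's coarse-to-fine annealing of the positional encoding: low frequencies come online first so pose gradients stay smooth, and higher-frequency bands are blended in as training progresses. A minimal numpy sketch of that schedule (function names and the 4-band setup are mine, not the paper's code):

```python
import numpy as np

def barf_weights(alpha, num_freqs):
    """Coarse-to-fine weights for the positional-encoding frequency bands.
    alpha ramps from 0 to num_freqs during training; band k fades in
    smoothly while alpha goes from k to k+1."""
    k = np.arange(num_freqs)
    x = np.clip(alpha - k, 0.0, 1.0)
    return (1.0 - np.cos(x * np.pi)) / 2.0

def positional_encoding(p, alpha, num_freqs=4):
    """Standard sin/cos encoding of coordinates p, with each frequency
    band scaled by its annealing weight."""
    w = barf_weights(alpha, num_freqs)
    feats = []
    for k in range(num_freqs):
        feats.append(w[k] * np.sin(2.0**k * np.pi * p))
        feats.append(w[k] * np.cos(2.0**k * np.pi * p))
    return np.concatenate(feats, axis=-1)

feat = positional_encoding(np.array([0.3]), alpha=2.5)
```

Early in training (small alpha) every band is suppressed and the encoding is nearly smooth; by the end (alpha = num_freqs) it is the full NeRF positional encoding.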
Introducing UniSim, a scalable neural closed-loop sensor simulator for self-driving!
Given a *SINGLE* pass of data, UniSim can
📷simulate images+LiDAR
🚗insert/remove actors
🚦manipulate actors' behavior
🖼️model FG+BG
🔥EXTRAPOLATE views
Stay tuned for more!
#CVPR2023
#Highlight
We just uploaded the two latest talks from the
#MIT
Vision and Graphics Seminar 📽️
Featuring the NVIDIA Graduate Fellow
@chenhsuan_lin
, and the ECCV Best Paper Winner Zachary Teed.
Check out the links below - enjoy 😋
The results are amazing! Personally, I really like the visual quality check stage. Using VLMs for filtering is so powerful (e.g., DreamSync from
@huyushi98
and others) and definitely seems the way to go.
Can generative AI imagine what Alice saw on her journey through Wonderland 🏞️🚶‍♀️? Introducing WonderJourney: create a journey (a long sequence of diverse yet connected 3D scenes) from a single image or text! 🧵1/N
Web:
arxiv:
Introducing ClimateNeRF!
ClimateNeRF fuses physical simulations with neural radiance field models of scenes, producing realistic videos of physical phenomena in those scenes.
❄️⛈️🌊😷
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis
We model the world as a set of 3D Gaussians that move & rotate over time. This extends Gaussian Splatting to dynamic scenes, with accurate novel-view synthesis and dense 3D trajectories.
Always wondered what I'd look like if I were mixed-race and a few years younger - finally figured it out today 🤣
@lucyrchai
and
@jswulff
made it happen!
Make sure to check out their fantastic ICLR paper and demo at:
Super cool work on evaluating T2I generative models! It addresses several limitations of TIFA 👩‍💻
Yet another amazing work from the authors of TIFA! Congrats
@huyushi98
,
@RanjayKrishna
!
🚨New T2I Evaluation!🚨
We introduce Davidsonian Scene Graph (DSG) for reliable T2I evaluation with questions that:
- are atomic and unique
- cover full text prompt semantics (w/o hallucination)
- and have valid consistencies
@GoogleAI
@uncnlp
@uwnlp
🧵
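As I understand DSG, the dependency graph is what keeps the questions valid: a child question (say, about an object's attribute) is only scored if its parent question (the object's existence) was answered "yes" first. A toy Python sketch of that dependency-aware scoring loop (function and variable names are mine, purely illustrative):

```python
def dsg_score(questions, deps, vqa_yes):
    """Score a prompt/image pair with atomic yes/no questions arranged in a
    dependency graph: a child question counts as 'yes' only if all of its
    parent questions were answered 'yes' (questions given in topological
    order). Returns the fraction of questions answered 'yes'."""
    answers = {}
    for q in questions:
        parents_ok = all(answers.get(p, False) for p in deps.get(q, []))
        answers[q] = parents_ok and vqa_yes(q)
    return sum(answers.values()) / len(questions)

# Toy run with a mock VQA model that sees a dog but not its color:
questions = ["is there a dog?", "is the dog brown?"]
deps = {"is the dog brown?": ["is there a dog?"]}
score = dsg_score(questions, deps, lambda q: q == "is there a dog?")
```

If the parent fails, the child is never asked, so the VQA model can't hallucinate a "yes" about an object that isn't there.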
Check out our latest deep video compressor at
#ECCV2020
. Super simple, yet surprisingly effective.
All you need is a conditional entropy model + internal learning ;)
Joint work w/
@jerryjliu98
(leading author),
@ShenlongWang
, Meet, Rui, Pranaab,
@RaquelUrtasun
Excited to share our
#NeurIPS2022
work on large-scale, consistent, and realistic 3D world generation!
The key idea is to recursively simulate new image observations 🖼️ and integrate them into a coherent 3D map 🗺️
PS: it's SGAM, not SCAM 😈
w/
@YuanShe40816559
,
@ShenlongWang
#NeurIPS2022
Tired of generating 3D objects🚗?
Interested in generating large-scale, coherent 3D scenes without drifting🏔️?
Introducing SGAM - Simultaneous Generation And Mapping - a novel 3D scene generation algorithm that allows us to do so
Project link:
Hello Twitterverse! I'm excited to share our latest work on 3D shape reconstruction, where we learn SDF representations *trained* from single-view images! 🎯
To appear at NeurIPS 2020
#neurips2020
Paper:
Website:
(1/4) ⬇️
my mentors, my letter writers, and last but not least, my FANTASTIC, WONDERFUL, BRILLIANT collaborators🧠. Nothing would be possible without your support and help along the journey!
2/2
Can LMs solve reasoning tasks without showing their work? "Implicit Chain of Thought Reasoning via Knowledge Distillation" teaches LMs to reason internally to solve tasks like 5×5 multiplication. Here's how we bypass human-like step-by-step reasoning 1/6
VCs enable us to match extreme-view images and succeed in cases where existing SfM systems fail. Combined with classic correspondences, we produce accurate pose estimates in a wide range of scenarios, with downstream tasks in multi-view stereo and novel view synthesis.
5/n
Virtual correspondences (VCs) conform with epipolar geometry.
We establish virtual correspondences by shooting camera rays through the coarse shape priors, and then refine the relative poses and respective 3D points in a generalized bundle adjustment framework.
4/n
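For intuition, the constraint a virtual correspondence must satisfy is the usual epipolar one: x2ᵀ E x1 = 0 with E = [t]× R. A tiny numpy sanity check (the camera setup and function names here are mine, purely illustrative):

```python
import numpy as np

def essential_matrix(R, t):
    """E = [t]_x R for the relative pose (R, t) mapping cam-1 to cam-2."""
    tx = np.array([[0.0, -t[2], t[1]],
                   [t[2], 0.0, -t[0]],
                   [-t[1], t[0], 0.0]])
    return tx @ R

def epipolar_residual(E, x1, x2):
    """|x2^T E x1| for normalized homogeneous image coordinates."""
    return abs(x2 @ E @ x1)

# One 3D point seen by camera 1 (identity pose) and camera 2 (R, t):
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.5, 0.0, 0.0])
X = np.array([0.2, -0.1, 3.0])
x1 = X / X[2]                # normalized coords in camera 1
X2 = R @ X + t
x2 = X2 / X2[2]              # normalized coords in camera 2
E = essential_matrix(R, t)
```

Because the two rays meet at a single 3D point, the residual is (numerically) zero; a mismatched pair violates the constraint, which is exactly what the bundle adjustment penalizes.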
Glad to share our latest effort towards 3D shape completion in the wild.
Our model estimates both the 3D canonical shape and the 6-DoF pose from unaligned, noisy, real-world point clouds WITHOUT any shape or pose supervision.
#ECCV2020
#Spotlight
@jbhuang0604
@yen_chen_lin
and I took his course this semester. He is such an energetic, nice, and fantastic teacher - perhaps one of the best, if not the best, linear algebra instructors. It's a pity that the course had to move online due to COVID-19.
Transformer to the rescue! We use a transformer to refine existing instance segmentation models and achieve SOTA performance. Our model generalizes across a variety of backbones even w/o re-training!
Check out at
#CVPR2020
[11am-1pm and 11pm-1am, all PDT] our novel LiDAR simulator, which has little to no domain gap when testing end-to-end autonomous driving systems.
#SelfDrivingCars
@UberATG
@Uber
Excited to share our work on "Implicit Neural Representations with Periodic Activations"
We show how to fit complex signals, such as room-scale SDFs, video, & audio, and supervise implicit reps via their gradients to solve boundary value problems! (1/n)
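For readers who want the gist: a SIREN layer is just sin(ω₀(Wx + b)), with the paper's uniform initialization (U(-1/n, 1/n) for the first layer, U(-√(6/n)/ω₀, √(6/n)/ω₀) afterwards). A minimal numpy forward pass, assuming those details (layer sizes are arbitrary; this is a sketch, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def siren_layer(fan_in, fan_out, first=False, omega0=30.0):
    """One sine-activated layer: x -> sin(omega0 * (x @ W + b)).
    The first layer uses a wider init range than the hidden ones."""
    c = 1.0 / fan_in if first else np.sqrt(6.0 / fan_in) / omega0
    W = rng.uniform(-c, c, size=(fan_in, fan_out))
    b = np.zeros(fan_out)
    return lambda x: np.sin(omega0 * (x @ W + b))

# A tiny 1D implicit representation: coordinate -> signal value.
layers = [siren_layer(1, 32, first=True), siren_layer(32, 32)]
W_out = rng.uniform(-0.1, 0.1, size=(32, 1))  # linear output head

def siren(x):
    h = x
    for layer in layers:
        h = layer(h)
    return h @ W_out

y = siren(np.linspace(-1, 1, 5).reshape(-1, 1))
```

Since sin is smooth and periodic, all derivatives of the network exist and are themselves SIREN-like, which is what makes supervising through gradients (e.g., for boundary value problems) practical.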
We define virtual correspondences, which unlike classic correspondences, do not describe the same, co-visible 3D point in the scene, and therefore do not need to share the same appearance or semantics. Instead, we only require their associated camera rays to intersect in 3D.
3/n
New short course on sophisticated RAG (Retrieval Augmented Generation) techniques is out! Taught by
@jerryjliu0
and
@datta_cs
of
@llama_index
and
@truera_ai
, this course teaches advanced techniques that help your LLM generate good answers.
Topics include:
- Sentence-window retrieval,
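For those unfamiliar, sentence-window retrieval indexes each sentence on its own for precise matching, but hands the LLM a window of surrounding sentences as context. A minimal sketch of the chunking step (this is not LlamaIndex's actual API; names are mine):

```python
def sentence_window_chunks(sentences, window=1):
    """Build retrieval chunks: each sentence is the retrieval key,
    and its neighbors within `window` form the context passed to the LLM."""
    chunks = []
    for i, sent in enumerate(sentences):
        lo = max(0, i - window)
        hi = min(len(sentences), i + window + 1)
        chunks.append({"key": sent, "context": " ".join(sentences[lo:hi])})
    return chunks

sentences = ["The sky is blue.", "Water is wet.", "Grass is green."]
chunks = sentence_window_chunks(sentences)
```

Retrieval matches against the single sentence (high precision), while generation sees the window (enough context to answer well).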
Excited to share our new work on learning to discover neural programs that strongly generalize and outperform hand-coded algorithms for sorting (quicksort) and other tasks, measured in the number of execution steps.
w/
@FelixAxelGimeno
@pushmeet
@OriolVinyalsML
Humans can relate two images taken at the same scene, even if they have no overlaps. Such a remarkable capability comes from our prior knowledge of the underlying geometry. Inspired by this, we leverage learned shape priors of foreground objects (e.g., people) in the scene.
2/n
Key advantages:
1. No restrictions on the forward process! For instance, it does not have to be differentiable!
2. No more hand-crafted energy functions!
3. Way faster and way more robust!
Key idea: The difference between the observation and current estimation provides very rich cues regarding how to update the latent parameters. We thus leverage the feedback signal provided by the forward process and learn an iterative update model.
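In pseudocode terms, the loop is: compute the residual between the observation and the current forward simulation, feed it to an update model, repeat. A toy numpy sketch with a linear forward process and a fixed-gain stand-in for the learned update network (all names are mine; the real update model is a trained net and the forward process can be any black box):

```python
import numpy as np

def solve_inverse(forward, update, x0, y, num_iters=50):
    """Feedback loop: compare observation y against forward(x), let the
    update model map the residual to a parameter correction, and iterate.
    The forward process is treated as a black box - no gradients needed."""
    x = x0
    for _ in range(num_iters):
        residual = y - forward(x)
        x = x + update(residual)
    return x

# Toy problem: recover x from y = A @ x.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
forward = lambda x: A @ x
update = lambda r: 0.2 * (A.T @ r)  # fixed-gain stand-in for the learned net
x_true = np.array([1.0, -2.0])
x_hat = solve_inverse(forward, update, np.zeros(2), forward(x_true))
```

Nothing in the loop ever differentiates through `forward` - it only needs to be evaluated, which is why the method works even for non-differentiable forward processes.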
We present Neural Subdivision at
#SIGGRAPH2020
this week with Vladimir, Siddhartha, Noam, and
@_AlecJacobson
. It subdivides shapes using neural nets and generalizes well to novel shapes!
Talk:
Code:
Project:
Computer vision folks often talk about 3D, and I often wonder - what good is their 3D if I cannot see behind the occlusions? Any human being knows that by moving their head, they can easily reveal the occluded objects in a scene. Here is an example revealing the occlusions!