Shivam Duggal Profile
Shivam Duggal

@ShivamDuggal4

Followers
907
Following
2K
Media
6
Statuses
103

PhD Student @MIT | Prev: Carnegie Mellon University @SCSatCMU | Research Scientist @UberATG

Joined June 2017
@ShivamDuggal4
Shivam Duggal
3 months
Current vision systems use fixed-length representations for all images. In contrast, human intelligence or LLMs (e.g., OpenAI o1) adjust compute budgets based on the input. Since different images demand different processing & memory, how can we enable vision systems to be adaptive? 🧵
10
65
475
@ShivamDuggal4
Shivam Duggal
1 year
Ecstatic to share that I will be starting my Ph.D. at MIT's @MITEECS @MIT_CSAIL! Grateful to the colleagues & professors I had the pleasure of interacting with over the past few years, especially my amazing advisor @pathak2206! Will miss @CarnegieMellon & SmithHall @roboVisionCMU.
17
4
167
@ShivamDuggal4
Shivam Duggal
3 years
Glad to share my first paper at CMU! Special thanks to my amazing advisor @pathak2206.
@pathak2206
Deepak Pathak
3 years
How can we reconstruct high-fidelity 3D from a *single* 2D image? That too w/o any 3D supervision during training? Our new #CVPR2022 paper, TARS, recovers 3D shape & correspondence from a single image. Trained on just single-view internet images w/ poses.
0
1
47
@ShivamDuggal4
Shivam Duggal
3 months
We present a Recurrent & Adaptive Tokenizer, which iteratively encodes an image into compressed representations, with each iteration involving not just more processing but also additional memory resources in the form of more tokens. **Don't spend your tokens all at once!**
2
3
43
@ShivamDuggal4
Shivam Duggal
3 months
Grateful to my amazing advisors – Prof. @phillip_isola, Prof. Antonio Torralba & Prof. Bill Freeman @MITCSAIL. Excited by the direction of adaptive/dynamic representations for reasoning, video understanding etc. Paper: Code:
0
2
35
@ShivamDuggal4
Shivam Duggal
1 year
"What I cannot create, I do not understand". With the improved state of likelihood-based generative models, leveraging generative models for ("zero-shot") discriminative tasks seems promising! Exciting collaboration with great friends, check out @alexlioralexli's detailed thread!
@alexlioralexli
Alex Li
1 year
Diffusion models have amazing image creation abilities. But how well does their generative knowledge transfer to discriminative tasks? We present Diffusion Classifier: strong classification results with pretrained conditional diffusion models, *with no additional training*! 1/9
0
1
28
@ShivamDuggal4
Shivam Duggal
2 years
Super excited and grateful to be a Siebel Scholar! 😀 Thank you @SiebelScholars, @SCSatCMU. Special thanks to my amazing advisor @pathak2206!!
@SiebelScholars
Siebel Scholars
2 years
We are thrilled to announce the #SiebelScholars Class of 2023. Congratulations to this exceptional community directly influencing the technologies, policies, economic and social decisions shaping our future! #SiebelClassOf2023. Press Release:
4
1
25
@ShivamDuggal4
Shivam Duggal
3 months
**Low-Complexity Art Hypothesis (Schmidhuber, 1997) – Minimum Token Length ~ Image Complexity.** Plot on the right depicts strong alignment (lower-triangle matrix) of human-annotated image complexity with the reconstruction loss at varied token counts.
1
1
24
@ShivamDuggal4
Shivam Duggal
3 months
Most exciting! At each iteration of recurrent rollout, previous latent tokens receive residual updates, while new computational memory (additional tokens) is introduced. New tokens give the existing ones the freedom to focus on specialized regions w/ sharper & sparser attention.
1
3
22
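As a rough illustration of the recurrent rollout described in the post above, here is a minimal Python sketch. The names `recurrent_rollout`, `update_fn`, and `new_token_fn` are hypothetical, the toy update values are made up, and the order of operations (residual update first, then appending new tokens) is an assumption the post does not confirm. It is not the authors' implementation.

```python
from typing import Callable, List

Token = List[float]


def recurrent_rollout(
    init_tokens: List[Token],
    update_fn: Callable[[List[Token]], List[Token]],
    new_token_fn: Callable[[int], List[Token]],
    num_iters: int,
) -> List[List[Token]]:
    """Return a snapshot of the token set after each iteration."""
    tokens = [list(t) for t in init_tokens]
    history = []
    for step in range(num_iters):
        # Residual update: existing tokens are adjusted, not replaced.
        deltas = update_fn(tokens)
        tokens = [
            [x + d for x, d in zip(tok, delta)]
            for tok, delta in zip(tokens, deltas)
        ]
        # New memory: append fresh tokens (assumed to happen after the update).
        tokens = tokens + [list(t) for t in new_token_fn(step)]
        history.append([list(t) for t in tokens])
    return history


# Toy stand-ins for a learned encoder: a constant delta and zero-initialised tokens.
history = recurrent_rollout(
    init_tokens=[[0.0, 0.0], [0.0, 0.0]],
    update_fn=lambda toks: [[0.1] * len(t) for t in toks],
    new_token_fn=lambda step: [[0.0, 0.0] for _ in range(2)],
    num_iters=3,
)
counts = [len(snapshot) for snapshot in history]
print(counts)
```

In this toy run the token count grows by two per iteration, and older tokens accumulate more residual updates than newer ones.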
@ShivamDuggal4
Shivam Duggal
3 months
**What factors decide an image's required representational capacity?** We analyze Entropy (Low-Complexity Art Hypothesis), Familiarity (IID vs OOD), and Downstream Task & Downstream Model Strength (weaker models work well with fewer tokens) to decide the image token count.
1
1
18
@ShivamDuggal4
Shivam Duggal
3 months
Weaker Models can manage w/ fewer tokens! Models w/ relatively lower performance on GT data perform just as well on low-token reconstructions as on high-token ones (e.g., AlexNet, BLIP-VLM at 32 tokens / image). Knowledge determines the benefit of increasing processing time/memory!
1
2
16
@ShivamDuggal4
Shivam Duggal
3 months
Does it scale well? **Yes.** The entire work was done on academic compute, on datasets at the scale of ImageNet-100, ImageNet-1K and COCO. But we study signs of scaling the adaptive tokenizer to larger networks, larger datasets (IN-100 vs IN-1K), longer training time, and continuous vs discrete tokens.
1
2
15
@ShivamDuggal4
Shivam Duggal
14 days
Excited to see what the brilliant team at @tangiblerobots is up to! Best wishes to @bipashasen31 and the team as they make innovative strides in household robotics.
@tangiblerobots
Tangible
14 days
Robots can do flips and play chess, but they still can’t grab a snack or clean your table. Teleoperation is the key to unlocking real dexterity. It’s not a compromise—it’s a proven, powerful approach that combines human intuition with robotic precision. We’re not just using
0
0
13
@ShivamDuggal4
Shivam Duggal
2 months
@david_rolnick Reviewer’s own words:
- paper is novel, experiments are solid & comprehensive, writing is clear & concise, rebuttal resolves all questions.
- I will stick to 6, borderline accept.
@iclr_conf Is it fine to ask for details on why the paper is still rated 6? It is so stressful!
3
0
11
@ShivamDuggal4
Shivam Duggal
3 months
**Dataset Representations ~ Downstream Task.** We sample variable tokens per image using Classification Acc. or Depth Error < Threshold as the Token Selection Criterion (TSC). Using just 40% of the max tokens for the dataset, we can saturate/overfit the best Top-1 Classification Acc.
1
1
10
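One way to picture a Token Selection Criterion like the one in the post above: for each image, keep the smallest token count whose downstream error falls below a threshold. The sketch below is a hedged illustration only. The function names, the fallback to the largest count, and all numbers are assumptions made up for the example, not values or code from the paper.

```python
from typing import Dict, List


def select_token_count(error_by_count: Dict[int, float], threshold: float) -> int:
    """Smallest token count with error below threshold (hypothetical TSC)."""
    for count in sorted(error_by_count):
        if error_by_count[count] < threshold:
            return count
    # Assumed fallback: no count satisfies the criterion, so use the largest.
    return max(error_by_count)


def mean_token_fraction(
    per_image_errors: List[Dict[int, float]], threshold: float, max_tokens: int
) -> float:
    """Fraction of the maximum token budget actually used across the dataset."""
    chosen = [select_token_count(e, threshold) for e in per_image_errors]
    return sum(chosen) / (len(chosen) * max_tokens)


# Made-up errors for three images at 32, 64 and 128 tokens.
toy_dataset = [
    {32: 0.10, 64: 0.05, 128: 0.02},  # simple image: 32 tokens suffice
    {32: 0.50, 64: 0.20, 128: 0.10},  # moderate image: needs 64 tokens
    {32: 0.90, 64: 0.60, 128: 0.40},  # hard image: never meets the threshold
]
chosen = [select_token_count(e, threshold=0.25) for e in toy_dataset]
fraction = mean_token_fraction(toy_dataset, threshold=0.25, max_tokens=128)
print(chosen, round(fraction, 3))
```

A per-image rule like this is what lets the dataset-level budget land well below the maximum: easy images stop early while hard ones keep the full allocation.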
@ShivamDuggal4
Shivam Duggal
2 years
Humbled to be a part of this recognition 🙂 Thanks @eccvconf!
@eccvconf
European Conference on Computer Vision #ECCV2026
2 years
List of #ECCV2022 Outstanding Reviewers. Thank you all for your service! 👏
0
0
10
@ShivamDuggal4
Shivam Duggal
2 months
Distribution Matching (DMD) applied to distill a bi-directional model into a causal auto-regressive video generative model. Incredibly strong results!! Congratulations @TianweiY and team!
@TianweiY
Tianwei Yin
2 months
CausVid trains a four-step autoregressive diffusion model to generate videos. Unlike previous bidirectional diffusion models that denoise all frames simultaneously, CausVid generates videos frame by frame. This approach enables users to watch the video while it is being
0
0
8
@ShivamDuggal4
Shivam Duggal
4 years
Our paper GeoSim got nominated for best paper award at CVPR'21 😃 Special thanks to all the co-authors: @t_mux, @frieda_rong, @ShenlongWang, @skywalkeryxc, Siva, Shangjie, @meyumer and @RaquelUrtasun. Paper: Project-page:
@RaquelUrtasun
Raquel Urtasun
4 years
Our papers GeoSim and MP3 have been nominated for best paper award @CVPR. Congrats @t_mux, @frieda_rong, @ShivamDuggal4, @ShenlongWang, @skywalkeryxc, Siva, Shangjie, @meyumer, @sergioksas, Abbas!
2
0
8
@ShivamDuggal4
Shivam Duggal
1 year
@HaoyuXiong1 It was great interacting with you, and so much fun having you around! Thanks @HaoyuXiong1!
0
0
3
@ShivamDuggal4
Shivam Duggal
1 year
@vincesitzmann @pathak2206 @MITEECS @MIT_CSAIL @CarnegieMellon @roboVisionCMU Thanks Vincent! Looking forward to interacting and working with you :D
0
0
2
@ShivamDuggal4
Shivam Duggal
1 year
Super cool work, congratulations!!
@_ellisbrown
Ellis Brown
1 year
🤖 VIRL 🌎 Grounding Virtual Intelligence In Real Life. 🧐 How can we embody agents in environments as rich/diverse as those we inhabit, without real hardware & control constraints? 🧐 How can we ensure internet-trained vision/language models will translate to real life globally?
0
0
1
@ShivamDuggal4
Shivam Duggal
3 months
@vincesitzmann Thank you @vincesitzmann! I am glad you liked it :D
0
0
1
@ShivamDuggal4
Shivam Duggal
1 year
@justachetan @MITEECS @MIT_CSAIL @pathak2206 @CarnegieMellon @roboVisionCMU Thanks Aditya! It was great meeting you this summer :D
0
0
1