Shivam Duggal @ShivamDuggal4 profile

Shivam Duggal

@ShivamDuggal4

Followers

907

Following

2K

Media

6

Statuses

103

PhD Student @MIT | Prev: Carnegie Mellon University @SCSatCMU | Research Scientist @UberATG

Joined June 2017


Shivam Duggal

@ShivamDuggal4

3 months

Current vision systems use fixed-length representations for all images. In contrast, human intelligence or LLMs (e.g., OpenAI o1) adjust compute budgets based on the input. Since different images demand different processing & memory, how can we enable vision systems to be adaptive? 🧵

10

65

475

Shivam Duggal

@ShivamDuggal4

1 year

Ecstatic to share that I will be starting my Ph.D. at MIT's @MITEECS @MIT_CSAIL! Grateful to the colleagues & professors I had the pleasure of interacting with over the past few years, especially my amazing advisor @pathak2206! Will miss @CarnegieMellon & SmithHall @roboVisionCMU.

17

4

167

Shivam Duggal

@ShivamDuggal4

3 years

Glad to share my first paper at CMU! Special thanks to my amazing advisor @pathak2206.

Deepak Pathak

@pathak2206

3 years

How can we reconstruct high-fidelity 3D from a *single* 2D image? That too w/o any 3D supervision during training? Our new #CVPR2022 paper, TARS, recovers 3D shape & correspondence from a single image. Trained on just single-view internet images w/ poses.

0

1

47

Shivam Duggal

@ShivamDuggal4

3 months

We present a Recurrent & Adaptive Tokenizer – which iteratively encodes an image into compressed representations, each iteration involving not just more processing, but also additional memory resources in the form of more tokens. **Don't spend your tokens all at once!**

2

3

43

Shivam Duggal

@ShivamDuggal4

3 months

Grateful to my amazing advisors – Prof. @phillip_isola, Prof. Antonio Torralba & Prof. Bill Freeman @MITCSAIL. Excited by the direction of adaptive/dynamic representations for reasoning, video understanding etc. Paper: Code:

0

2

35

Shivam Duggal

@ShivamDuggal4

1 year

"What I cannot create, I do not understand". With the improved state of likelihood-based generative models, leveraging generative models for ("zero-shot") discriminative tasks seems promising! Exciting collaboration with great friends, check out @alexlioralexli's detailed thread!

Alex Li

@alexlioralexli

1 year

Diffusion models have amazing image creation abilities. But how well does their generative knowledge transfer to discriminative tasks? We present Diffusion Classifier: strong classification results with pretrained conditional diffusion models, *with no additional training*! 1/9

0

1

28

Shivam Duggal

@ShivamDuggal4

2 years

Super excited and grateful to be a Siebel Scholar! 😀 Thank you @SiebelScholars, @SCSatCMU. Special thanks to my amazing advisor @pathak2206!!

Siebel Scholars

@SiebelScholars

2 years

We are thrilled to announce the #SiebelScholars Class of 2023. Congratulations to this exceptional community directly influencing the technologies, policies, economic and social decisions shaping our future! #SiebelClassOf2023. Press Release:

4

1

25

Shivam Duggal

@ShivamDuggal4

3 months

**Low-Complexity Art Hypothesis (Schmidhuber, 1997) – Minimum Token Length ~ Image Complexity.** The plot on the right depicts strong alignment (lower-triangle matrix) of human-annotated image complexity with the reconstruction loss at varied token counts.

1

24

Shivam Duggal

@ShivamDuggal4

3 months

Most exciting! At each iteration of recurrent rollout, previous latent tokens receive residual updates, while new computational memory (additional tokens) is introduced. New tokens give the existing ones the freedom to focus on specialized regions w/ sharper & sparser attention.

1

3

22
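A minimal sketch of the rollout described in the post above, under stated assumptions: `recurrent_rollout`, `toy_update`, and `toy_new_tokens` are hypothetical names, the update rule is a toy stand-in for a learned encoder, and the order (append new tokens, then apply a residual update to every token) is one plausible reading of the post, not the authors' implementation.

```python
from typing import Callable, List

Token = List[float]


def recurrent_rollout(
    image_feats: List[Token],
    update_fn: Callable[[List[Token], List[Token]], List[Token]],
    new_tokens_fn: Callable[[], List[Token]],
    num_iters: int,
) -> List[List[Token]]:
    """Grow the latent token set over iterations and return each iteration's tokens."""
    latents: List[Token] = []
    history: List[List[Token]] = []
    for _ in range(num_iters):
        # Introduce new memory: append freshly initialised tokens.
        latents = latents + new_tokens_fn()
        # Residual update: every token keeps its value and adds a delta.
        deltas = update_fn(image_feats, latents)
        latents = [
            [x + d for x, d in zip(tok, delta)]
            for tok, delta in zip(latents, deltas)
        ]
        history.append(latents)
    return history


def toy_update(image_feats: List[Token], latents: List[Token]) -> List[Token]:
    # Toy stand-in for a learned encoder: a constant delta of 1.0 per dimension.
    return [[1.0] * len(tok) for tok in latents]


def toy_new_tokens() -> List[Token]:
    # Two new 2-dimensional tokens per iteration (arbitrary illustrative sizes).
    return [[0.0, 0.0], [0.0, 0.0]]


history = recurrent_rollout([], toy_update, toy_new_tokens, num_iters=3)
```

With the toy update, tokens introduced earlier accumulate more residual updates than tokens introduced later, and the token count grows by two per iteration.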

Shivam Duggal

@ShivamDuggal4

3 months

**What factors decide an image's required representational capacity?** We analyze – Entropy (Low-Complexity Art Hypothesis), Familiarity (IID vs OOD) and Downstream Task & Downstream Model Strength (weaker models work well with fewer tokens) – to decide the image token count.

1

18

Shivam Duggal

@ShivamDuggal4

3 months

Weaker Models can manage w/ fewer tokens! Models w/ relatively lower performance on GT data perform just as well on low-token reconstructions as on high-token ones (e.g., AlexNet, BLIP-VLM at 32 tokens / image). Knowledge determines the benefit of increasing processing time/memory!

1

2

16

Shivam Duggal

@ShivamDuggal4

3 months

Does it scale well? **Yes.** The entire work was done on academic compute – on datasets at the scale of ImageNet-100, ImageNet-1K and COCO. But we study signs of scaling the adaptive tokenizer to larger networks, larger datasets (IN-100 vs IN-1K), longer training time, and continuous vs discrete tokens.

1

2

15

Shivam Duggal

@ShivamDuggal4

14 days

Excited to see what the brilliant team at @tangiblerobots is up to! Best wishes to @bipashasen31 and the team as they make innovative strides in household robotics.

Tangible

@tangiblerobots

14 days

Robots can do flips and play chess, but they still can’t grab a snack or clean your table. Teleoperation is the key to unlocking real dexterity. It’s not a compromise—it’s a proven, powerful approach that combines human intuition with robotic precision. We’re not just using

0

13

Shivam Duggal

@ShivamDuggal4

2 months

@david_rolnick Reviewer’s own words: - paper is novel, experiments are solid & comprehensive, writing is clear & concise, rebuttal resolves all questions. - I will stick to 6, borderline accept. @iclr_conf Is it fine to ask for details on why the paper is still rated 6? It is so stressful!

3

0

11

Shivam Duggal

@ShivamDuggal4

3 months

**Dataset Representations ~ Downstream Task.** We sample variable tokens per image using Classification Acc. or Depth Error < Thres. as Token Selection Criterion (TSC). Using just 40% of the max tokens for the dataset, we can saturate/overfit the best Top-1 Classification Acc.

1

10
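One way to read the Token Selection Criterion in the post above is "pick the smallest token budget at which the criterion holds". The sketch below illustrates that reading only: `select_token_count`, the budgets, the error values, and the 0.10 threshold are all hypothetical and do not come from the paper.

```python
from typing import Callable, Sequence


def select_token_count(
    token_budgets: Sequence[int],
    criterion_met: Callable[[int], bool],
) -> int:
    """Return the smallest budget satisfying the criterion, else the largest budget."""
    for budget in sorted(token_budgets):
        if criterion_met(budget):
            return budget
    # No budget satisfies the criterion: fall back to the maximum.
    return max(token_budgets)


# Hypothetical per-image errors measured at each token budget.
errors = {32: 0.30, 64: 0.18, 128: 0.09, 256: 0.05}

# Criterion: error below a hypothetical threshold of 0.10.
chosen = select_token_count(list(errors), lambda k: errors[k] < 0.10)
```

Here `chosen` is 128, the first budget whose error falls below the threshold; an image that never meets the criterion would receive the maximum budget.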

Shivam Duggal

@ShivamDuggal4

2 years

Humbled to be a part of this recognition 🙂 Thanks @eccvconf!

European Conference on Computer Vision #ECCV2026

@eccvconf

2 years

List of #ECCV2022 Outstanding Reviewers. Thank you all for your service! 👏

0

10

Shivam Duggal

@ShivamDuggal4

2 months

Distribution Matching (DMD) applied to distill a bi-directional model into a causal auto-regressive video generative model. Incredibly strong results!! Congratulations @TianweiY and team!

Tianwei Yin

@TianweiY

2 months

CausVid trains a four-step autoregressive diffusion model to generate videos. Unlike previous bidirectional diffusion models that denoise all frames simultaneously, CausVid generates videos frame by frame. This approach enables users to watch the video while it is being

0

8

Shivam Duggal

@ShivamDuggal4

4 years

Our paper GeoSim got nominated for best paper award at CVPR'21 😃 Special thanks to all the co-authors: @t_mux, @frieda_rong, @ShenlongWang, @skywalkeryxc, Siva, Shangjie, @meyumer and @RaquelUrtasun. Paper: Project-page:

Raquel Urtasun

@RaquelUrtasun

4 years

Our papers GeoSim and MP3 have been nominated for best paper award @CVPR. Congrats @t_mux, @frieda_rong, @ShivamDuggal4, @ShenlongWang, @skywalkeryxc, Siva, Shangjie, @meyumer, @sergioksas, Abbas!

2

0

8

Shivam Duggal

@ShivamDuggal4

1 year

@HaoyuXiong1 It was great interacting with you, and so much fun having you around! Thanks @HaoyuXiong1!

0

3

Shivam Duggal

@ShivamDuggal4

1 year

@_ellisbrown @NYU_Courant @CILVRatNYU @sainingxie @rob_fergus @pathak2206 @CarnegieMellon @roboVisionCMU So exciting!! Congratulations ❤️

0

2

Shivam Duggal

@ShivamDuggal4

1 year

@vincesitzmann @pathak2206 @MITEECS @MIT_CSAIL @CarnegieMellon @roboVisionCMU Thanks Vincent! Looking forward to interacting and working with you :D

0

2

Shivam Duggal

@ShivamDuggal4

1 year

Super cool work, congratulations!!

Ellis Brown

@_ellisbrown

1 year

🤖 VIRL 🌎 Grounding Virtual Intelligence In Real Life. 🧐 How can we embody agents in environments as rich/diverse as those we inhabit, without real hardware & control constraints? 🧐 How can we ensure internet-trained vision/language models will translate to real life globally?

0

1

Shivam Duggal

@ShivamDuggal4

3 months

@vincesitzmann Thank you @vincesitzmann! I am glad you liked it :D

0

1

Shivam Duggal

@ShivamDuggal4

1 year

@anishmadan23 @MITEECS @MIT_CSAIL @pathak2206 @CarnegieMellon @roboVisionCMU Thanks @anishmadan23! I miss Smith Hall too 🥲

0

1

Shivam Duggal

@ShivamDuggal4

1 year

@justachetan @MITEECS @MIT_CSAIL @pathak2206 @CarnegieMellon @roboVisionCMU Thanks Aditya! It was great meeting you this summer :D

0

1

Shivam Duggal

@ShivamDuggal4

1 year

@servo97 @pathak2206 @MITEECS @MIT_CSAIL @CarnegieMellon @roboVisionCMU Thanks Sarvesh! :D

0

1