Amir Zamir

@zamir_ar

Followers
4,466
Following
634
Media
72
Statuses
867
Pinned Tweet
@zamir_ar
Amir Zamir
2 months
We are releasing 4M-21 with a permissive license, including its source code and trained models. It's a pretty effective multimodal model that solves 10s of tasks & modalities. See the demo code, sample results, and the tokenizers of diverse modalities on the website. IMO, the
@zamir_ar
Amir Zamir
8 months
We are releasing the 1st version of 4M, a framework for training multimodal foundation models across tens of modalities & tasks, based on scalable masked modeling. Joint effort by @EPFL_en & @Apple . 4M: Massively Multimodal Masked Modeling 🌐 🧵1/n
Tweet media one
8
138
616
6
88
323
@zamir_ar
Amir Zamir
4 years
Salaries of PhD students, postdocs, and professors in Europe and Switzerland. Numbers are in euros. Of special interest to future students and faculty 😜
Tweet media one
55
146
707
@zamir_ar
Amir Zamir
8 months
We are releasing the 1st version of 4M, a framework for training multimodal foundation models across tens of modalities & tasks, based on scalable masked modeling. Joint effort by @EPFL_en & @Apple . 4M: Massively Multimodal Masked Modeling 🌐 🧵1/n
Tweet media one
8
138
616
@zamir_ar
Amir Zamir
5 years
Is it a good idea to train RL policies from raw pixels? Could visual priors about the world help RL? We just released the code of our Mid-Level Vision paper addressing these questions. Spoiler: using raw pixels doesn’t generalize! Play with the results at
0
110
382
@zamir_ar
Amir Zamir
8 months
I will hire PhD students and postdocs, especially in large multimodal models and related areas.
@zamir_ar
Amir Zamir
8 months
We are releasing the 1st version of 4M, a framework for training multimodal foundation models across tens of modalities & tasks, based on scalable masked modeling. Joint effort by @EPFL_en & @Apple . 4M: Massively Multimodal Masked Modeling 🌐 🧵1/n
Tweet media one
8
138
616
8
41
337
@zamir_ar
Amir Zamir
2 years
We present MultiMAE at #ECCV2022 on Wed. MultiMAE is a general multi-modal & multi-task pre-training strategy based on masked autoencoders. It shows notable results in cross-modal representation learning and transfer learning. 1/5
2
45
267
@zamir_ar
Amir Zamir
2 years
I’m honored! The recognition goes to all of my collaborators, as much as it does to me. Thank you! 🙏 @eccvconf
@sitzikbs
Yizhak Ben-Shabat (Itzik) 💔
2 years
Second young researcher award. Congrats @zamir_ar
Tweet media one
1
0
7
29
9
247
@zamir_ar
Amir Zamir
4 years
Progress on #consistency & #multitask learning. Existing methods give inconsistent results across tasks, even when jointly trained. We developed a general method for learning w/ Cross-Task Consistency. It gave notable gains for anything we tried. Live #demo :
3
58
240
@zamir_ar
Amir Zamir
2 years
Happy to share that CLIPasso will receive one of the best paper awards at #SIGGRAPH 2022. Congrats to the entire team! CLIP turned out to be a powerful perceptual loss.
Tweet media one
@YVinker
Yael Vinker🎗
2 years
Very excited to share that CLIPasso was selected as one of the five Best Paper Award winners at #SIGGRAPH this year!🎨🎉🏆 It is such a great honor!! Special thanks to all my teammates @Esychology @jbo_twt @roman__bachmann @DanielCohenOr1 @bermano_h @zamir_ar and Arik Shamir
2
13
65
3
29
178
@zamir_ar
Amir Zamir
3 years
We released OMNIDATA: a pipeline for creating steerable vision datasets. It gives the user control over generating the desired dataset using real-world 3D scans. It bridges vision datasets (pre-recorded data) and simulators (online data generation). Demo:
@iamsashasax
Sasha (Alexander) Sax
3 years
Vision datasets (i.e. ImageNet) are usually collected once for a fixed task. But how do we know the choice of camera intrinsics, tasks, etc. is a good one? Our ICCV paper on “steerable datasets” addresses this problem and gets 'human-level' surface normal preds along the way (1/3)
1
9
40
5
29
157
@zamir_ar
Amir Zamir
11 months
Is it possible to adapt a neural network on the fly at the test time to cope with distribution shifts? RNA does precisely that by creating a closed-loop feedback system. We will present it on Wed afternoon at @ICCVConference . 1/n
2
21
139
@zamir_ar
Amir Zamir
5 years
Next time someone tells you reaching "human-level" at task X is the holy grail in AI, show them this video. All it takes is making the task narrow enough and there is a way to brutally outperform humans already. Being as *broad* as humans/animals is the challenge.
@hardmaru
hardmaru
5 years
Teams of high school students built bottle-flipping robots for RoboCon 2018 in Japan
34
870
3K
3
26
84
@zamir_ar
Amir Zamir
2 years
I will hire again from the Summer @EPFL program this year. Several great projects came out of Summer @EPFL interns in the past, eg CLIPasso (SIGGRAPH22 best paper) and Omnidata (ICCV21). Apply if our interests align. (this is for BS/MS interns. PhD visitors have another program)
@ICepfl
EPFL Computer and Communication Sciences
2 years
The Summer @EPFL 2023 application site is now open! 🎊 To apply, please visit the Summer @EPFL website: . The application deadline for all students is the Sunday closest to the 1st of December (anywhere on earth).
Tweet media one
0
11
35
3
14
76
@zamir_ar
Amir Zamir
4 years
OpenBot is a step in the right direction. Massively scalable robotic platforms are great. I dream of an army of little (harmless!) robots running around visually exploring and making sense of the world.
2
7
71
@zamir_ar
Amir Zamir
8 months
We'll present at NeurIPS, today at 5pm CST. Spotlight #1022 . Effectively bringing sensory modalities to large models is one way to make them more grounded, and ultimately have a more complete World Model. This is a step in that direction hopefully, and more will come.
@zamir_ar
Amir Zamir
8 months
4M appears to have learned a solid cross-modal representation. We can use the various modalities to probe how 4M reconciles unusual inputs by manipulating one part of the input while keeping the remainder fixed. (8/n)
2
4
18
1
8
71
@zamir_ar
Amir Zamir
6 years
Gibson Database of Spaces includes 572 buildings, 1,447 floors, and >2 million ft². All real buildings, scanned and #3D reconstructed. Worth a few years of human visual experience. Browse the spaces by videos & 3D cuts: #perception #robotics #dataset #vision
Tweet media one
1
19
61
@zamir_ar
Amir Zamir
4 years
The point isn’t making big $$$ as a student/postdoc, but to live comfortably enough to focus on research rather than financial preoccupation. Especially if supporting a family. I think the general picture of the table remains true after considering living expenses and variance
5
1
60
@zamir_ar
Amir Zamir
9 months
I will certainly hire again from the Summer @EPFL program this year.
@martin_schrimpf
Martin Schrimpf
9 months
Applications now open for the Summer @EPFL program -- 3-month fellowship for Bachelor/Master students to immerse yourself in research
Tweet media one
6
38
154
0
2
51
@zamir_ar
Amir Zamir
5 years
Exactly 3 years ago we proposed to #CVPR with @ozansener . Today, glad to see the @nature article on the importance of negative results. “one of the worst aspects of science today: its toxic definitions of success”.
0
12
40
@zamir_ar
Amir Zamir
1 year
Visual odometry is a basic function for embodied AI. At #CVPR23 we will present a multi-modal & modality-invariant visual odometry framework called Visual Odometry Transformer (VOT). I will also give a talk on multi-modal learning across several projects at the MultiEarth workshop on Mon. 🧵
1
8
38
@zamir_ar
Amir Zamir
1 year
Tomorrow at @CVPR , I'll give a talk about recent works on multi-modal and multi-task masked modeling for creating vision foundation models. 1:45 PM @ West 109 - 110
Tweet media one
0
2
33
@zamir_ar
Amir Zamir
4 years
Learning with cross-task consistency was one of the #CVPR 2020 best paper award nominees. Congrats to the entire team at  @StanfordAILab @berkeley_ai   @ICepfl . And congrats to the winning paper by our  @Oxford_VGG colleagues.
@zamir_ar
Amir Zamir
4 years
Progress on #consistency & #multitask learning. Existing methods give inconsistent results across tasks, even when jointly trained. We developed a general method for learning w/ Cross-Task Consistency. It gave notable gains for anything we tried. Live #demo :
3
58
240
0
3
29
@zamir_ar
Amir Zamir
6 years
This gem never gets old. Great for a break from arxiv. It’s remarkable how much jargon education, and how little critical-thinking training, we receive in AI today. Watch the first minute and you’ll be sold. Science wisdom by @ProfFeynman .
0
4
28
@zamir_ar
Amir Zamir
2 years
We will present Task Discovery at #NeurIPS on Thur. Large NNs are known to fit any *training* labels. But learning from what labels would lead to *generalization*? Can we find such labels/tasks for an unlabeled dataset automatically? What would they mean?
Tweet media one
@andrew_atanov
Andrei Atanov
2 years
What are the tasks that a neural net generalizes on? In our #NeurIPS2022 paper, we introduce a Task Discovery 🔎 framework to approach this question and automatically find such tasks. We show how such tasks look and what they reveal about NNs. 🌐 🧵1/9
Tweet media one
1
10
52
1
1
26
@zamir_ar
Amir Zamir
2 years
Classical sampling-based planning algorithms in robotics (eg RRT, PRM) are efficient, performant & interpretable. Are they useful in learning-based frameworks? PALMER ( #NeurIPS22 , #CoRL22 ) shows they can be effectively repurposed for learning-based frameworks & representations 🧵
1
3
25
@zamir_ar
Amir Zamir
3 years
Predecessor of color bias in ML datasets
@voxdotcom
Vox
7 years
Color film was designed for white people. Here's what it did to dark skin:
32
2K
3K
0
10
25
@zamir_ar
Amir Zamir
5 years
Gibson environment's ~600 building meshes rendered directly in the PyBullet physics engine! FPS >5000! Great work by @erwincoumans . Check here if you want to visit inside these buildings: . Erwin's PyBullet rendering:
@erwincoumans
Erwin Coumans 🇺🇦
5 years
>5000 FPS indoor rendering in PyBullet, using beautiful scanned assets and texture atlas from the Stanford Gibson project:
2
4
36
0
2
18
@zamir_ar
Amir Zamir
2 years
For a live demo, interactive visualizations, code, and the summary video, see . If you’re attending #ECCV2022 , come chat on Wednesday afternoon, w/ @roman__bachmann @dmizrahi_ @andrew_atanov . 5/5
Tweet media one
1
0
17
@zamir_ar
Amir Zamir
8 months
There have been demos of “multimodal foundation model” results – but one with a demonstrable deep & broad understanding of the input like 4M’s is unprecedented. It’s not an image+text conversational model, but one that extracts a deeper understanding of the scene. (2/n)
Tweet media one
1
2
21
@zamir_ar
Amir Zamir
5 years
If you want to see more than #turtlebot and two-finger gripper arms, Jamie Paik @robotician gave a keynote talk with fun videos at #CoRL 2019 on soft robotics and intuitive interactions.
0
3
16
@zamir_ar
Amir Zamir
4 years
Tiny Images dataset (>1700 citations) was permanently taken down, due to the (unintended) inclusion of inappropriate language and images, found by Prabhu & Birhane. Clearly, everything we do (and did) in computer vision is now under much greater scrutiny!
Tweet media one
1
0
16
@zamir_ar
Amir Zamir
3 years
The poster session of Cross-Domain Ensembles (ICCV oral) is in 1.5 hours 🙂
@aseretys
Teresa Yeo
3 years
We introduce a general approach for enforcing diversity in ensembles. It leads to notable improvements in #robustness on a wide range of tasks and datasets for #adversarial and non-adversarial shifts. Joint work with @oguzhanthefatih and @zamir_ar Website:
2
3
30
0
5
15
@zamir_ar
Amir Zamir
5 years
#Cycle -consistency by #Rumi ? “Appear as you are & Be as you appear”
Tweet media one
1
0
15
@zamir_ar
Amir Zamir
5 years
Adversarial t-shirts are coming?
1
2
13
@zamir_ar
Amir Zamir
2 years
@zdeborova 5 mins. High level.
0
0
13
@zamir_ar
Amir Zamir
2 years
Via this objective, MultiMAE learns cross-modal predictive coding. The video showcases an example where we input only depth & two RGB patches. The hue of one patch is being changed. The model propagates the colors semantically and according to depth. More examples on the webpage. 3/5
1
4
13
@zamir_ar
Amir Zamir
4 years
@Elnaz_AK Indeed. But I have lived in the Bay Area and Switzerland is not bad at all. And at least I can see some benefit out of my extra payments 😉
1
0
13
@zamir_ar
Amir Zamir
8 months
4M appears to have learned a solid cross-modal representation. We can use the various modalities to probe how 4M reconciles unusual inputs by manipulating one part of the input while keeping the remainder fixed. (8/n)
2
4
18
@zamir_ar
Amir Zamir
6 years
New York Times @nytimes article on home robotics, failures of the past, and (not-so-low-hanging) potentials for the future. Covered our Gibson environment too. "What Comes After the Roomba?"
1
0
11
@zamir_ar
Amir Zamir
5 years
No interaction with the world yet, but clearly some nontrivial muscle control and behavior is present. Always interesting to contemplate how much cognitive and control bias we are born with, before any learning occurs.
@TerriGreenUSA
Terri Green
5 years
@Alyssa_Milano @BrianKempGA This incredible 4D scan captured footage of what unborn fetuses do in the womb.
4
8
20
0
1
12
@zamir_ar
Amir Zamir
3 years
The Nature of Robotics exhibition at EPFL Pavilions. Catch the last two days.
Tweet media one
@EPFLPavilions
EPFL Pavilions
3 years
Crowning a successful Nature of Robotics exhibition, EPFL Pavilions would like to invite you to a guided virtual tour with the exhibition's curator, Giulia Bini. Join us today at 6 PM on Instagram: #virtualtour #natureofrobotics #epflpavilions
Tweet media one
0
0
3
1
0
11
@zamir_ar
Amir Zamir
8 months
The work is led by @dmizrahi_ & @roman__bachmann , along with @oguzhanthefatih , @aseretys , Mingfei Gao, @aligarjani , David Griffiths, @hujm99 , @afshin_dn , @zamir_ar at @EPFL_en & @Apple . We'll present at NeurIPS, Wed at 5-7pm CST (spotlight #1022 ) 🌐 (n/n)
2
2
11
@zamir_ar
Amir Zamir
3 years
Very true for research.
@tbCvh863cMspPa
Ⓥ 🌱 🐮 🇨🇦 🇺🇦
3 years
Craving attention interferes with creativity
0
0
1
0
0
10
@zamir_ar
Amir Zamir
8 months
4M trains a single Transformer jointly on many diverse modalities. The key to making it scalable was relying on tokenization to remove modality-specific intricacies, then masking tokens from both the inputs and targets to encourage multimodal fusion & improve efficiency. (3/n)
1
2
15
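The masking recipe in the tweet above can be sketched as a toy: pool discrete tokens from all modalities, then sample small random subsets to serve as visible inputs and as prediction targets. This is an illustrative sketch only (all names are made up; it is not the 4M codebase).

```python
import random

def sample_masked_batch(modality_tokens, n_input, n_target):
    """Pool tokens from all modalities, then sample disjoint random
    subsets as model inputs and as masked prediction targets."""
    # Flatten all modality token streams into (modality, position, token) triples.
    pool = [(m, i, t) for m, toks in modality_tokens.items()
            for i, t in enumerate(toks)]
    random.shuffle(pool)
    inputs = pool[:n_input]                      # visible tokens fed to the model
    targets = pool[n_input:n_input + n_target]   # masked tokens to predict
    return inputs, targets

tokens = {"rgb": [3, 14, 15], "depth": [9, 2], "caption": [6, 5]}
inp, tgt = sample_masked_batch(tokens, n_input=4, n_target=2)
assert len(inp) == 4 and len(tgt) == 2
```

Masking both the inputs and the targets is what keeps the per-sample compute bounded regardless of how many modalities are added.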
@zamir_ar
Amir Zamir
8 months
4M can perform compositional generation by weighting different conditions by different amounts, even negatively. This allows the user to control precisely how strongly or weakly a generated output should follow each condition. (9/n)
1
2
12
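Weighted compositional conditioning of the kind described above can be illustrated with a generic guidance-style combination of logits (this is a hypothetical sketch, not the exact 4M formula): start from the unconditional prediction and push toward or away from each condition by its weight, with negative weights steering away.

```python
def combine_condition_logits(uncond, cond_logits, weights):
    """Generic guidance-style sketch: combined = uncond + sum_k w_k * (cond_k - uncond).
    Negative w_k steers the output away from that condition."""
    combined = list(uncond)
    for logits, w in zip(cond_logits, weights):
        for i in range(len(uncond)):
            combined[i] += w * (logits[i] - uncond[i])
    return combined

uncond = [0.0, 0.0]
conds = [[1.0, 0.0], [0.0, 1.0]]
out = combine_condition_logits(uncond, conds, weights=[2.0, -1.0])
assert out == [2.0, -1.0]  # pushed toward condition 1, away from condition 2
```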
@zamir_ar
Amir Zamir
1 year
@francoisfleuret Goes well in the collection of littles
Tweet media one
0
1
8
@zamir_ar
Amir Zamir
5 years
@andrey_kurenkov ImageNet pretraining doesn't work well if the task isn't based on object semantics (eg monocular 3D) or the images aren't from internet users (ie Flickr, Instagram, etc style). See the taskonomy analysis & the works that apply ImageNet models to images coming from robot onboard cameras.
1
2
7
@zamir_ar
Amir Zamir
10 months
@docmilanfar I empathize. Such itemized recipes exist because they’re tempting (to both the speaker and the audience). We like them because following them would provide a tangible path to greatness. We don’t want to believe that often there isn’t one; and the lists are usually overgeneralizations.
1
0
7
@zamir_ar
Amir Zamir
8 months
Through controlled ablations, we found that increasing the number of pre-training tasks generally improves transfer performance, got insights into the masking strategy, and observed promising scaling trends in terms of dataset and model size. (13/n)
Tweet media one
1
1
10
@zamir_ar
Amir Zamir
2 years
MultiMAE has a simple and efficient pre-training objective: mask out a large number of patches from multiple input modalities, and learn to reconstruct them from the remaining information. 2/5
Tweet media one
Tweet media two
Tweet media three
1
0
8
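The objective in the tweet above can be illustrated with an MAE-style reconstruction loss computed only over the masked patches (a minimal sketch with made-up names, not the MultiMAE implementation):

```python
def masked_reconstruction_loss(pred, target, mask):
    """Mean squared error over masked patches only: the model must
    reconstruct them from the remaining visible information."""
    errs = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(errs) / len(errs)

# Patches 1 and 3 are masked; only they contribute to the loss.
loss = masked_reconstruction_loss(
    pred=[0.0, 1.0, 0.0, 3.0], target=[9.0, 1.0, 9.0, 1.0],
    mask=[False, True, False, True])
assert loss == 2.0
```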
@zamir_ar
Amir Zamir
4 years
@colinraffel ImageNet performance is not a full representation of “learning from limited labeled data” though. The trends on other tasks (eg single image 3D) don’t quite hold up. There seem to be some ImageNet/object classification overfitting in methodologies.
1
0
8
@zamir_ar
Amir Zamir
8 months
4M gains multimodal retrieval capabilities, which were not possible with the original networks, by adding global embeddings of models like DINOv2 or ImageBind to the set of 4M modalities. 4M effectively distilled contrastive models using a more generative objective. (11/n)
Tweet media one
1
2
11
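Once a model can predict a global embedding (DINOv2/ImageBind-style) for any input modality, retrieval reduces to nearest-neighbor search in that embedding space. A minimal sketch under that assumption (the vectors here are toy stand-ins, not real embeddings):

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_emb, database):
    """Return the (name, embedding) entry most similar to the query."""
    return max(database, key=lambda item: cosine(query_emb, item[1]))

db = [("cat", [1.0, 0.0]), ("dog", [0.0, 1.0])]
name, _ = retrieve([0.9, 0.1], db)
assert name == "cat"
```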
@zamir_ar
Amir Zamir
8 months
We trained 4M on different kinds of image, semantic, and geometric metadata extracted from the pseudo labels, enabling a high degree of control over the generation process and strong potential for steerable data generation. (10/n)
1
2
10
@zamir_ar
Amir Zamir
5 years
What?? According to the Supreme Court of the United States “Using copyrighted material in a dataset that is used to train a discriminative machine-learning algorithm is perfectly legal”
1
3
7
@zamir_ar
Amir Zamir
8 months
Besides the out-of-the-box capabilities, a 4M model can also be directly used as a ViT backbone. It exhibits strong transfer performance by outperforming MAE and MultiMAE on various standard vision benchmarks. (12/n)
Tweet media one
1
1
9
@zamir_ar
Amir Zamir
8 months
4M models can output any of the modalities conditioned on any other(s). To do that, we iteratively predict and sample tokens then add them back to the input. Once all tokens from a modality are predicted, we move on to the next modality. (5/n)
1
1
10
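The iterative decoding loop described above can be sketched as follows (all names are illustrative stand-ins, not the 4M codebase): each step predicts a distribution over the token vocabulary, samples a token, and feeds it back as input until the target modality is complete.

```python
import random

def generate_modality(visible_tokens, n_missing, predict_fn):
    """Iteratively predict and sample missing tokens, adding each
    sampled token back to the input for the next step."""
    tokens = list(visible_tokens)
    for _ in range(n_missing):
        probs = predict_fn(tokens)   # {token: probability}
        vocab = list(probs)
        sampled = random.choices(vocab, weights=[probs[v] for v in vocab])[0]
        tokens.append(sampled)       # predicted token becomes input
    return tokens

# A stand-in "model" that always prefers token 7.
out = generate_modality([3, 5], n_missing=4, predict_fn=lambda t: {7: 0.9, 1: 0.1})
assert len(out) == 6 and out[:2] == [3, 5]
```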
@zamir_ar
Amir Zamir
8 months
We would need a large and diverse multimodal dataset to train such a model. Existing datasets are either too small or not diverse enough, so we instead start from image & text pairs then use off-the-shelf pseudo-labeling networks to generate the remaining modalities. (4/n)
Tweet media one
1
1
10
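The pseudo-labeling pipeline described above can be sketched as a loop over image-text pairs, where off-the-shelf networks derive each extra modality from RGB (the labeler here is a hypothetical stand-in for a real depth/normal/segmentation network):

```python
def pseudo_label_dataset(images, labelers):
    """Build a multimodal dataset by running each off-the-shelf
    'labeler' network on every RGB image."""
    dataset = []
    for img in images:
        sample = {"rgb": img}
        for name, net in labelers.items():
            sample[name] = net(img)   # e.g. predicted depth, normals, ...
        dataset.append(sample)
    return dataset

toy_labelers = {"depth": lambda im: [v * 0.1 for v in im]}
ds = pseudo_label_dataset([[1, 2]], toy_labelers)
assert ds[0]["depth"] == [0.1, 0.2]
```

The design point is that only image-text pairs need to be collected; every other modality is derived automatically.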
@zamir_ar
Amir Zamir
6 years
Though the unicorn of robotics might well be at a supermarket, construction site, or a warehouse—rather than a home. Related to the recent @nytimes article by @markoff , the piece on "The Hunt for Robot Unicorns" by @IEEESpectrum was a good read too
0
2
7
@zamir_ar
Amir Zamir
4 years
@Michael_J_Black @docmilanfar Agreed. I often tell students we have letters because some critical information is lost in common metrics and standardized tests (GPA, paper/citation count, school name, etc). That’s their purpose and they should serve it however it makes sense. Good for avoiding survivorship bias
0
0
7
@zamir_ar
Amir Zamir
8 months
4M’s any-to-any generation and in-painting capabilities enable fine-grained multimodal generation and editing tasks. Such as performing semantic edits or grounding the generation in extracted intermediate modalities. (7/n)
Tweet media one
Tweet media two
1
2
8
@zamir_ar
Amir Zamir
4 years
Technology adoption in US households, 1860 to 2019.
Tweet media one
0
0
6
@zamir_ar
Amir Zamir
8 months
This approach makes it convenient to add new modalities from diverse formats (e.g. images, sequences, neural network feature maps, etc). We already trained models that can jointly operate on 20+ modalities/tasks and are adding more. (6/n)
Tweet media one
1
2
11
@zamir_ar
Amir Zamir
4 years
The method basically augments the standard supervised learning objective w/ explicit cross-task consistency constraints. The constraints are learned from data; no need for differentiable or a priori known constraints. We start with a consistent "triangle" and extend to larger graphs.
Tweet media one
1
2
5
@zamir_ar
Amir Zamir
2 years
MultiMAE is trained *entirely using pseudo labels*, making it applicable to any RGB dataset without any annotations. It can be flexibly transferred to tasks where more than just one modality is (optionally and arbitrarily) available, with notable performance benefits. 4/5
Tweet media one
Tweet media two
1
0
6
@zamir_ar
Amir Zamir
2 years
WTF! He demanded that a non-Muslim American journalist wear a headscarf — in New York!!
@TheDailyShow
The Daily Show
2 years
"I was not in that moment as a journalist or a woman going to put a headscarf on and somehow bind myself." CNN's @amanpour on refusing to wear a headscarf for her interview with Iran's president Ebrahim Raisi
527
2K
7K
1
0
6
@zamir_ar
Amir Zamir
6 years
@tsimonite @SergeBelongie @nisselson The conclusion that the simulation-to-reality gap is about to disappear is shortsighted, IMO. The biggest obstacle #sim2real faces is not photorealistic rendering, but matching the semantic complexity of the real world in simulation. Good luck creating a full messy bedroom in simulation.
0
1
5
@zamir_ar
Amir Zamir
4 years
Cross-Task Consistency is quite useful for standard single-task learning too, not just multitask. Simple conclusion: instead of training your network to do X→Y1, train it to do X→Y1→Y2. It will fit the data better with improved Y1 predictions. We extend this to larger configs.
Tweet media one
1
1
5
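The X→Y1→Y2 recipe above can be sketched as a combined objective (function names are illustrative, not the paper's code): supervise the direct prediction, and additionally require that mapping the predicted Y1 onward to Y2 stays consistent with the true Y2.

```python
def consistency_loss(x, f_xy1, g_y1y2, y1_true, y2_true, dist):
    """Standard supervised term on Y1 plus a cross-task consistency
    term obtained by mapping the predicted Y1 onward to Y2."""
    y1_pred = f_xy1(x)
    direct = dist(y1_pred, y1_true)               # supervised X -> Y1 term
    consistency = dist(g_y1y2(y1_pred), y2_true)  # consistency X -> Y1 -> Y2 term
    return direct + consistency

# Toy scalar example: Y1 = 2x, Y2 = Y1 + 1; a perfect predictor gives zero loss.
loss = consistency_loss(
    x=3.0, f_xy1=lambda x: 2 * x, g_y1y2=lambda y: y + 1,
    y1_true=6.0, y2_true=7.0, dist=lambda a, b: abs(a - b))
assert loss == 0.0
```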
@zamir_ar
Amir Zamir
2 years
@y_m_asano @jalayrac @mcaron31 @NagraniArsha @imisra_ Great talks! Looking forward to seeing the recordings 😉
0
0
5
@zamir_ar
Amir Zamir
4 years
@MattNiessner I see. Unsurprising. There is a disproportionate focus on fixing the diversity issue close to the end of the pipeline (PhD student level, postdoc level, faculty level). That's way too late. Mostly it fixes only the cosmetics. We need to start much earlier.
1
0
5
@zamir_ar
Amir Zamir
4 years
@AjdDavison Well, just like with many other things, scaling up is one big issue🙂In terms of both scene size and the required density of images. I won’t be surprised if scaling brings in some of the classic mechanisms that are written off now. But things are moving fast in this space, so...
0
0
5
@zamir_ar
Amir Zamir
3 years
@vincesitzmann @MIT Congrats!! Long way from Bytes meetings 👏🍾
0
0
4
@zamir_ar
Amir Zamir
2 years
@wenzeljakob @merlin_ND @DelioVicini Congrats! Beautiful work done.
0
0
4
@zamir_ar
Amir Zamir
5 years
I always wondered why living organisms didn’t grow 'wheels' through evolution.
Tweet media one
0
1
4
@zamir_ar
Amir Zamir
5 years
@abigail_e_see @skynet_today 2. clickbait titles/pictures: they are probably the fastest way to get traffic but fast doesn’t mean good. Concise and descriptive > catchy and inaccurate. Be a responsible journalist/blogger/presenter, even if it costs you getting less attention in short run.
0
1
4
@zamir_ar
Amir Zamir
3 years
@JesParent @bradpwyble @ArtDeza @StateoftheartAI @iamsashasax @jitendramalik28 @silviocinguetta First time looking at taskonomy’s citation graph, thanks! 😁 BTW, taskonomy images are real, not synthetic. There are a lot of “synthetic” datasets out there, with a wide range. Eg synthesized from real scans/data (eg Gibson Env or IBR), from artists’ CAD, or others like fractals.
1
1
4
@zamir_ar
Amir Zamir
11 months
Using a closed-loop formulation is common in control theory and robotics for solving (hard) problems. RNA uses a side controller network (h) to interpret a feedback signal to adapt a given pre-trained network (f). It is implemented via inserting FiLM layers in f. 2/n
1
1
3
@zamir_ar
Amir Zamir
4 years
@fdellaert Depends if the primary goal is looks/communication or there are more functions
1
0
3
@zamir_ar
Amir Zamir
11 months
We experimented with a set of signals that are practical for real-world use. However, those signals are also imperfect, so in the paper we also perform controlled experiments using ideal signals to isolate the actual performance of RNA. 5/n
1
1
3
@zamir_ar
Amir Zamir
6 years
@fchollet Could be foveated, instead of hierarchical, too. At minimum, certain parts of biological perception prefer the fovea over an explicit hierarchy.
0
0
3
@zamir_ar
Amir Zamir
3 years
@TrackingActions @ICepfl They were great exams and discussions! Credit goes to Roman and Onur for the job ;)
1
0
3
@zamir_ar
Amir Zamir
11 months
The experiments are on several tasks, eg depth, semantic segmentation, 3D reconstruction, ImageNet, & on a range of distribution shifts. We also provide a discussion on the landscape of related formulations. Joint w/ @aseretys , @oguzhanthefatih , Zahra n/n
Tweet media one
0
1
3
@zamir_ar
Amir Zamir
4 years
@zacharylipton @IBM What’s the “AI” in there? I read multiple articles (by @IBM & others) and this seems mostly like a database integration. The fact that they keep shoving the word “AI” in it to get attention and turn it into a PR campaign is extra alarming if this really benefits the less fortunate
2
0
2
@zamir_ar
Amir Zamir
4 years
@igubins It was just a random 0.25% sample of the full training dataset. The goal was to evaluate whether the trends hold under a low data regime too. We didn't think about putting the sample indexes on Github. We could. I believe any iid random sample would do.
1
1
3
@zamir_ar
Amir Zamir
4 years
@colinraffel Talk titles are even more amazing!! "Learning Internal Reps From Multiple Tasks", "Identifying Relevant Tasks", "Where is Multitask Learning Useful?", "Combining supervised and unsupervised learning, where do we go from here?", "Continual Learning"
0
0
3
@zamir_ar
Amir Zamir
2 years
@jinayoon_ Maybe add Europe and the rest of the world besides North America? 😉
0
1
3
@zamir_ar
Amir Zamir
11 months
The side network h has ~5-20% of the number of parameters of f. It is trained to predict how f should be updated -- so it amortizes the optimization (takes only a feedforward pass), making it much (~30x) faster than performing test-time optimization using SGD (TTO). 3/n
1
1
3
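FiLM itself, in its standard form, is just a per-channel scale and shift of intermediate features. In RNA, as described above, the side network h would predict the scale/shift from the feedback signal; in this minimal sketch they are supplied directly for illustration.

```python
def film(features, gamma, beta):
    """FiLM modulation: per-channel affine transform of features,
    out_c = gamma_c * feat_c + beta_c."""
    return [g * f + b for f, g, b in zip(features, gamma, beta)]

out = film([1.0, 2.0], gamma=[0.5, 2.0], beta=[1.0, -1.0])
assert out == [1.5, 3.0]
```

Because only the small side network is trained and adaptation is a single feedforward pass, this is what makes the amortized update much cheaper than per-sample test-time optimization.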
@zamir_ar
Amir Zamir
2 years
@georgiagkioxari @jbhuang0604 And off-the-shelf single-view 3D methods aren’t too bad either
Tweet media one
1
0
3
@zamir_ar
Amir Zamir
4 years
1
0
3
@zamir_ar
Amir Zamir
5 years
@andrey_kurenkov @Bschulz5 @elonmusk @skynet_today Scaling is easier than inventing. If we know how to make AGI, likely 2xAGI or 10xAGI is quick, so @elonmusk might be right on that. But the missing piece rn is the G in AGI. And I suspect we're inconceivably far from it. Otherwise nX human-level is already here for narrow tasks.
1
1
3
@zamir_ar
Amir Zamir
4 years
If you found inaccuracies in the table according to your experience, consider reporting the error directly to the source so they can update their stats: administration@informatics-europe.org. I sent them an email inviting them to look at the reported inaccuracies in this thread.
0
0
3
@zamir_ar
Amir Zamir
4 years
@_herobotics_ @fdellaert Gibson website is a good one. Especially the dataset page
0
0
3
@zamir_ar
Amir Zamir
6 years
@hardmaru @erwincoumans A (quantitative) answer to the generalization question through a study is brewing. Sneak peek: perception and dynamics aspects should be viewed and analyzed separately wrt generalization. Their generalization traits don't appear to correlate strongly. (opportunity or threat?)
0
0
2
@zamir_ar
Amir Zamir
4 years
Now, phone cases with tiny legs!
0
0
2
@zamir_ar
Amir Zamir
4 years
@LauTor83 @ArnoutDevos brought that up, and I reported the error to the source a bit ago. My PhD students don't quite get that total amount either, but it seems the reported numbers for all countries are higher (eg comments about Germany). Some tax/employment-rate adjustment might be in play?
0
0
2
@zamir_ar
Amir Zamir
1 year
Very cute! Is it still necessary to capture (many) pixels/photons given powerful generative models? 🧵
Tweet media one
Tweet media two
@BjoernKarmann
Bjørn Karmann
1 year
Introducing – Paragraphica! 📡📷 A camera that takes photos using location data. It describes the place you are at and then converts it into an AI-generated "photo". See more here: or try to take your own photo here:
1K
5K
23K
1
0
2