Amir Zamir

@zamir_ar

Followers
4,466
Following
634
Media
72
Statuses
867
Pinned Tweet
@zamir_ar
Amir Zamir
2 months
We are releasing 4M-21 with a permissive license, including its source code and trained models. It's a pretty effective multimodal model that solves 10s of tasks & modalities. See the demo code, sample results, and the tokenizers of diverse modalities on the website. IMO, the
@zamir_ar
Amir Zamir
8 months
We are releasing the 1st version of 4M, a framework for training multimodal foundation models across tens of modalities & tasks, based on scalable masked modeling. Joint effort by @EPFL_en & @Apple . 4M: Massively Multimodal Masked Modeling 🌐 🧵1/n
Tweet media one
8
138
616
6
88
323
@zamir_ar
Amir Zamir
4 years
Salaries of PhD students, postdocs, and professors in Europe and Switzerland. Numbers are in euros. Of special interest to future students and faculty 😜
Tweet media one
55
146
707
@zamir_ar
Amir Zamir
8 months
We are releasing the 1st version of 4M, a framework for training multimodal foundation models across tens of modalities & tasks, based on scalable masked modeling. Joint effort by @EPFL_en & @Apple . 4M: Massively Multimodal Masked Modeling 🌐 🧵1/n
Tweet media one
8
138
616
@zamir_ar
Amir Zamir
5 years
Is it a good idea to train RL policies from raw pixels? Could visual priors about the world help RL? We just released the code of our Mid-Level Vision paper addressing these questions. Spoiler: using raw pixels doesn’t generalize! Play with the results at
0
110
382
@zamir_ar
Amir Zamir
8 months
I will hire PhD students and postdocs, especially in large multimodal models and related areas.
@zamir_ar
Amir Zamir
8 months
We are releasing the 1st version of 4M, a framework for training multimodal foundation models across tens of modalities & tasks, based on scalable masked modeling. Joint effort by @EPFL_en & @Apple . 4M: Massively Multimodal Masked Modeling 🌐 🧵1/n
Tweet media one
8
138
616
8
41
337
@zamir_ar
Amir Zamir
2 years
We present MultiMAE at #ECCV2022 on Wed. MultiMAE is a general multi-modal & multi-task pre-training strategy based on masked autoencoders. It shows notable results in cross-modal representation learning and transfer learning. 1/5
2
45
267
@zamir_ar
Amir Zamir
2 years
I’m honored! The recognition goes to all of my collaborators, as much as it does to me. Thank you! 🙏 @eccvconf
@sitzikbs
Yizhak Ben-Shabat (Itzik) 💔
2 years
Second young researcher award. Congrats @zamir_ar
Tweet media one
1
0
7
29
9
247
@zamir_ar
Amir Zamir
4 years
Progress on #consistency & #multitask learning. Existing methods give inconsistent results across tasks, even when jointly trained. We developed a general method for learning w/ Cross-Task Consistency. It gave notable gains for anything we tried. Live #demo :
3
58
240
@zamir_ar
Amir Zamir
2 years
Happy to share that CLIPasso will receive one of the best paper awards at #SIGGRAPH 2022. Congrats to the entire team! CLIP turned out to be a powerful perceptual loss.
Tweet media one
@YVinker
Yael Vinker🎗
2 years
Very excited to share that CLIPasso was selected as one of the five Best Paper Award winners at #SIGGRAPH this year!🎨🎉🏆 It is such a great honor!! Special thanks to all my teammates @Esychology @jbo_twt @roman__bachmann @DanielCohenOr1 @bermano_h @zamir_ar and Arik Shamir
2
13
65
3
29
178
@zamir_ar
Amir Zamir
3 years
We released OMNIDATA: a pipeline for creating steerable vision datasets. It gives the user control over generating the desired dataset using real-world 3D scans. It bridges vision datasets (pre-recorded data) and simulators (online data generation). Demo:
@iamsashasax
Sasha (Alexander) Sax
3 years
Vision datasets (i.e. ImageNet) are usually collected once for a fixed task. But how do we know the choice of camera intrinsics, tasks, etc. is a good one? Our ICCV paper on “steerable datasets” addresses this problem and gets 'human-level' surface normal preds along the way (1/3)
1
9
40
5
29
157
@zamir_ar
Amir Zamir
11 months
Is it possible to adapt a neural network on the fly at the test time to cope with distribution shifts? RNA does precisely that by creating a closed-loop feedback system. We will present it on Wed afternoon at @ICCVConference . 1/n
2
21
139
@zamir_ar
Amir Zamir
5 years
Next time someone tells you reaching "human-level" at task X is the holy grail in AI, show them this video. All it takes is making the task narrow enough and there is a way to brutally outperform humans already. Being as *broad* as humans/animals is the challenge.
@hardmaru
hardmaru
5 years
Teams of high school students built bottle-flipping robots for RoboCon 2018 in Japan
34
870
3K
3
26
84
@zamir_ar
Amir Zamir
2 years
I will hire again from the Summer @EPFL program this year. Several great projects came out of Summer @EPFL interns in the past, eg CLIPasso (SIGGRAPH22 best paper) and Omnidata (ICCV21). Apply if our interests align. (this is for BS/MS interns. PhD visitors have another program)
@ICepfl
EPFL Computer and Communication Sciences
2 years
The Summer @EPFL 2023 application site is now open! 🎊 To apply, please visit the Summer @EPFL website: . The application deadline for all students is the Sunday closest to the 1st of December (anywhere on earth).
Tweet media one
0
11
35
3
14
76
@zamir_ar
Amir Zamir
4 years
OpenBot is a step in the right direction. Massively scalable robotic platforms are great. I dream of an army of little (harmless!) robots running around visually exploring and making sense of the world.
2
7
71
@zamir_ar
Amir Zamir
8 months
We'll present at NeurIPS, today at 5pm CST. Spotlight #1022 . Effectively bringing sensory modalities to large models is one way to make them more grounded, and ultimately have a more complete World Model. This is a step in that direction hopefully, and more will come.
@zamir_ar
Amir Zamir
8 months
4M appears to have learned a solid cross-modal representation. We can use the various modalities to probe how 4M reconciles unusual inputs by manipulating one part of the input while keeping the remainder fixed. (8/n)
2
4
18
1
8
71
@zamir_ar
Amir Zamir
6 years
Gibson Database of Spaces includes 572 buildings, 1,447 floors, and >2 million ft². All real buildings, scanned and #3D reconstructed. Worth a few years of human visual experience. Browse the spaces by videos & 3D cuts: #perception #robotics #dataset #vision
Tweet media one
1
19
61
@zamir_ar
Amir Zamir
4 years
The point isn’t making big $$$ as a student/postdoc, but to live comfortably enough to focus on research rather than financial preoccupation. Especially if supporting a family. I think the general picture of the table remains true after considering living expenses and variance
5
1
60
@zamir_ar
Amir Zamir
9 months
I will certainly hire again from the Summer @EPFL program this year.
@martin_schrimpf
Martin Schrimpf
9 months
Applications now open for the Summer @EPFL program -- 3-month fellowship for Bachelor/Master students to immerse yourself in research
Tweet media one
6
38
154
0
2
51
@zamir_ar
Amir Zamir
5 years
Exactly 3 years ago we proposed to #CVPR with @ozansener . Today, glad to see the @nature article on the importance of negative results. “one of the worst aspects of science today: its toxic definitions of success”.
0
12
40
@zamir_ar
Amir Zamir
1 year
Visual odometry is a basic function for embodied AI. At #CVPR23 we will present a multi-modal & modality-invariant visual odometry framework called Visual Odometry Transformer (VOT). I will also give a talk on multi-modal learning across several projects at the MultiEarth workshop on Mon. 🧵
1
8
38
@zamir_ar
Amir Zamir
1 year
Tomorrow at @CVPR , I'll give a talk about recent works on multi-modal and multi-task masked modeling for creating vision foundation models. 1:45 PM @ West 109 - 110
Tweet media one
0
2
33
@zamir_ar
Amir Zamir
4 years
Learning with cross-task consistency was one of the #CVPR 2020 best paper award nominees. Congrats to the entire team at  @StanfordAILab @berkeley_ai   @ICepfl . And congrats to the winning paper by our  @Oxford_VGG colleagues.
@zamir_ar
Amir Zamir
4 years
Progress on #consistency & #multitask learning. Existing methods give inconsistent results across tasks, even when jointly trained. We developed a general method for learning w/ Cross-Task Consistency. It gave notable gains for anything we tried. Live #demo :
3
58
240
0
3
29
@zamir_ar
Amir Zamir
6 years
This gem never gets old. Great for a break from arxiv. It’s remarkable how much jargon education, and how little critical-thinking training, we receive in AI today. Watch the first minute and you’ll be sold. Science wisdom by @ProfFeynman .
0
4
28
@zamir_ar
Amir Zamir
2 years
We will present Task Discovery at #NeurIPS on Thur. Large NNs are known to fit any *training* labels. But learning from what labels would lead to *generalization*? Can we find such labels/tasks for an unlabeled dataset automatically? What would they mean?
Tweet media one
@andrew_atanov
Andrei Atanov
2 years
What are the tasks that a neural net generalizes on? In our #NeurIPS2022 paper, we introduce a Task Discovery 🔎 framework to approach this question and automatically find such tasks. We show how such tasks look and what they reveal about NNs. 🌐 🧵1/9
Tweet media one
1
10
52
1
1
26
@zamir_ar
Amir Zamir
2 years
Classical sampling-based planning algorithms in robotics (eg RRT, PRM) are efficient, performant & interpretable. Are they useful in learning-based frameworks? PALMER ( #NeurIPS22 , #CoRL22 ) shows they can be effectively repurposed for learning-based frameworks & representations 🧵
1
3
25
@zamir_ar
Amir Zamir
3 years
Predecessor of color bias in ML datasets
@voxdotcom
Vox
7 years
Color film was designed for white people. Here's what it did to dark skin:
32
2K
3K
0
10
25
@zamir_ar
Amir Zamir
5 years
Gibson environment's ~600 building meshes rendered directly in the PyBullet physics engine! FPS >5000! Great work by @erwincoumans . Check here if you want to visit inside these buildings: . Erwin's PyBullet rendering:
@erwincoumans
Erwin Coumans 🇺🇦
5 years
>5000 FPS indoor rendering in PyBullet, using beautiful scanned assets and texture atlas from the Stanford Gibson project:
2
4
36
0
2
18
@zamir_ar
Amir Zamir
2 years
For a live demo, interactive visualizations, code, and the summary video, see . If you’re attending #ECCV2022 , come chat on Wednesday afternoon, w/ @roman__bachmann @dmizrahi_ @andrew_atanov . 5/5
Tweet media one
1
0
17
@zamir_ar
Amir Zamir
8 months
There have been demos of “multimodal foundation model” results – but one with a demonstrable deep & broad understanding of the input like 4M’s is unprecedented. It’s not an image+text conversational model, but one that extracts a deeper understanding of the scene. (2/n)
Tweet media one
1
2
21
@zamir_ar
Amir Zamir
5 years
If you want to see more than #turtlebot and two-finger gripper arms, Jamie Paik @robotician gave a keynote talk with fun videos at #CoRL 2019 on soft robotics and intuitive interactions.
0
3
16
@zamir_ar
Amir Zamir
4 years
Tiny Images dataset (>1700 citations) was permanently taken down, due to the (unintended) inclusion of inappropriate language and images, found by Prabhu & Birhane. Clearly, everything we do (and did) in computer vision is now under much greater scrutiny!
Tweet media one
1
0
16
@zamir_ar
Amir Zamir
3 years
The poster session of Cross-Domain Ensembles (ICCV oral) is in 1.5 hours 🙂
@aseretys
Teresa Yeo
3 years
We introduce a general approach for enforcing diversity in ensembles. It leads to notable improvements in #robustness on a wide range of tasks and datasets for #adversarial and non-adversarial shifts. Joint work with @oguzhanthefatih and @zamir_ar Website:
2
3
30
0
5
15
@zamir_ar
Amir Zamir
5 years
#Cycle -consistency by #Rumi ? “Appear as you are & Be as you appear”
Tweet media one
1
0
15
@zamir_ar
Amir Zamir
5 years
Adversarial t-shirts are coming?
1
2
13
@zamir_ar
Amir Zamir
2 years
@zdeborova 5 mins. High level.
0
0
13
@zamir_ar
Amir Zamir
2 years
Via this objective, MultiMAE learns cross-modal predictive coding. The video showcases an example where we input only depth & two RGB patches. The hue of one patch is being changed. The model propagates the colors semantically and according to depth. More examples on the webpage. 3/5
1
4
13
@zamir_ar
Amir Zamir
4 years
@Elnaz_AK Indeed. But I have lived in the Bay Area and Switzerland is not bad at all. And at least I can see some benefit out of my extra payments 😉
1
0
13
@zamir_ar
Amir Zamir
8 months
4M appears to have learned a solid cross-modal representation. We can use the various modalities to probe how 4M reconciles unusual inputs by manipulating one part of the input while keeping the remainder fixed. (8/n)
2
4
18
@zamir_ar
Amir Zamir
6 years
New York Times @nytimes article on home robotics, failures of the past, and (not-so-low-hanging) potentials for the future. Covered our Gibson environment too. "What Comes After the Roomba?"
1
0
11
@zamir_ar
Amir Zamir
5 years
No interaction with the world yet, but clearly some nontrivial muscle control and behavior is present. Always interesting to contemplate how much cognitive and control bias we are born with, before any learning occurs.
@TerriGreenUSA
Terri Green
5 years
@Alyssa_Milano @BrianKempGA This incredible 4D scan captured footage of what unborn fetuses do in the womb.
4
8
20
0
1
12
@zamir_ar
Amir Zamir
3 years
The Nature of Robotics exhibition at EPFL Pavilions. Catch the last two days.
Tweet media one
@EPFLPavilions
EPFL Pavilions
3 years
Crowning a successful Nature of Robotics exhibition, EPFL Pavilions would like to invite you to a guided virtual tour with the exhibition's curator, Giulia Bini. Join us today at 6 PM on Instagram: #virtualtour #natureofrobotics #epflpavilions
Tweet media one
0
0
3
1
0
11
@zamir_ar
Amir Zamir
8 months
The work is led by @dmizrahi_ & @roman__bachmann , along with @oguzhanthefatih , @aseretys , Mingfei Gao, @aligarjani , David Griffiths, @hujm99 , @afshin_dn , @zamir_ar at @EPFL_en & @Apple . We'll present at NeurIPS, Wed at 5-7pm CST (spotlight #1022 ) 🌐 (n/n)
2
2
11
@zamir_ar
Amir Zamir
3 years
Very true for research.
@tbCvh863cMspPa
Ⓥ 🌱 🐮 🇨🇦 🇺🇦
3 years
Craving attention interferes with creativity
0
0
1
0
0
10
@zamir_ar
Amir Zamir
8 months
4M trains a single Transformer jointly on many diverse modalities. The key to making it scalable was relying on tokenization to remove modality-specific intricacies, then masking tokens from both the inputs and targets to encourage multimodal fusion & improve efficiency. (3/n)
1
2
15
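The masking recipe in the tweet above can be sketched as a toy: pool discrete tokens from all modalities, then sample small random subsets to serve as visible inputs and as prediction targets. This is an illustrative sketch only (all names are made up; it is not the 4M codebase).

```python
import random

def sample_masked_batch(modality_tokens, n_input, n_target):
    """Pool tokens from all modalities, then sample disjoint random
    subsets as model inputs and as masked prediction targets."""
    # Flatten all modality token streams into (modality, position, token) triples.
    pool = [(m, i, t) for m, toks in modality_tokens.items()
            for i, t in enumerate(toks)]
    random.shuffle(pool)
    inputs = pool[:n_input]                      # visible tokens fed to the model
    targets = pool[n_input:n_input + n_target]   # masked tokens to predict
    return inputs, targets

tokens = {"rgb": [3, 14, 15], "depth": [9, 2], "caption": [6, 5]}
inp, tgt = sample_masked_batch(tokens, n_input=4, n_target=2)
assert len(inp) == 4 and len(tgt) == 2
```

Masking both the inputs and the targets is what keeps the per-sample compute bounded regardless of how many modalities are added.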
@zamir_ar
Amir Zamir
8 months
4M can perform compositional generation by weighting different conditions by different amounts, even negatively. This allows the user to control precisely how strongly or weakly a generated output should follow each condition. (9/n)
1
2
12
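Weighted compositional conditioning of the kind described above can be illustrated with a generic guidance-style combination of logits (this is a hypothetical sketch, not the exact 4M formula): start from the unconditional prediction and push toward or away from each condition by its weight, with negative weights steering away.

```python
def combine_condition_logits(uncond, cond_logits, weights):
    """Generic guidance-style sketch: combined = uncond + sum_k w_k * (cond_k - uncond).
    Negative w_k steers the output away from that condition."""
    combined = list(uncond)
    for logits, w in zip(cond_logits, weights):
        for i in range(len(uncond)):
            combined[i] += w * (logits[i] - uncond[i])
    return combined

uncond = [0.0, 0.0]
conds = [[1.0, 0.0], [0.0, 1.0]]
out = combine_condition_logits(uncond, conds, weights=[2.0, -1.0])
assert out == [2.0, -1.0]  # pushed toward condition 1, away from condition 2
```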
@zamir_ar
Amir Zamir
1 year
@francoisfleuret Goes well in the collection of littles
Tweet media one
0
1
8
@zamir_ar
Amir Zamir
5 years
@andrey_kurenkov ImageNet pretraining doesn't work well if the task isn't based on object semantics (eg monocular 3D) or the images aren't from internet users (ie Flickr, Instagram, etc style). See the taskonomy analysis & the works that apply ImageNet models to images coming from robot onboard cameras.
1
2
7
@zamir_ar
Amir Zamir
10 months
@docmilanfar I empathize. Such itemized recipes exist because they’re tempting (to both the speaker and the audience). We like them because following them would provide a tangible path to greatness. We don’t want to believe that often there isn’t one; and the lists are usually overgeneralizations.
1
0
7
@zamir_ar
Amir Zamir
8 months
Through controlled ablations, we found that increasing the number of pre-training tasks generally improves transfer performance, got insights into the masking strategy, and observed promising scaling trends in terms of dataset and model size. (13/n)
Tweet media one
1
1
10
@zamir_ar
Amir Zamir
2 years
MultiMAE has a simple and efficient pre-training objective: mask out a large number of patches from multiple input modalities, and learn to reconstruct them from the remaining information. 2/5
Tweet media one
Tweet media two
Tweet media three
1
0
8
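The objective in the tweet above can be illustrated with an MAE-style reconstruction loss computed only over the masked patches (a minimal sketch with made-up names, not the MultiMAE implementation):

```python
def masked_reconstruction_loss(pred, target, mask):
    """Mean squared error over masked patches only: the model must
    reconstruct them from the remaining visible information."""
    errs = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(errs) / len(errs)

# Patches 1 and 3 are masked; only they contribute to the loss.
loss = masked_reconstruction_loss(
    pred=[0.0, 1.0, 0.0, 3.0], target=[9.0, 1.0, 9.0, 1.0],
    mask=[False, True, False, True])
assert loss == 2.0
```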
@zamir_ar
Amir Zamir
4 years
@colinraffel ImageNet performance is not a full representation of “learning from limited labeled data” though. The trends on other tasks (eg single image 3D) don’t quite hold up. There seem to be some ImageNet/object classification overfitting in methodologies.
1
0
8
@zamir_ar
Amir Zamir
8 months
4M gains multimodal retrieval capabilities, which were not possible with the original networks, by adding global embeddings of models like DINOv2 or ImageBind to the set of 4M modalities. 4M effectively distilled contrastive models using a more generative objective. (11/n)
Tweet media one
1
2
11
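Once a model can predict a global embedding (DINOv2/ImageBind-style) for any input modality, retrieval reduces to nearest-neighbor search in that embedding space. A minimal sketch under that assumption (the vectors here are toy stand-ins, not real embeddings):

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_emb, database):
    """Return the (name, embedding) entry most similar to the query."""
    return max(database, key=lambda item: cosine(query_emb, item[1]))

db = [("cat", [1.0, 0.0]), ("dog", [0.0, 1.0])]
name, _ = retrieve([0.9, 0.1], db)
assert name == "cat"
```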
@zamir_ar
Amir Zamir
8 months
We trained 4M on different kinds of image, semantic, and geometric metadata extracted from the pseudo labels, enabling a high degree of control over the generation process and strong potential for steerable data generation. (10/n)
1
2
10
@zamir_ar
Amir Zamir
5 years
What?? According to the Supreme Court of the United States “Using copyrighted material in a dataset that is used to train a discriminative machine-learning algorithm is perfectly legal”
1
3
7
@zamir_ar
Amir Zamir
8 months
Besides the out-of-the-box capabilities, a 4M model can also be directly used as a ViT backbone. It exhibits strong transfer performance by outperforming MAE and MultiMAE on various standard vision benchmarks. (12/n)
Tweet media one
1
1
9
@zamir_ar
Amir Zamir
8 months
4M models can output any of the modalities conditioned on any other(s). To do that, we iteratively predict and sample tokens then add them back to the input. Once all tokens from a modality are predicted, we move on to the next modality. (5/n)
1
1
10
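The iterative decoding loop described above can be sketched as follows (all names are illustrative stand-ins, not the 4M codebase): each step predicts a distribution over the token vocabulary, samples a token, and feeds it back as input until the target modality is complete.

```python
import random

def generate_modality(visible_tokens, n_missing, predict_fn):
    """Iteratively predict and sample missing tokens, adding each
    sampled token back to the input for the next step."""
    tokens = list(visible_tokens)
    for _ in range(n_missing):
        probs = predict_fn(tokens)   # {token: probability}
        vocab = list(probs)
        sampled = random.choices(vocab, weights=[probs[v] for v in vocab])[0]
        tokens.append(sampled)       # predicted token becomes input
    return tokens

# A stand-in "model" that always prefers token 7.
out = generate_modality([3, 5], n_missing=4, predict_fn=lambda t: {7: 0.9, 1: 0.1})
assert len(out) == 6 and out[:2] == [3, 5]
```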
@zamir_ar
Amir Zamir
8 months
We would need a large and diverse multimodal dataset to train such a model. Existing datasets are either too small or not diverse enough, so we instead start from image & text pairs then use off-the-shelf pseudo-labeling networks to generate the remaining modalities. (4/n)
Tweet media one
1
1
10
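The pseudo-labeling pipeline described above can be sketched as a loop over image-text pairs, where off-the-shelf networks derive each extra modality from RGB (the labeler here is a hypothetical stand-in for a real depth/normal/segmentation network):

```python
def pseudo_label_dataset(images, labelers):
    """Build a multimodal dataset by running each off-the-shelf
    'labeler' network on every RGB image."""
    dataset = []
    for img in images:
        sample = {"rgb": img}
        for name, net in labelers.items():
            sample[name] = net(img)   # e.g. predicted depth, normals, ...
        dataset.append(sample)
    return dataset

toy_labelers = {"depth": lambda im: [v * 0.1 for v in im]}
ds = pseudo_label_dataset([[1, 2]], toy_labelers)
assert ds[0]["depth"] == [0.1, 0.2]
```

The design point is that only image-text pairs need to be collected; every other modality is derived automatically.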
@zamir_ar
Amir Zamir
6 years
Though the unicorn of robotics might well be at a supermarket, construction site, or a warehouse—rather than a home. Related to the recent @nytimes article by @markoff , the piece on "The Hunt for Robot Unicorns" by @IEEESpectrum was a good read too
0
2
7
@zamir_ar
Amir Zamir
4 years
@Michael_J_Black @docmilanfar Agreed. I often tell students we have letters because some critical information is lost in common metrics and standardized tests (GPA, paper/citation count, school name, etc). That’s their purpose and they should serve it however it makes sense. Good for avoiding survivorship bias
0
0
7
@zamir_ar
Amir Zamir
8 months
4M’s any-to-any generation and in-painting capabilities enable fine-grained multimodal generation and editing tasks. Such as performing semantic edits or grounding the generation in extracted intermediate modalities. (7/n)
Tweet media one
Tweet media two
1
2
8
@zamir_ar
Amir Zamir
4 years
Technology adoption in US households, 1860 to 2019.
Tweet media one
0
0
6
@zamir_ar
Amir Zamir
8 months
This approach makes it convenient to add new modalities from diverse formats (e.g. images, sequences, neural network feature maps, etc). We already trained models that can jointly operate on 20+ modalities/tasks and are adding more. (6/n)
Tweet media one
1
2
11
@zamir_ar
Amir Zamir
4 years
The method basically augments the standard supervised learning objective w/ explicit cross-task consistency constraints. The constraints are learned from data; no need for differentiable or a priori known constraints. We start with a consistent "triangle" and extend to larger graphs.
Tweet media one
1
2
5
@zamir_ar
Amir Zamir
2 years
MultiMAE is trained *entirely using pseudo labels*, making it applicable to any RGB dataset without any annotations. It can be flexibly transferred to tasks where more than just one modality is (optionally and arbitrarily) available, with notable performance benefits. 4/5
Tweet media one
Tweet media two
1
0
6
@zamir_ar
Amir Zamir
2 years
WTF! He demanded that a non-Muslim American journalist wear a headscarf — in New York!!
@TheDailyShow
The Daily Show
2 years
"I was not in that moment as a journalist or a woman going to put a headscarf on and somehow bind myself." CNN's @amanpour on refusing to wear a headscarf for her interview with Iran's president Ebrahim Raisi
527
2K
7K
1
0
6
@zamir_ar
Amir Zamir
6 years
@tsimonite @SergeBelongie @nisselson The conclusion that the simulation-to-reality gap is about to disappear is shortsighted, IMO. The biggest obstacle #sim2real faces is not photorealistic rendering, but matching the semantic complexity of the real world in simulation. Good luck creating a full messy bedroom in simulation.
0
1
5
@zamir_ar
Amir Zamir
4 years
Cross-Task Consistency is quite useful for standard single-task learning too, not just multitask. Simple conclusion: instead of training your network to do X→Y1, train it to do X→Y1→Y2. It will fit the data better with improved Y1 predictions. We extend this to larger configs.
Tweet media one
1
1
5
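The X→Y1→Y2 recipe above can be sketched as a combined objective (function names are illustrative, not the paper's code): supervise the direct prediction, and additionally require that mapping the predicted Y1 onward to Y2 stays consistent with the true Y2.

```python
def consistency_loss(x, f_xy1, g_y1y2, y1_true, y2_true, dist):
    """Standard supervised term on Y1 plus a cross-task consistency
    term obtained by mapping the predicted Y1 onward to Y2."""
    y1_pred = f_xy1(x)
    direct = dist(y1_pred, y1_true)               # supervised X -> Y1 term
    consistency = dist(g_y1y2(y1_pred), y2_true)  # consistency X -> Y1 -> Y2 term
    return direct + consistency

# Toy scalar example: Y1 = 2x, Y2 = Y1 + 1; a perfect predictor gives zero loss.
loss = consistency_loss(
    x=3.0, f_xy1=lambda x: 2 * x, g_y1y2=lambda y: y + 1,
    y1_true=6.0, y2_true=7.0, dist=lambda a, b: abs(a - b))
assert loss == 0.0
```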
@zamir_ar
Amir Zamir
2 years
@y_m_asano @jalayrac @mcaron31 @NagraniArsha @imisra_ Great talks! Looking forward to seeing the recordings 😉
0
0
5
@zamir_ar
Amir Zamir
4 years
@MattNiessner I see. Unsurprising. There is a disproportionate focus on fixing the diversity issue close to the end of the pipeline (PhD student level, postdoc level, faculty level). That's way too late. Mostly it fixes only the cosmetics. We need to start much earlier.
1
0
5
@zamir_ar
Amir Zamir
4 years
@AjdDavison Well, just like with many other things, scaling up is one big issue🙂In terms of both scene size and the required density of images. I won’t be surprised if scaling brings in some of the classic mechanisms that are written off now. But things are moving fast in this space, so...
0
0
5
@zamir_ar
Amir Zamir
3 years
@vincesitzmann @MIT Congrats!! Long way from Bytes meetings 👏🍾
0
0
4
@zamir_ar
Amir Zamir
2 years
@wenzeljakob @merlin_ND @DelioVicini Congrats! Beautiful work done.
0
0
4
@zamir_ar
Amir Zamir
5 years
I always wondered why living organisms didn’t grow 'wheels' through evolution.
Tweet media one
0
1
4
@zamir_ar
Amir Zamir
5 years
@abigail_e_see @skynet_today 2. clickbait titles/pictures: they are probably the fastest way to get traffic but fast doesn’t mean good. Concise and descriptive > catchy and inaccurate. Be a responsible journalist/blogger/presenter, even if it costs you getting less attention in short run.
0
1
4
@zamir_ar
Amir Zamir
3 years
@JesParent @bradpwyble @ArtDeza @StateoftheartAI @iamsashasax @jitendramalik28 @silviocinguetta First time looking at taskonomy’s citation graph, thanks! 😁 BTW, taskonomy images are real, not synthetic. There are a lot of “synthetic” datasets out there, with a wide range. Eg synthesized from real scans/data (eg Gibson Env or IBR), from artists’ CAD, or others like fractals.
1
1
4
@zamir_ar
Amir Zamir
11 months
Using a closed-loop formulation is common in control theory and robotics for solving (hard) problems. RNA uses a side controller network (h) to interpret a feedback signal to adapt a given pre-trained network (f). It is implemented via inserting FiLM layers in f. 2/n
1
1
3
@zamir_ar
Amir Zamir
4 years
@fdellaert Depends if the primary goal is looks/communication or there are more functions
1
0
3
@zamir_ar
Amir Zamir
11 months
We experimented with a set of signals that are practical for real-world use. However, those signals are also imperfect, so in the paper we also perform controlled experiments using ideal signals to isolate the actual performance of RNA. 5/n
1
1
3
@zamir_ar
Amir Zamir
6 years
@fchollet Could be foveated, instead of hierarchical, too. At minimum, certain parts of biological perception prefer the fovea over an explicit hierarchy.
0
0
3
@zamir_ar
Amir Zamir
3 years
@TrackingActions @ICepfl They were great exams and discussions! Credit goes to Roman and Onur for the job ;)
1
0
3
@zamir_ar
Amir Zamir
11 months
The experiments are on several tasks, eg depth, semantic segmentation, 3D reconstruction, ImageNet, & on a range of distribution shifts. We also provide a discussion on the landscape of related formulations. Joint w/ @aseretys , @oguzhanthefatih , Zahra n/n
Tweet media one
0
1
3
@zamir_ar
Amir Zamir
4 years
@zacharylipton @IBM What’s the “AI” in there? I read multiple articles (by @IBM & others) and this seems mostly like a database integration. The fact that they keep shoving the word “AI” in it to get attention and turn it into a PR campaign is extra alarming if this really benefits the less fortunate
2
0
2
@zamir_ar
Amir Zamir
4 years
@igubins It was just a random 0.25% sample of the full training dataset. The goal was to evaluate whether the trends hold under a low data regime too. We didn't think about putting the sample indexes on Github. We could. I believe any iid random sample would do.
1
1
3
@zamir_ar
Amir Zamir
4 years
@colinraffel Talk titles are even more amazing!! "Learning Internal Reps From Multiple Tasks", "Identifying Relevant Tasks", "Where is Multitask Learning Useful?", "Combining supervised and unsupervised learning, where do we go from here?", "Continual Learning"
0
0
3
@zamir_ar
Amir Zamir
2 years
@jinayoon_ Maybe add Europe and the rest of the world besides North America? 😉
0
1
3
@zamir_ar
Amir Zamir
11 months
The side network h has ~5-20% of the number of parameters of f. It is trained to predict how f should be updated -- so it amortizes the optimization (takes only a feedforward pass), making it much (~30x) faster than performing test-time optimization using SGD (TTO). 3/n
1
1
3
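FiLM itself, in its standard form, is just a per-channel scale and shift of intermediate features. In RNA, as described above, the side network h would predict the scale/shift from the feedback signal; in this minimal sketch they are supplied directly for illustration.

```python
def film(features, gamma, beta):
    """FiLM modulation: per-channel affine transform of features,
    out_c = gamma_c * feat_c + beta_c."""
    return [g * f + b for f, g, b in zip(features, gamma, beta)]

out = film([1.0, 2.0], gamma=[0.5, 2.0], beta=[1.0, -1.0])
assert out == [1.5, 3.0]
```

Because only the small side network is trained and adaptation is a single feedforward pass, this is what makes the amortized update much cheaper than per-sample test-time optimization.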
@zamir_ar
Amir Zamir
2 years
@georgiagkioxari @jbhuang0604 And off-the-shelf single-view 3D methods aren’t too bad either
Tweet media one
1
0
3
@zamir_ar
Amir Zamir
4 years
1
0
3
@zamir_ar
Amir Zamir
5 years
@andrey_kurenkov @Bschulz5 @elonmusk @skynet_today Scaling is easier than inventing. If we know how to make AGI, likely 2xAGI or 10xAGI is quick, so @elonmusk might be right on that. But the missing piece rn is the G in AGI. And I suspect we're inconceivably far from it. Otherwise nX human-level is already here for narrow tasks.
1
1
3
@zamir_ar
Amir Zamir
4 years
If you found inaccuracies in the table according to your experience, consider reporting the error directly to the source so they can update their stats: administration@informatics-europe.org. I sent them an email inviting them to look at the reported inaccuracies in this thread.
0
0
3
@zamir_ar
Amir Zamir
4 years
@_herobotics_ @fdellaert Gibson website is a good one. Especially the dataset page
0
0
3
@zamir_ar
Amir Zamir
6 years
@hardmaru @erwincoumans A (quantitative) answer to the generalization question through a study is brewing. Sneak peek: perception and dynamics aspects should be viewed and analyzed separately wrt generalization. Their generalization traits don't appear to correlate strongly. (opportunity or threat?)
0
0
2
@zamir_ar
Amir Zamir
4 years
Now, phone cases with tiny legs!
0
0
2
@zamir_ar
Amir Zamir
4 years
@LauTor83 @ArnoutDevos brought that up, and I reported the error to the source a bit ago. My PhD students don't quite get that total amount either, but it seems the reported numbers for all countries are higher (eg comments about Germany). Some tax/employment-rate adjustment might be in play?
0
0
2
@zamir_ar
Amir Zamir
1 year
Very cute! Is it still necessary to capture (many) pixels/photons given powerful generative models? 🧵
Tweet media one
Tweet media two
@BjoernKarmann
Bjørn Karmann
1 year
Introducing – Paragraphica! 📡📷 A camera that takes photos using location data. It describes the place you are at and then converts it into an AI-generated "photo". See more here: or try to take your own photo here:
1K
5K
23K
1
0
2