SAM 2 is the next generation of the Segment Anything Model for images we released last year! SAM 2 comes with all the great features from SAM (promptable, zero shot generalization, fast inference, Apache 2.0 license), but now also for video! Here's what's in SAM 2🧵👇
Introducing Meta Segment Anything Model 2 (SAM 2) — the first unified model for real-time, promptable object segmentation in images & videos.
SAM 2 is available today under Apache 2.0 so that anyone can use it to build their own experiences
Details ➡️
4.5 years ago I started my first AI Research project @MetaAI. I didn’t have a PhD or research experience. But I was confident it was the right decision. Life often goes by without celebrating the small wins — here’s to taking bold bets and intentionally choosing harder paths 🎉
Thrilled to share that Segment Anything was awarded "Best Paper: Honorable Mention" at #ICCV2023 today, one of the top 3 papers out of 8,260 submissions & 2,161 accepted! It's been incredible to see the tremendous impact of SAM in the research community & for @Meta products!
Thrilled to announce that Segment Anything has been accepted to #ICCV2023! Co-leading this project with @kirillov_a_n for the past ~1.5 years and getting to work with a rockstar team has been the most enjoyable experience in my time @MetaAI.
Today we're releasing the Segment Anything Model (SAM) — a step toward the first foundation model for image segmentation.
SAM is capable of one-click segmentation of any object from any photo or video + zero-shot transfer to other segmentation tasks ➡️
✈️Excited to attend #CVPR in Seattle! My team at Meta (FAIR) is hiring Research Scientists & Engineers to work on multimodal models across image/video. *Full-time* roles only starting ASAP in NYC/Bay Area/Seattle. Stop by the Meta booth or DM me to chat! 😀
📢 My team @AIatMeta (FAIR) is hiring full-time Research Engineers & Scientists interested in working on foundation models for computer vision and language! If you're at @ICCVConference in Paris this week, DM me to chat! Work with us to build the next Segment Anything! 🚀🇫🇷
Today at @Meta Connect, Mark announced "Backdrop", a new AI editing feature coming soon to @instagram powered by the Segment Anything Model (SAM) my team @AIatMeta open-sourced this year! Thrilled to see our research brought to life for billions of people in an app I love!
“I don’t have data, I only have opinions”😂 @georgiagkioxari’s awesome no-holds-barred talk on AI research in Industry vs Academia at the “Scholars & Big Models” workshop @CVPR. TL;DR: the space of impactful problems is huge and both groups have their respective roles in advancing AI.
Honoured to receive the @RAEngNews Young Engineer award earlier this year! Humbled to be able to fulfill a generational career journey from my mum being an engineer at @ATT @BellLabs in the mid 80s, to my role today in helping advance the state of the art in AI at @AIatMeta 🤩
Five early-career engineers received the RAEng Engineers Trust Young Engineer of the Year award this year. Get to know awardee @nikhilaravi, who received £3,000 for her ground-breaking #AI work with @Meta. Watch on to find out more: #RAEngAwards
So fun trying to generate Bollywood style music using the MusicGen demo from my colleagues at FAIR! 🇮🇳🎶
"Bollywood music track, traditional Indian sounds of melodic sitar and tabla, blended with modern pop sounds, catchy melody, and infectious rhythms"
We present MusicGen: A simple and controllable music generation model. MusicGen can be prompted by both text and melody.
We release code (MIT) and models (CC-BY NC) for open research, reproducibility, and for the music community:
Awesome turnout at our poster session today @CVPR with @jcjohnss (missed you @georgiagkioxari!). Non-stop crowd for 3 hours! Lots of great questions and discussions! Read more about the paper on the project website #CVPR2022
Stop by the @AIatMeta booth at @ICCVConference this afternoon to chat, grab a Segment Anything sticker and try a bunch of demos including SAM, FACET, ImageBind, DINO & more!
After stepping into a Research Engineering Manager role, I've gained a deeper empathy and respect for my managers! Fostering a culture and relationship where 100% transparency, awkward conversations and constant two-way feedback is the norm is no easy feat and requires great care
One thing that I started doing at OpenAI is that I created a policy for myself to be *100% transparent* with my manager about everything. It seems obvious and weird to say aloud, but I bet most people don’t actually do this. But once I started doing it, I realized there are a lot
The Segment Anything Model (SAM) by Meta AI is a step toward the first foundation model for image segmentation. SAM is capable of one-click segmentation of any object from photos or videos + zero-shot transfer to other segmentation tasks.
Try the demo ➡️
Enjoyed this conversation with @swyx and @josephofiowa on the journey from SAM to SAM 2 and beyond! We talk about model design, building data engines, innovating on demos and going from research to real world applications!
Our SAM 2 pod with @nikhilaravi is out! Fun SAM1 quote from guest cohost @josephofiowa: "I recently pulled statistics from the usage of SAM in @RoboFlow over the course of the last year. And users have labeled about 49 million images using SAM on the hosted side of the RoboFlow
One of the best parts of my job @MetaAI is getting to work on Open Source! With Segment Anything it's been incredible to see the pace of integration, from products and applications to science! Excited for what people build with Llama 2 with the research & commercial use license!
Llama 2 is a step forward for commercially available language models and open innovation in AI. These new models were pretrained on 2T tokens, and have double the context length when compared to the original release of Llama.
Download Llama 2 ➡️
Getting wished on your “Metaversary” has got to be one of the most unique things about working at @Meta 😂! 6 years have flown by! Forever grateful for amazing colleagues, mentors and opportunities to advance the state of the art with @MetaAI! Throwback to Day #1!
🤨How can industry and academic research be evaluated on a more even playing field?
💰@jon_barron’s suggestion @CVPR: introduce benchmarks that normalize accuracy by cost — maximize(accuracy/watts) as a metric that’s achievable for all researchers irrespective of compute access.
Spoiled by 2D, I was shocked to find out there are no good ways to compute exact IoU of oriented 3D boxes. So, we came up with a new algorithm which is exact, simple, efficient and batched. Naturally, we have C++ and CUDA support #PyTorch3D
Read more:
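For contrast, exact IoU in the axis-aligned case really is just a per-axis clamp, which is why the oriented case (where the intersection is a general convex polyhedron) is such a surprise. A minimal pure-Python sketch of the easy axis-aligned version, for illustration only (the exact oriented algorithm ships in PyTorch3D, e.g. as `box3d_overlap`):

```python
def aabb_iou(a, b):
    """Exact IoU of two axis-aligned 3D boxes.

    Each box is (xmin, ymin, zmin, xmax, ymax, zmax). Axis-aligned
    overlap reduces to clamping per axis; oriented boxes require the
    exact polyhedral intersection computed by the PyTorch3D algorithm.
    """
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis
        inter *= hi - lo

    def vol(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    return inter / (vol(a) + vol(b) - inter)

# Unit cubes offset by 0.5 along x: intersection 0.5, union 1.5, IoU 1/3
print(aabb_iou((0, 0, 0, 1, 1, 1), (0.5, 0, 0, 1.5, 1, 1)))
```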
Ridiculous that The Queue at #Wimbledon still has paper tickets and a 6 hr wait. Simple tech solutions would go a long way: add QR codes to the queue card to pay online, send text updates on the status/wait time, create a virtual queue instead! Grateful the sun was out 😎☀️
Basing the immigration system on country of birth exacerbates the problem. I was born in India but grew up entirely in the UK and am a UK citizen, but ended up in the India queue. The only reason I got a green card was through my work in AI research! The system needs to be fixed!
one of the easiest policy wins i can imagine for the US is to reform high-skill immigration.
the fact that many of the most talented people in the world want to be here is a hard-won gift; embracing them is the key to keeping it that way.
hard to get this back if we lose it.
📢 Excited to announce FACET from @MetaAI, a new comprehensive benchmark for evaluating the fairness of computer vision models across different demographics.
(1/11) 🧵👇
10 years of FAIR.
10 years of advancing the state of the art in AI through open research.
We're celebrating the 10th anniversary of Meta's Fundamental AI Research team and continuing that legacy by sharing our work on three exciting new research projects today.
Details below 🧵
Super excited to share PyTorch3D, @facebookai's new library for 3D deep learning research providing:
- Easy batching of heterogeneous meshes
- Optimized common 3D operators
- Modular, differentiable mesh rendering
Try the code and tutorials on GitHub:
If you're still at CVPR today check out our paper "Omnivore: A Single Model for Many Visual Modalities".
Our SoTA model excels at classifying images, videos, and single-view 3D data using exactly the same model parameters and without access to correspondences between modalities.
Inaugural Desi Tech Mafia dinner in San Francisco for #indianindependenceday 🇮🇳 co-hosted with @ravirajjain and @lightspeedvp! Awesome conversation with this group of VCs, founders and researchers in AI! Here’s to more serendipitous connections and creating community! 🤩
At the “Scholars & Big Models” workshop @CVPR, @jon_barron putting AI progress in perspective. All technology progresses like a sigmoid — people overestimate the rate and scale of change. Look at airplanes! Most important question to ask as a researcher: which sigmoid to choose?
Loved representing @AIatMeta for a #WHCD event with @haddadmedia in Washington DC, showcasing our research on Segment Anything, the Backdrop feature on Instagram and talking about all things open source!
FAIR researchers (@AIatMeta) presented Segment Anything and our robotics work at the White House correspondents’ weekend.
Llama 3 + Sim2Real skills (trained with @ai_habitat) = a robot assistant
My 86-year-old grandma’s book launch is happening tomorrow! ❤️ If you’re in Coimbatore or have friends in the area who’d like to join, please share! See you there!
UPDATE!
Join us for a wonderful evening at the launch of "Two Loves & Other Stories" by Smt. Balam Sundaresan at Ardra Hall in Coimbatore, on the 3rd of January!
Looking forward to seeing many of you there!
#launch #books #garuda #coimbatore
To close out 2023, here are 10 of the most interesting AI research advancements we shared on our feed this year — and where you can find more details on the work.
1️⃣ Segment Anything (SAM)
A step toward the first foundation model for image segmentation.
Details:
Curious to see use cases for SAM on CPU on a Macbook! ~1.9s for embedding extraction per image on Apple M2 Ultra with the ViT-B model (vs ~0.15 seconds on an NVIDIA A100 GPU) and ~45ms per prompt for mask prediction (similar to ONNX in-browser inference using multithreading).
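Back-of-envelope arithmetic on those timings (all figures are the rough numbers quoted above, not fresh benchmarks): the A100 is roughly 13x faster for the one-time embedding, but the ~45ms per-prompt decode is what makes CPU-only interactive prompting plausible:

```python
# Rough arithmetic on the approximate timings quoted in the post.
cpu_embed_s = 1.9     # ViT-B image embedding on Apple M2 Ultra (CPU)
gpu_embed_s = 0.15    # same step on an NVIDIA A100 GPU
prompt_ms = 45.0      # per-prompt mask decode on CPU

speedup = cpu_embed_s / gpu_embed_s    # GPU advantage on the embedding step
prompts_per_sec = 1000.0 / prompt_ms   # interactive prompt throughput on CPU

print(f"GPU embedding speedup: ~{speedup:.1f}x")
print(f"CPU mask decodes per second: ~{prompts_per_sec:.0f}")
```

Since the heavy embedding runs once per image and every subsequent click only pays the decode cost, a CPU-only session stays interactive after the initial ~2s wait.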
📢 My team @MetaAI is hiring a Full Stack Eng! ChatGPT & Segment Anything showed the world the vital role of exceptional UX in making AI models accessible & widely adopted! Join us to build demos for the next breakthrough in computer vision!
🔗Apply here:
AI is the future, be it Artificial Intelligence or America-India! Our nations are stronger together, our planet is better when we work in collaboration.
Awesome panel with @lightspeedvp, @eladgil, @hwchase17, @hliriani, @lisabethhan
🏗️ Build amid rapid change
🗑️ Discard bad ideas promptly
🔍 Find the whitespace in AI apps
🛡️ Consider defenses against incumbents -- what if MSFT turns on this feature?
🎯 No GPUs before PMF
📢 Looking for interesting real world integrations of Segment Anything -- could be use cases you've seen in big companies, start ups, for social good, or anything else! Please share in the comments or DM me! 🙏
Object recognition should happen in 3D because the world is 3D.
On the heels of Mesh R-CNN, we build a model that detects all objects, predicts their 3D position in space (layout) and their 3D shape *from a single image of a novel complex scene*.
At CVPR 2022.
(1/n)
One of the best parts of building computer vision models is making cool visuals of the outputs! We love taking this to the next level in the SAM team to build interactive demos where anyone can use the model in the real world! Try it out and share your favourite creations!
You should check out the demo, it uses some really cool tricks to stream the detection bounding boxes from the model so that you have an interactive experience. It turns out that ML models with great demos have 100x the real world impact of those without.
The State of AI report is out! Enjoyed helping Nathan & team review this year’s edition! It provides a thoughtful & well-rounded overview of the most important trends from research & industry and their influence on safety & policy globally. A must-read for anyone interested in AI
🪩The @stateofaireport 2023 is now here.
Our 6th installment covers one of the most exciting years I can remember. The #stateofai report covers everything you *need* to know across research, industry, safety and politics.
There’s lots in there, so here’s my director’s cut 🧵
Inspiring talk by @sarameghanbeery @WiMLworkshop at #NeurIPS2023 on her journey from professional ballerina to MIT AI professor, how interdisciplinary research e.g. AI for biodiversity can be very impactful, and why free food at events is critical for attracting newcomers to AI 😀
Another example here of SAM V2 with literally 1 mouse click on the first frame of the video and it tracked Simone even through that super fast motion part at the end
If I were a VFX shop I'd be implementing this asap. It's better than any traditional ROTO I've ever used
Join us for our tutorial on Implicit Rendering for Novel View Synthesis using Implicitron and PyTorch3D at #ECCV2022! In person, Monday 10/24, 9am-12:30, Israel D. Research talks & coding with @davnov134 @ovrdr, Jeremy Reizenstein & lots of P3D stickers!
#GenerativeAI x #Bollywood is here 🤩 What would Gerua sound like if it was sung by Atif Aslam? How about an Atif x KK x Arijit collab? Love these creative AI generated Bollywood song covers by @djmrasingh 👏🏽
Great coverage in @techreview of our work FACET from @MetaAI! With vision capabilities being introduced in foundation models like GPT-4V, benchmarking fairness across different demographics and person-related attributes will be increasingly important.
Key message @CVPR Transformers for Vision workshop panel — the importance of data and model co-design. Working on data, although tough, is crucial and rewarding. Everyone loves modeling, but often the real power is in getting the data right. Don't overlook it!
Backdrop has launched in the US! 🎉 So fun to see Segment Anything and Emu powered features in @instagram! 😍 Awesome example of research at @AIatMeta enabling new product experiences! Try it out!
Awesome Womxn in AI dinner and discussion on APIs vs open source, AI applications and the next step change with @Redpoint @molwelch @ericabrescia @achowdhery and many other inspiring researchers & founders.
Also future home goals to have a secret den behind a bookshelf 🤩
Great conversation with @lisabethhan & @ajratner on data engines, model <> data co-evolution, SAM, best practices for data-centric AI, leveraging OSS foundation models, evaluating for robustness, and how enterprises are adopting and adapting these tools for their use cases/data
We’re honored to share that Meta AI researchers received four different publication awards at ACL this week — including three outstanding paper recognitions!
4️⃣ more award-winning papers to read from Meta AI at #ACL2023NLP 🧵
Alternative views on "Big Models vs Scholars" by @deviparikh @CVPR:
🛠️ Think about big models as infra: find ways to control and use them as tools and get info out of them in reliable and safe ways.
🚧 Embrace constraints: what can be done with just 8 GPUs?
👇🏽+ many more ideas!
Some tips:
- share specific details of your research experience and areas of expertise when you reach out
- look through previous papers from @AIatMeta on computer vision
- please don't ask about internships or referrals for another position!
If the UK needs more informed, tech-savvy decision makers, @SciTechgovuk should leverage British AI expertise in Silicon Valley. We came out here to learn but want the opportunity to contribute valuable insights back home for shaping the UK's AI strategy! 🇬🇧🇺🇸
“We lack competence and confidence at the heart of government,” says one adviser. “The people who run compute policy in the Department for Science, Innovation and Technology really just don’t understand it. They don’t understand the difference between general and specific
📣 Excited to announce my 86-year-young grandma's debut book, a side project close to my heart! 🌟 From a random lockdown convo, to cold emails to Indian publishers, editing, polishing and receiving the final book, it's been so fun to act as agent & editor! Grab your copy!👇🏽
NEW ARRIVAL!
We are happy to announce the launch of "Two Loves & Other Stories", a magical collection of short stories by the wonderfully talented 86-year-young Smt. Balam Sundaresan! ✨
The author skillfully weaves a tapestry of Tamilian lives, presenting a diverse range of
Excited to be supporting @WiCVworkshop @ICCVConference next week in Paris on behalf of @MetaAI! 🇫🇷 I'll be giving a talk about Segment Anything and Laura will be presenting our work on the FACET benchmark for fairness and bias!
Congrats to the authors of the other award-winning papers: (Marr Prize) "Adding Conditional Control to Text-to-Image Diffusion Models" (a.k.a. ControlNet) & "Passive Ultra-wideband Single-Photon Imaging", (Best Student Paper) "Tracking Everything Everywhere All at Once"!
Top cited publications over the past 5 years according to Google Scholar:
1. Nature
2. New England Journal of Medicine
3. Science
4. ✨@CVPR✨
Truly impactful and revolutionary work happening in this computer vision research community!
Really enjoyed attending the @WiCVworkshop as a mentor on behalf of @MetaAI! Great conversations with peers and mentees and friends I made from the last in-person CVPR @NagraniArsha! #WiCV
Snapshots from our social dinner event @CVPR
We would like to thank our keynote speakers, all mentors and mentees for attending and for sharing their experiences!
Had a blast helping judge the @pearvc hackathon! Incredible creativity and live demos, like Relay AI's phone agent for small businesses — ask for a discount, it can negotiate! Also, En Passant's custom character commentary for e-sports, complete with a Gordon Ramsay example! 🤩
@pearvc x @OpenAI organized a hackathon for the best SF builders to expand the limitations of AI
It was one of the hardest hackathons to get into, ensuring the high quality of participants
Here is what the finalists have built (🧵)
📢📢📢 We have released data, models and training code for Omni3D.
If you are into large-scale, generalizable solutions at the intersection of recognition and 3D, Omni3D is your testbed
✅ 234k images
✅ 3 million objects
✅ 97 object types
sourced from existing 3D datasets!
As a part of their preparations for the CWC Qualifier and T20WC Americas Qualifier, @usacricket have announced 📢
🇺🇸 A 28-member Women’s National Training Group
🇺🇸 A 24-member Women’s National U19 Training Group for the first time ever
In 2021, we created a research demo that brought amateur drawings to life through animation — today, we're open-sourcing the code + releasing a first-of-its-kind dataset of nearly 180K annotated amateur drawings to help researchers keep innovating in this space.
More details ⬇️
👋 Hey #CVPR2023! We’re here in Vancouver and excited to share new work, showcase demos and participate in interesting conversations this week.
📍 Booth #1602
A 🧵of things to look out for from the Meta AI team ⬇️
1/6
Witnessed my first SF car break in while sitting on a bench in Alamo Square: 2pm on a Sunday, broad daylight on a busy street bordering the park. A red car pulled up, a guy got out, smashed in the window of a parked car, pulled out a bag, drove off. All over in a few mins 🫣
The barrier to entry for building new AI products is lower than ever. But making a ChatGPT for your workplace is much more than just your data + vector DB + LLM.
@mr_cheu, @jainarvind and I dive deep into how Glean Chat works and how it was built -
The 28 Player USA Women’s National Training Group.
A huge year for #TeamUSA 🇺🇸 Women, with the ICC Women’s World Cup Qualifier in Sri Lanka in July followed by the ICC Women’s T20 World Cup Americas Qualifier in September, which USA will host.🙌🏏
MORE➡️: