![MMLab@NTU Profile](https://pbs.twimg.com/profile_images/1743099693293531136/hy4nGcTC_x96.jpg)
MMLab@NTU (@MMLabNTU)
Followers: 1K · Following: 190 · Statuses: 67
Multimedia Laboratory @NTUsg, affiliated with S-Lab. Computer Vision, Image Processing, Computer Graphics, Deep Learning.
Singapore · Joined May 2021
RT @TheAITalksOrg: The Upcoming AI talk: 🌋LLaVA🦙 A Vision-and-Language Approach to Computer Vision in the Wild by Chunyuan Li @ChunyuanLi…
Replies 0 · Retweets 14 · Likes 0
🚀 Excited to share our latest work: "EdgeSAM - Prompt-In-the-Loop Distillation for On-Device Deployment of SAM". Supercharged SAM for edge devices!

🌟 #EdgeSAM is a faster, optimized version of SAM tailored for edge devices. We reimagine SAM's ViT-based image encoder as a CNN architecture that is a better fit for such hardware. Our approach also distills the prompt encoder & mask decoder, so the model captures the complex interplay between user input & mask generation.

🏎️ EdgeSAM runs 40x faster than SAM and outperforms MobileSAM while being 14x faster on edge devices. It also improves mIoU on COCO & LVIS by 2.3 and 3.2 points, and is the first SAM variant to run at over 30 FPS on an iPhone 14.

Check out our code, demo & models!
🔗 Project page:
🔗 GitHub:
🤗 Hugging Face:

Together with @ChongZhou7, @xtl994 and @doubledaibo
Replies 0 · Retweets 0 · Likes 8
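The tweet above outlines two distillation ideas: compressing SAM's ViT image encoder into a CNN, and distilling the prompt encoder & mask decoder with prompts kept "in the loop". Below is a rough PyTorch sketch of how such a setup could look; every name, signature, and loss choice here is an assumption for illustration, not EdgeSAM's actual code or API.

```python
import torch
import torch.nn.functional as F

def encoder_distill_loss(student_encoder, teacher_encoder, images):
    # Sketch of encoder distillation: align the CNN student's image
    # embeddings with the frozen ViT teacher's via pixel-wise MSE.
    with torch.no_grad():
        t_embed = teacher_encoder(images)   # e.g. (B, 256, 64, 64)
    s_embed = student_encoder(images)
    return F.mse_loss(s_embed, t_embed)

def prompt_in_the_loop_step(student, teacher, images, prompts):
    # Sketch of prompt-in-the-loop distillation: decode masks from the same
    # prompts with both models, supervise the student with the teacher's
    # mask, then pick the most-disagreeing pixel as the next point prompt.
    with torch.no_grad():
        t_logits = teacher(images, prompts)  # (B, 1, H, W) mask logits
    s_logits = student(images, prompts)
    loss = F.binary_cross_entropy_with_logits(s_logits, t_logits.sigmoid())
    disagreement = (s_logits.sigmoid() - t_logits.sigmoid()).abs()
    next_point = disagreement.flatten(1).argmax(dim=1)  # flat y * W + x index
    return loss, next_point  # caller appends the new point and iterates
```

Iterating the second step grows the prompt set exactly where the student still fails, which is the intuition behind keeping prompts in the training loop.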
RT @liuziwei7: 🔥🔥We are excited to announce #Vchitect, an open-source project for video generative models @huggingface 📽️LaVie (Text2Vide…
Replies 0 · Retweets 336 · Likes 0
RT @ccloy: @chaseleantj Try StableSR, a diffusion model-based upscaler. We paid extra efforts to maintain fidelity. Code and model: https…
Replies 0 · Retweets 4 · Likes 0
Free lunch this way 👇
FreeU: Free Lunch in Diffusion U-Net
paper page:

We uncover the untapped potential of the diffusion U-Net, which serves as a "free lunch" that substantially improves generation quality on the fly. We first investigate the key contributions of the U-Net architecture to the denoising process and find that its main backbone primarily contributes to denoising, whereas its skip connections mainly introduce high-frequency features into the decoder module, causing the network to overlook the backbone semantics.

Capitalizing on this discovery, we propose a simple yet effective method, termed "FreeU", that enhances generation quality without additional training or fine-tuning. Our key insight is to strategically re-weight the contributions from the U-Net's skip connections and backbone feature maps, leveraging the strengths of both components of the U-Net architecture. Promising results on image and video generation tasks demonstrate that FreeU can be readily integrated into existing diffusion models, e.g., Stable Diffusion, DreamBooth, ModelScope, Rerender and ReVersion, improving generation quality with only a few lines of code.
Replies 0 · Retweets 2 · Likes 21
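A minimal sketch of the re-weighting idea the abstract above describes, assuming a PyTorch U-Net decoder stage; the function name freeu_reweight, the scaling of all backbone channels, and the values b=1.2 and s=0.9 are illustrative assumptions, not the paper's official implementation.

```python
import torch

def freeu_reweight(backbone_feat, skip_feat, b=1.2, s=0.9, radius=1):
    # Amplify the backbone feature map to strengthen its denoising role.
    # (Illustrative simplification: the paper re-weights only a subset of
    # channels; here all channels are scaled by b.)
    backbone_feat = backbone_feat * b
    # Attenuate the low-frequency band of the skip features in the Fourier
    # domain, so mostly high-frequency detail reaches the decoder.
    fft = torch.fft.fftshift(torch.fft.fft2(skip_feat.float()), dim=(-2, -1))
    _, _, H, W = fft.shape
    mask = torch.ones((H, W), device=fft.device)
    cy, cx = H // 2, W // 2
    mask[cy - radius:cy + radius + 1, cx - radius:cx + radius + 1] = s
    skip_feat = torch.fft.ifft2(
        torch.fft.ifftshift(fft * mask, dim=(-2, -1))).real
    return backbone_feat, skip_feat.to(backbone_feat.dtype)
```

The re-weighted pair would then feed the decoder block in place of the unmodified backbone/skip features, which is why this style of fix can drop into existing diffusion models without retraining.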
RT @liuziwei7: Excited to see that our new 🦦Otter🦦 model "OTTER-Image-MPT7B" ranks 🔥top🔥 on several large multimodal model evaluation bench…
Replies 0 · Retweets 31 · Likes 0
RT @liuziwei7: Thrilled to announce **Otter**, a multi-modal in-context learning model with instruction tuning: 1) Chatbot w/ image, video…
Replies 0 · Retweets 124 · Likes 0
🌟🎉 We're thrilled to announce that we have 14 papers accepted at #CVPR2023, including 3 highlights & 1 award candidate! 🏆 A big thank you to our amazing collaborators! 🤝
🔗 Check out our papers here:
🏅 Award candidate:
The twelve #CVPR2023 award candidate papers are listed at:
Congratulations to all of the authors on this achievement!
Replies 0 · Retweets 12 · Likes 88