apolinario 🌐
@multimodalart
Followers
12K
Following
4K
Media
824
Statuses
3K
ML for Art and Creativity, working @HuggingFace ([email protected])
Joined July 2021
I hacked @huggingface Spaces to build an open source @gradio Dreambooth Training UI that allows you to train a model for less than US$0.80 🐱💻 (you can also use it locally for free):
28
110
811
My favorite part is that it works really well with out-of-distribution garments
17
84
801
Editing facial expressions in real time now on @huggingface Spaces 👨🎤🔀. A Grog converted Cog image to Gradio running a ComfyUI backend - magic of open source 🤝. ▶️
10
138
786
After some, uh, developments yesterday:
- Stable Diffusion v1-5 is out by @runwayml
- Fine-tuned image decoder (VAE) out by @StabilityAI
Magic of open source 🧙 collaboration continues no matter what, here's the Best Available Stable Diffusion™ notebook:
9
99
615
IC-Light v2 was just released by @lvminzhang 🔦, now runs on FLUX, and it is the best relighting tool in the world 🌐, just like that. Try out the official demo ✨📣
14
92
525
Thanks @angrypenguinPNG for merging my PR to add high resolution to the Illusion Diffusion Space 📺🌀. It's now as fast, double the resolution and has crispy details - go play ▶️
18
94
509
ControlNet is cool, but what if you could have MORE control? 🤯 With MultiDiffusion Region Control you can 🎛️ draw masks ✏️ and give a specific prompt for each mask 📜. The @gradio demo is just out on @huggingface 🤗 - kudos to the author @omerbartal!
9
101
440
How to train your own ControlNet? 🥅 We wrote a guide, ranging from deciding which controls to use 🎛️, how to prepare your dataset, all the way to GPUs going brrr 🔥 (with an unexpected trip to the uncanny valley 👀). From me and @pcuenq with ❤️.
9
91
414
Iterated with @angrypenguinPNG on some enhancements to their Illusion Diffusion Space, @MrUgleh-inspired QR ControlNet patterns 🌀. ▶️
12
59
387
The first large scale open source DALL-E 2 replication is here🧙. Karlo is an unCLIP model trained by #KakaoBrain. I'm having fun playing with it on 🤗 @huggingface Spaces: Model card: GitHub:
12
77
370
The Stable Diffusion Multi Inpainting Space is out! On it you can do both: inpainting by masking the image (with the newest @Gradio masking) or inpainting with words, your choice!
7
64
348
🧨 diffusers 0.5.0 now supports JAX for super fast #stablediffusion inference on TPUs. You can generate 8 images in ~8s on Colab Free using TPU 🚀.
2
77
345
The first open Stable Diffusion 3-like architecture model is JUST out 💣 - but it is not SD3! 🤔. It is HunyuanDiT by Tencent, a 1.5B parameter DiT (diffusion transformer) text-to-image model 🖼️✨. In the paper they claim to be SOTA open source! I'm working on a @huggingface demo
13
78
347
I'm super thrilled to announce that our assembly of the Latent Diffusion LAION-400M text-to-image model is now available on @huggingface 🤗, democratizing even further the access to text-to-image AI art! Thank you for all the help @osanseviero!
11
77
343
I'm delighted to announce I've joined @huggingface as an ML Art Engineer 🤗, to help make AI art even more accessible, easy to use and to develop for! This tech is going to empower human expression and creativity in unprecedented ways - and building it openly feels like the right way!
29
29
343
The MarioGPT @huggingface Spaces demo is now playable! 🕹️. Now you can play the levels you generate - hopefully you're better than me 😂
7
58
310
Diffusers Outpaint now allows for infinite zoom-out with a resize input size + "use as input" button. @kingnish24 🤝 @fffiloni . ▶️
7
42
289
Collaborative new concepts on #StableDiffusion 🎨
1. Teach Stable Diffusion new concepts 👩🏫 (add to the public library if you wish): (or browse the library to pick one 🧤)
2. Run with the learned concepts 🖼️
4
56
277
Stable Diffusion 2 by @StabilityAI is out with 5 new models 👽. You can now try the 768x768 model (the largest one released) on @huggingface Spaces
9
44
272
Happy Public Domain Day! 🎉 To celebrate Steamboat Willie finally joining the public domain, I created a @huggingface dataset with all frames of the 1928 short 🐭📜. ▶️
7
48
264
Breaking news: OpenAI open sourced their CLIP ViT-L/14@336px! I'll soon hook it up to many generation systems, stay tuned!
5
32
261
The official ToonCrafter demo is now available on @huggingface Spaces ZeroGPU 🤯. This generative cartoon interpolation model is by far the most coherent generative interpolation model I've seen 🖼️. IMO it will change how animations are made 🪄. ▶️
3
63
262
Ok - I just quickly assembled the LAION-400M trained Latent Diffusion CFG TTI model into a Google Colab, you can try it yourself: "A mecha robot holding a sign that reads: 'This is weird'"
Very exciting 'breaking' news! CompVis (the research group behind VQGAN) has just released a new 1.45B parameter model for Latent Diffusion: From the released image, it seems to have an unprecedented text-synthesis capacity. More to follow soon
36
49
245
Stable Diffusion model card is up, and the weights are available for academic and research purposes first. This is the first step ahead of a full public release which should be coming soon! 🤩 #StableDiffusion
4
50
246
OPEN TO EVERYBODY! I optimized the Latent Diffusion LAION-400M Colab RAM usage and now it should run on free non-Pro accounts. And fast! 8 images in 20 seconds on a P4 GPU. Google Drive support and VRAM optimizations by @RiversHaveWings were also added
Ok - I just quickly assembled the LAION-400M trained Latent Diffusion CFG TTI model into a Google Colab, you can try it yourself: "A mecha robot holding a sign that reads: 'This is weird'"
21
44
239
Stable Video Diffusion is an amazing (and chonky 🐼) new model by @StabilityAI - if you can't run it locally, you can now play with it on @huggingface Spaces 🤗. ▶️
3
41
247
Guess who's back? Back again! 🎵 @StabilityAI is back, tell a friend 🎤. Stable Diffusion 3.5 Large is here 🔥
- 🏋️ 8B parameters
- Full 💪 and 🏎️💨 4-step Turbo variant
- 🧾 🤝 commercial use (for orgs below 1M year/rev)
- 🧨 day-0 LoRA fine-tuning support
9
47
238
Following the full open source release of Stable Diffusion, the @huggingface Space for it is out 🤗. Stable Diffusion is a state-of-the-art text-to-image model that was released today by @StabilityAI #stablediffusion
4
61
229
InstructPix2Pix by Tim Brooks allows you to write natural language instructions to edit images ✏️🖼️. We are getting closer and closer to "photoshop with words"! 🎨 Play with it now on @huggingface Spaces
5
40
217
Since VQGAN+CLIP times, we've been learning to prompt with @openai CLIP knowledge (incl. SDv1, conditioned on OAI CLIP). Stable Diffusion 2 breaks that 💥 with LAION-trained CLIP, "trending on artstation", "greg rutkowski" don't work; we're all learning to prompt again! 👶.
13
23
214
✨ PD12M ✨, a 12.4 million high quality image-caption dataset for AI training 🎛️, featuring:
- 🤖✏️ Florence-2 synthetic captions
- 🌸 Aesthetic and safety filtered from 34M superset
- 🔓 only public domain images
Superb release by @spawning_
6
43
215
ComfyUI → @huggingface Spaces → serverless ZeroGPU ✨😌. We wrote a tutorial on how to turn any ComfyUI workflow into an easy to use Gradio app and (optionally) host it for free with ZeroGPU 💥.
3
37
217
PAG (Perturbed-Attention Guidance) is not getting nearly the attention it deserves. I've adapted it to work on SDXL with diffusers 🧨 and it DELIVERS! 🤯 Try it here ▶️. Thanks to KU-CVLAB researchers: Donghoon Ahn, Hyoungwon Cho et al. ❤️
Recent studies reveal that the quality of samples from diffusion models relies on techniques like CG and CFG, yet these fall short in unconditional generation and tasks like image restoration. This research paper introduces Perturbed-Attention Guidance (PAG), a novel method
9
52
195
With the amazing @bfl_ml FLUX Tools released yesterday, maybe you missed the release of the first IP Adapter for FLUX [dev] by @instantx_ai! ⭐️ (and it's actually amazing). You can try the official demo here 🎨🔧.
3
37
194
HUGE update from @StabilityAI ✨. Stable Diffusion 3 🖼️, Stable Video Diffusion 🎥, SDXL Turbo 💨 and more can now be used commercially without a subscription by anyone with less than US$1M annual revenue 🔓 💥.
2
39
186
AuraSR: an open source replication of @Adobe's GigaGAN super-resolution by @FAL 🔥
- 🤏 600M params
- 💥 4x upscaling
- 🖼️ Excels at adding sharpness/fine details to mid-sized images
- 🔓 Commercially permissive license (cc-by-sa)
▶️
Time for some Aura. First in our series of fully open sourced / commercially available models: AuraSR - a 600M parameter upscaler based on GigaGAN. Blog: HF: Code: Playground:
2
50
187
CogVideoX just released the weights for its 5B model! 🎥 ✨. It's the best open weights text-to-video model - competitive with Runway / Luma / Pika. With 🧨@diffuserslib, it fits on < 10GB VRAM 🤏. (ah, and they changed the smaller 2B model license to Apache 2.0 🔥)
5
45
174
Pyramid Flow 𓁿 was announced today as a high quality 2B 🤏 text-to-video and image-to-video model 🎥. You can now try it by yourself on a @huggingface Gradio demo ꔮ. ▶️
3
42
180
Releasing my Vintage Ads LoRA 📰📮. Based on public domain ads from old magazines. Model and demo: Dataset: (trained on @FAL and migrated to the @huggingface Hub using their native tooling 🤗)
5
29
171
🚨 A new text to image model by @StabilityAI is out! It's Stable Cascade 💧, an iteration on the Würstchen architecture by @dome_271 & @pabloppp. I made a demo for it:
20
40
167
Introducing ✨ LoRA Studio ✨, a dedicated UI by @enzostvs for LoRAs hosted on @huggingface 🤗: browse and generate images with fun models 🎉 (and safe models, no need to worry if your mom or your colleague enters the room while you are browsing 😳 🔞). ▶️
12
40
164
Tencent releases PhotoMaker v2! Better identity fidelity 🤝 better controllability. It works really well when multiple images of the subject are uploaded! Kudos to @yshan2u @zhenli1031! demo: code:
5
33
163
New model alert! 🚨 ⋆✴︎˚。FLEX.1 Alpha ˚。✴︎⋆ is an 8B parameter model pruned and further trained by @ostrisai from 12B FLUX.1 [schnell]:
- 🖼️ High quality, competitive with FLUX[dev]
- 🎨 Good at styles
- 🤏 Smol
- 📜 openly licensed (Apache 2.0)
- ⚗️ de-distilled, CFG optional
9
41
288
This is fun! A new leap! You show the model 3-5 images of what you want, it 'learns' what it is and now you can use it in your prompts! And the approach is pluggable into different models (here they applied it to Latent Diffusion). Code is not yet out - excited for it!
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion. abs: “Textual Inversions” operates by inverting the concepts into new pseudo-words within the textual embedding space of a pre-trained text-to-image model
4
18
159
@fofrAI expanding on this technique gives you even more super powers, helps with flux having some troubles with some abstract styles. IMG_0134.TIFF: a dog in the park --style impressionist painting, impressionism --details low
4
5
158
Stable Fast 3D has just been released by @StabilityAI, an incredible image-to-3D model that takes 0.5 seconds to generate 3D assets 🤯. Can you imagine games with assets generated on the fly? ✈️ Try out their official demo here and check it out for yourself!
We are excited to introduce Stable Fast 3D, Stability AI’s latest breakthrough in 3D asset generation technology. This innovative model transforms a single input image into a detailed 3D asset in just 0.5 seconds, setting a new standard for speed and quality in the field of 3D
4
44
157
TRELLIS may be the biggest thing to happen to image-to-3D yet 🖼️ → 🧊. Microsoft simply cooked it SOTA, free and open source and casually dropped on @huggingface last week 🤗. ✶⋆.˚. try it for yourself ⋆✴︎˚。⋆ · .
9
30
153
Two days ago, @stabilityai quietly released CosXL and CosXL Edit, fine-tuned SDXL models that can produce full color range images ⬛⬜. You can now try them out on @huggingface! 🕹️. ▶️
6
35
145
The community has uploaded more than 7000 Flux[dev] LoRAs to @huggingface 🤗🎊. Browse them all 🔍 and test them out for free 🖼️ ✨. ▶️
7
35
145
The @huggingface Hub now has `model templates`: instead of a blank `/new` page, you get a page tailored towards uploading a specific kind of model 📙🎨. The first model template is one of the most requested: SD LoRAs! Share it with your fine-tuner friends 🤗.
7
26
136
Würstchen: a new, trained from scratch high res (1024x1024) model by @dome39931447. Inference is at a fraction of SDXL. And trained with 6x less compute than SD1.4. Quality trade-offs 🤔? Try it for yourself! PS: this video is not sped up!
4
32
133