Ostris
@ostrisai
Followers
6K
Following
2K
Media
227
Statuses
967
AI / ML researcher and developer. Forcing rocks to think since 1998. Patreon - https://t.co/ew5M6LdJkO ML at https://t.co/IhAXPimwqM - @heyglif
Denver, CO
Joined August 2023
Did a lot of testing on my LoRA training script for @bfl_ml FLUX.1 dev model. Amazing model! I think it is finally ready. Running smoothly on a single 4090. Posting a guide tomorrow. Special thanks to @araminta_k for helping me test on her amazing original character and artwork.
35
97
680
I teared up a bit. I am extremely excited, but also feel completely inadequate in literally everything I have ever worked on. Ever. It is absolutely stunning and humbling to watch. I need a drink.
Prompt: “A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.”
24
29
667
Testing Flux.1 schnell embedding interpolation. I told Claude some details about my life and had it generate 40 prompts from being born -> being dead using the data I gave it. Then I interpolated between the embeddings for this video. Uses @araminta_k's softpasty LoRA.
30
84
645
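For anyone wondering how the interpolation above works mechanically: encode each prompt once, then lerp between neighbouring embeddings and render a frame per step. Below is a minimal sketch assuming the diffusers FluxPipeline API; the prompts, frame count, and seed are placeholders, not the actual 40 Claude-generated prompts.

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
).to("cuda")

# Stand-ins for the 40 life-story prompts
prompts = [
    "a newborn baby in a hospital nursery",
    "a kid building a computer in a garage",
    "an old man watching a sunset over salt flats",
]

# encode_prompt returns (prompt_embeds, pooled_prompt_embeds, text_ids)
encoded = [pipe.encode_prompt(prompt=p, prompt_2=p) for p in prompts]

frame = 0
frames_per_pair = 8
for (emb_a, pool_a, _), (emb_b, pool_b, _) in zip(encoded, encoded[1:]):
    for t in torch.linspace(0, 1, frames_per_pair):
        image = pipe(
            prompt_embeds=torch.lerp(emb_a, emb_b, t.item()),
            pooled_prompt_embeds=torch.lerp(pool_a, pool_b, t.item()),
            num_inference_steps=4,  # schnell is a 1-4 step model
            guidance_scale=0.0,
            generator=torch.Generator("cuda").manual_seed(42),  # fixed noise across frames
        ).images[0]
        image.save(f"frame_{frame:04d}.png")
        frame += 1
```

Keeping the seed fixed means only the conditioning changes between frames, which is what makes the output read as one continuous morph.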
New Stable Diffusion XL LoRA, Ikea Instructions. SDXL does an amazingly hilarious job of coming up with how to make things. Special thanks to @multimodalart and @huggingface for the GPU grant!! HF -> Civitai ->
13
78
474
I take this back. I managed to squeeze LoRA training for FLUX.1-schnell onto a single 4090 with 8bit mixed precision. We will see how well it works. 3s/iter.
This can be optimized further. I think you could maybe do mixed precision 8bit quantization on the transformer (maybe). But, no matter how optimized it gets, I don't think it will ever be possible to train on current consumer hardware (<=24GB). Someone please prove me wrong.
13
26
251
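The trick that makes this fit in 24GB is keeping the frozen base weights in 8-bit while only small bf16 LoRA factors get gradients. Here is a rough sketch of that pattern with bitsandbytes; the rank, init, and wiring are illustrative, not ai-toolkit's actual implementation.

```python
import torch
import torch.nn as nn
import bitsandbytes as bnb

class LoRALinear8bit(nn.Module):
    """Frozen int8 base weight plus a trainable bf16 low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 16, alpha: int = 16):
        super().__init__()
        # Base weights quantize to int8 when moved to CUDA, roughly
        # halving VRAM versus keeping them in bf16
        self.base = bnb.nn.Linear8bitLt(
            base.in_features, base.out_features,
            bias=base.bias is not None,
            has_fp16_weights=False, threshold=6.0,
        )
        self.base.weight.data = base.weight.data
        if base.bias is not None:
            self.base.bias.data = base.bias.data
        self.base.requires_grad_(False)
        # Only these two small matrices are trained
        self.lora_down = nn.Linear(base.in_features, rank, bias=False, dtype=torch.bfloat16)
        self.lora_up = nn.Linear(rank, base.out_features, bias=False, dtype=torch.bfloat16)
        nn.init.normal_(self.lora_down.weight, std=1.0 / rank)
        nn.init.zeros_(self.lora_up.weight)  # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = x.to(torch.bfloat16)
        base_out = self.base(h).to(torch.bfloat16)
        return base_out + self.lora_up(self.lora_down(h)) * self.scale
```

Wrap the transformer's linear layers with this, hand only the LoRA parameters to the optimizer, and the bulk of the model sits in VRAM at one byte per weight.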
I haven't spoken much about my ongoing project to de-distill schnell to make a permissive licensed version of flux, but I have been updating it periodically as it trains. I just noticed it is the #2 trending text-to-image model on Hugging Face. Working on aesthetic tuning now.
13
28
248
I just released a new IP adapter for SD 1.5 I'm calling a Composition Adapter. It transfers the general composition of an image into a model while ignoring the style / content. A special thanks to @peteromallet; it was their idea. Samples in 🧵.
12
49
216
It won't be long now. Note: This is not a traditional IP adapter. We are going directly from the SigLIP 512 base last hidden state into the attn layers. We are fine-tuning the entire vision encoder as well; it is a small encoder but has more than 2x the resolution of CLIP.
We're cooking up a FLUX.1-dev IP Adapter at @heyglif using SigLIP 512 base for the vision encoder. It is still cooking, but it is getting there. Current samples (left: input, right: output).
7
17
179
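To make "straight from the last hidden state into the attn layers" concrete, here is a hedged sketch of the decoupled-attention pattern IP adapters use: SigLIP's patch tokens get their own key/value projections, and their attention output is added to each layer's normal result. Dims assume FLUX.1-dev (3072 wide, 24 heads) and siglip-base-patch16-512 (768 wide, 1024 patch tokens); none of this is the actual @heyglif code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import SiglipImageProcessor, SiglipVisionModel

name = "google/siglip-base-patch16-512"
vision = SiglipVisionModel.from_pretrained(name)
vision.requires_grad_(True)  # the whole vision encoder is fine-tuned as well
processor = SiglipImageProcessor.from_pretrained(name)

class ImageKV(nn.Module):
    """Extra key/value projections feeding image tokens into one attn layer."""
    def __init__(self, dim: int = 3072, image_dim: int = 768, heads: int = 24):
        super().__init__()
        self.heads, self.head_dim = heads, dim // heads
        self.to_k_ip = nn.Linear(image_dim, dim, bias=False)
        self.to_v_ip = nn.Linear(image_dim, dim, bias=False)

    def forward(self, q: torch.Tensor, image_tokens: torch.Tensor, scale: float = 1.0):
        # q: (B, heads, Lq, head_dim); image_tokens: (B, 1024, image_dim)
        b, n = image_tokens.shape[:2]
        k = self.to_k_ip(image_tokens).view(b, n, self.heads, self.head_dim).transpose(1, 2)
        v = self.to_v_ip(image_tokens).view(b, n, self.heads, self.head_dim).transpose(1, 2)
        # Caller adds this on top of the layer's normal attention output
        return F.scaled_dot_product_attention(q, k, v) * scale

# No resampler: the raw last hidden state is the conditioning
# (image_pil is any PIL image)
pixels = processor(images=image_pil, return_tensors="pt").pixel_values
image_tokens = vision(pixels).last_hidden_state  # (1, 1024, 768) at 512px, patch 16
```

At 512px with 16px patches you get 32x32 = 1024 image tokens; a 224px CLIP ViT-L/14 yields only 256, which is the resolution gap the tweet mentions.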
I set up a 🤗 HF space if you want to test out Flex.1-alpha without downloading it.
I am happy to release the next iteration of OpenFlux, Flex.1-alpha. Flex.1 is an 8B param model with an Apache 2.0 license. It has been trained a significant amount since OpenFlux, and has a fancy new guidance embedder. Still a WIP. More to come.
6
24
126
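If you would rather run it locally than through the Space, something like this should work, assuming the release loads through diffusers' Flux pipeline (the repo id is taken from the release name; treat the exact snippet as a sketch):

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("ostris/Flex.1-alpha", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # helps the 8B model fit on a 24GB card

image = pipe(
    "a photo of a red fox in the snow",
    guidance_scale=3.5,      # consumed by the new guidance embedder
    num_inference_steps=28,
).images[0]
image.save("flex_test.png")
```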
I have had Twitter for a year, as of today. Somehow, I have acquired close to 4k followers from almost exclusively posting about ML experiments I am working on. I never expected anyone to be interested in my work. Thank you all for nerding out with me this year. #MyXAnniversary
4
0
109
They had H200s available on @runpod_io this morning. I have never seen them available and have never gotten to mess with one. I spun up a pod and I am transferring a dataset now. Excited to see how they perform.
11
3
100
This simple change allows you to use 4090s in a datacenter. Follow me for more life hacks.
@giffmana I just checked the GeForce license and it looks like they carved out an exception for crypto. So if I find a way to put this on the blockchain…
1
9
97
I added a post with the results of skipping each block in FLUX.1 dev here ->
Experimenting with skipping Flux blocks. First image is all blocks. 2nd image is skipping MMDiT blocks 3, 6, 7, 8, 9, 10, 13. With a little tuning, it could improve further. Prompt: a woman with pink hair standing in a forest, holding a sign that says 'Skipping flux blocks'
8
11
82
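Reproducing the skip experiment with diffusers is mostly a one-liner: rebuild the double-stream block list without the unwanted indices before sampling. The indices mirror the tweet; the rest is a sketch.

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

skip = {3, 6, 7, 8, 9, 10, 13}  # MMDiT (double-stream) blocks to drop
pipe.transformer.transformer_blocks = torch.nn.ModuleList(
    [b for i, b in enumerate(pipe.transformer.transformer_blocks) if i not in skip]
)

image = pipe(
    "a woman with pink hair standing in a forest, holding a sign that says 'Skipping flux blocks'",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("skipped_blocks.png")
```

The single-stream blocks live in pipe.transformer.single_transformer_blocks and could be pruned the same way.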
The SD1.5 version is probably done. Currently at 240k steps. I am trying to clean up fine detail, but it may have reached the limit of what synthetic data from the parent model can achieve. Will run it through the night on some high res fix images, which will hopefully help.
75k steps in on training the adapter for SDXL. First ~30k steps were just on the new conv_in/conv_out layers. Then I added the LoRA (lin 64, conv 32). It is going to be a while, but it is coming along.
5
3
61
I am training a Diffusion Feature Extractor with LPIPS outputs as targets now. It is still cooking, but I pulled an early version out and finetuned Flex.1-alpha for ~2k steps with it. It is really cleaning up the features, especially text. I am super excited about this.
Diffusion Feature Extractor: Training a model that can extract features from RF diffusion model outputs. You can use the features as an additional loss target. I tested an early version of it with a Flex.1 finetune. This is before/after 250 steps. It works too well. 🧵
6
7
68
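The way a feature extractor like this plugs into training is as an auxiliary loss next to the ordinary flow-matching objective. A conceptual sketch, with feature_extractor standing in for the Diffusion Feature Extractor described above (the real training loop will differ):

```python
import torch
import torch.nn.functional as F

def training_loss(model, feature_extractor, x0, t, cond, feat_weight=0.5):
    """Rectified-flow loss plus a feature-space loss on the implied x0."""
    noise = torch.randn_like(x0)
    t_ = t.view(-1, 1, 1, 1)
    x_t = (1 - t_) * x0 + t_ * noise   # RF interpolant between data and noise
    v_target = noise - x0              # RF velocity target
    v_pred = model(x_t, t, cond)

    flow_loss = F.mse_loss(v_pred, v_target)

    # One-step estimate of the clean latent implied by the prediction:
    # x_t - t * v_pred recovers x0 exactly when v_pred is correct
    x0_pred = x_t - t_ * v_pred
    with torch.no_grad():
        target_feats = feature_extractor(x0)
    feat_loss = F.mse_loss(feature_extractor(x0_pred), target_feats)

    return flow_loss + feat_weight * feat_loss
```

Because the extractor is trained against LPIPS-style targets, the second term pushes the model on perceptual detail (text, edges) that plain MSE on velocities under-weights.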
Be super careful with your FLUX.1 training captions people.
Everyone who said captions don't do anything for Flux is wrong because I captioned a dog as a "cat" and it was ONE picture, and now the model is beautiful except whenever I prompt cat or kitten it gives me dogs. Also shoutout to @comfydeploy for the awesome service.
2
3
59
Excellent work @multimodalart!
FLUX.1 ai-toolkit now has an official UI 🖼️ with @Gradio. With this open source UI you can, locally or on any cloud:
- Drag and drop images 🖱️
- Caption them ✏️ (or use AI to caption 🤖)
- Start training 🏃
No code/YAML needed 😌. Thanks for merging my PR @ostrisai 🔥
0
2
60
75k steps in on training the adapter for SDXL. First ~30k steps were just on the new conv_in/conv_out layers. Then I added the LoRA (lin 64, conv 32). It is going to be a while, but it is coming along.
Releasing my 16ch VAE (KL-f8-d16) today as well. MIT license, lighter weight than the SD3 VAE (57,266,643 params vs 83,819,683), similar test scores, smaller, faster, opener. I'm currently training adapters for SD 1.5, SDXL, and PixArt to use it (coming soon).
1
5
56
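Trying the VAE is a quick round trip in diffusers; the repo id below is inferred from the release name, so double-check it on the hub.

```python
import torch
from diffusers import AutoencoderKL
from diffusers.utils import load_image
from torchvision.transforms.functional import to_pil_image, to_tensor

vae = AutoencoderKL.from_pretrained("ostris/vae-kl-f8-d16").eval()

img = to_tensor(load_image("input.png")).unsqueeze(0) * 2 - 1  # scale to [-1, 1]
with torch.no_grad():
    latent = vae.encode(img).latent_dist.sample()  # f8 downsample, 16 channels
    recon = vae.decode(latent).sample
to_pil_image((recon[0] / 2 + 0.5).clamp(0, 1)).save("recon.png")
```

The 16 latent channels (vs 4 in the SD1.5/SDXL VAE) are what buy the better reconstructions at the same f8 spatial compression.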
I trained Marty McFly on 6 images for 1500 steps and 6 LoRAs on each image for 250 steps each, and merged them together at inference. In short, no, it does not work as well as training on all images at the same time.
I am curious if you would get similar results from training a LoRA on 10 images as you would training 10 LoRAs on single images with 1/10th the steps, and then merging them together. Has anyone tried anything like this?
9
1
50
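For reference, "merging" here means averaging the LoRA state dicts key-by-key before applying them, roughly like this (file names are placeholders):

```python
import torch
from safetensors.torch import load_file, save_file

paths = [f"marty_img{i}.safetensors" for i in range(6)]  # one LoRA per image
loras = [load_file(p) for p in paths]

merged = {
    key: torch.stack([lora[key].float() for lora in loras]).mean(dim=0)
    for key in loras[0]
}
save_file(merged, "marty_merged.safetensors")
```

One plausible reason the merge loses to joint training: each LoRA's weight delta is the product up @ down, and the mean of products is not the product of the means, so averaging six independently trained factorizations scrambles their solutions instead of pooling their training signal.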
Current training sample with this prompt from the 7B param version. It is still in the early stages and has a ways to go, but I am pretty happy with the current prompt comprehension after pruning out 5B params (41%) from the model.
I am honestly pretty happy with how OpenFLUX is turning out. I never expected to actually get it to where it is currently, and it still has a long way to go before it is where I want it to be.
3
4
48
First LoRA test I tried was with live action Cruella because in my pruning attempts, anatomy broke down. Plus that hair and her clothing are hard to learn. It worked out well. Final step training samples attached.
How did I miss this? Too many releases. @freepik Awesome work! It works out of the box with ai-toolkit. Training first test LoRA now.
5
0
46
@bfl_ml @araminta_k For personalization I also did a celeb test with Christina Hendricks since the model didn't seem to know her well. It works well on realism personalization. Attached is before and after training a LoRA on FLUX.1-dev on Christina Hendricks.
4
2
45
Tiny DiT: Images are coming through. It is basically going to need a full fine tune though. Debating committing to that because I REALLY want this. Entire model w/ TE is 646MB. Full fine tune takes 2.4GB VRAM and I can train at a BS of > 240 on a 4090.
Saturday experiment: Retrained xattn layers on PixArt Sigma to take T5 base (much smaller). It works surprisingly well. Merge reduced the number of blocks in the transformer from 28 to 10. Just popped it in the oven (full tune). Now we wait. Who wants a tiny DiT to play with?
4
6
41
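A hedged sketch of the surgery described above, using diffusers' PixArt Sigma pipeline: swap in T5-base, rebuild the caption projection for its 768-wide hidden states, shrink the block list, and mark only the text-facing layers trainable. Module paths follow current diffusers; the actual experiment merged blocks rather than truncating, and surely differs in detail.

```python
from transformers import AutoTokenizer, T5EncoderModel
from diffusers import PixArtSigmaPipeline
from diffusers.models.embeddings import PixArtAlphaTextProjection

pipe = PixArtSigmaPipeline.from_pretrained("PixArt-alpha/PixArt-Sigma-XL-2-1024-MS")

# Swap T5-XXL for the much smaller T5-base
pipe.text_encoder = T5EncoderModel.from_pretrained("google/t5-v1_1-base")
pipe.tokenizer = AutoTokenizer.from_pretrained("google/t5-v1_1-base")

transformer = pipe.transformer
transformer.requires_grad_(False)

# New caption projection: T5-base hidden size is 768, T5-XXL's was 4096
inner_dim = transformer.config.num_attention_heads * transformer.config.attention_head_dim
transformer.caption_projection = PixArtAlphaTextProjection(
    in_features=768, hidden_size=inner_dim
)

# Stand-in for the block merge: keep 10 of the 28 transformer blocks
transformer.transformer_blocks = transformer.transformer_blocks[:10]

# Retrain only the layers that see text
transformer.caption_projection.requires_grad_(True)
for block in transformer.transformer_blocks:
    block.attn2.requires_grad_(True)  # cross-attention
```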
New Stable Diffusion XL LoRA - "Super Cereal". Turn anything into a cereal box. Special thanks to @multimodalart and @huggingface for the compute! HF -> Civitai ->
2
6
43
@DMBTrivia @kohya_tech It was trained on thousands of schnell generated images with a low LR. The goal was to not teach it new data, and only to unlearn the distillation. I tried various tricks at different stages to speed up breaking down the compression, but the one that worked best was training with…
2
7
31
@McintireTristan Dataset is thousands of images generated by Flux Schnell. Fine tuning is not possible because it is distilled. What I am doing is intentionally breaking down the step compression to hopefully end up with a training base that we can train LoRAs on that will work with schnell.
8
1
35
"You will not use the Stability AI Materials or Derivative Works, or any output . to create or improve any foundational generative AI model" Is still in there. My understanding was that this is why @HelloCivitai refused to host it in the first place.
At Stability AI, we’re committed to releasing high-quality Generative AI models and sharing them generously with our community of innovators and media creators. We acknowledge that our latest release, Stable Diffusion 3 Medium, didn’t meet our community’s high expectations, and…
7
3
32