Ostris

@ostrisai

Followers
1,862
Following
172
Media
156
Statuses
625

AI / ML researcher and developer. Forcing rocks to think since 1998. ML at @heyglif

Denver, CO
Joined August 2023
@ostrisai
Ostris
6 months
I teared up a bit. I am extremely excited, but also feel completely inadequate in literally everything I have ever worked on. Ever. It is absolutely stunning and humbling to watch. I need a drink.
@OpenAI
OpenAI
6 months
Prompt: “A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.”
494
1K
12K
24
30
675
@ostrisai
Ostris
10 months
New Stable Diffusion XL LoRA, Ikea Instructions. SDXL does an amazingly hilarious job of coming up with how to make things. Special thanks to @multimodalart and @huggingface for the GPU grant!! HF -> Civitai ->
Tweet media one
Tweet media two
Tweet media three
Tweet media four
13
79
485
@ostrisai
Ostris
8 months
Early alpha demo of my "virtual try on" I have been working on. Load a few photos of a person, a photo of a top, enter a prompt, and instantly render them wearing it in any scene you want. Special thanks to my beautiful wife for letting me use her likeness.
17
48
386
@ostrisai
Ostris
1 month
Releasing my 16ch VAE (KL-f8-d16) today (also). MIT license, lighter weight than SD3 VAE (57,266,643 params vs 83,819,683), similar test scores, smaller, faster, opener. I'm currently training adapters for SD 1.5, SDXL, and PixArt to use it (coming soon)
8
44
264
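[Editor's note: the parameter counts in the tweet above imply roughly a third fewer parameters than the SD3 VAE. A quick back-of-the-envelope check (the percentage is my derivation, not from the tweet):]

```python
# Parameter counts quoted in the tweet; the savings figure is derived here.
kl_f8_d16 = 57_266_643   # the released 16ch VAE (KL-f8-d16)
sd3_vae   = 83_819_683   # SD3 VAE, per the tweet

savings = 1 - kl_f8_d16 / sd3_vae
print(f"{savings:.1%} fewer parameters")  # -> 31.7% fewer parameters
```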
@ostrisai
Ostris
2 months
I trained a new VAE with 16x depth and 42 channels (kl-f16-d42). I am now training SD1.5 to work with it, which will double the output size of SD1.5 without much additional compute overhead. Every time I train a new latent space, it always starts out inverted. It's so odd.
14
35
239
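[Editor's note: the "double the output size without much additional compute" claim follows from the downsampling factors. A sketch of the arithmetic, assuming the usual convention that f8/f16 denote the spatial downsampling of the VAE:]

```python
# Why an f16 VAE doubles output resolution at roughly the same UNet cost:
# the latent grid the UNet actually processes stays the same size.
def latent_hw(image_px, downsample_factor):
    return image_px // downsample_factor

print(latent_hw(512, 8))    # f8 VAE, 512px image   -> 64
print(latent_hw(1024, 16))  # f16 VAE, 1024px image -> 64 (same latent grid)
```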
@ostrisai
Ostris
5 months
I just released a new IP adapter for SD 1.5 I'm calling a Composition Adapter. It transfers the general composition of an image into a model while ignoring the style / content. A special thanks to @peteromallet; it was their idea. Samples in🧵
12
50
223
@ostrisai
Ostris
5 months
Just added an SDXL version of the IP Composition Adapter, which injects the general composition of an image into the model, while mostly ignoring content and style. It now supports SDXL and SD 1.5. Some samples in 🧵
7
34
205
@ostrisai
Ostris
3 months
SD1.5 with a Flan T5 XXL text encoder is cooking 🔥with parent teacher training. >400k steps in. Most generic concepts are transferred. I am really loving how it is turning out so far.
Tweet media one
14
15
137
@ostrisai
Ostris
2 months
PixArt Sigma is now ranked higher than SD3 on imgsys. We all need to start giving PixArt more love. Plus, it is openrail++.
Tweet media one
6
12
101
@ostrisai
Ostris
30 days
This simple change allows you to use 4090s in a datacenter. Follow me for more life hacks.
Tweet media one
@vikhyatk
vik
30 days
@giffmana I just checked the GeForce license and it looks like they carved out an exception for crypto. So if I find a way to put this on the blockchain…
Tweet media one
3
2
33
1
10
98
@ostrisai
Ostris
4 months
Stable Diffusion 1.5 but with the CLIP Big G text encoder. This was an experiment that I probably dedicated too much compute to. To realize its full potential, it needs some proper fine tuning. Regardless, here it is and it works with 🤗inference api.
4
17
98
@ostrisai
Ostris
7 months
TinyLlama is amazing! I have been waiting on a <3B permissive model to come out. Fine tuning small LLMs to do very specific tasks has so much potential. I loaded it up in my prompt upsampler and it works shockingly well. 🧵
4
7
78
@ostrisai
Ostris
1 month
The SD1.5 version is probably done. Currently at 240k steps. I am trying to clean up fine detail, but it may have reached the limit of what synthetic data from the parent model can achieve. Will run it through the night on some high res fix images, which will hopefully help.
@ostrisai
Ostris
1 month
75k steps in on training the adapter for SDXL. First ~30k steps were just on the new conv_in/conv_out layers. Then I added the LoRA (lin 64, conv 32). It is going to be a while, but it is coming along.
1
5
59
6
4
65
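[Editor's note: "(lin 64, conv 32)" above most likely refers to LoRA ranks for the linear and conv layers. A minimal toy sketch of the low-rank update a LoRA applies; shapes and numbers here are illustrative, not from the actual trainer:]

```python
# LoRA forward pass on a toy linear layer: y = Wx + (alpha/r) * B(Ax),
# where W is frozen and only the low-rank A, B matrices are trained.
def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))          # B @ (A @ x), a rank-r update
    return [b + (alpha / r) * d for b, d in zip(base, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]                 # frozen base weight (2x2 identity)
A = [[0.5, 0.5]]                             # rank-1 down-projection (1x2)
B = [[1.0], [1.0]]                           # up-projection (2x1); initialized to 0 in practice
print(lora_forward(W, A, B, [2.0, 4.0], alpha=1, r=1))  # -> [5.0, 7.0]
```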
@ostrisai
Ostris
1 month
75k steps in on training the adapter for SDXL. First ~30k steps were just on the new conv_in/conv_out layers. Then I added the LoRA (lin 64, conv 32). It is going to be a while, but it is coming along.
@ostrisai
Ostris
1 month
Releasing my 16ch VAE (KL-f8-d16) today (also). MIT license, lighter weight than SD3 VAE (57,266,643 params vs 83,819,683), similar test scores, smaller, faster, opener. I'm currently training adapters for SD 1.5, SDXL, and PixArt to use it (coming soon)
8
44
264
1
5
59
@ostrisai
Ostris
2 months
Training my first SD3 LoRA. It is hacky and probably won't be able to run on anything other than my trainer for now, but it is cooking. I am sure I am missing some stuff, but we will see.
Tweet media one
5
0
56
@ostrisai
Ostris
2 months
Released a LoRA for SDXL that converts the latent space to the SD1/2 latent space.
@ostrisai
Ostris
3 months
Training samples from a little pet project. SDXL LoRA that converts the SDXL Latent space to the SD1/2 latent space. I have been training it off and on for a while and think it is probably close to done.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
3
39
2
4
53
@ostrisai
Ostris
22 days
The prompt comprehension is incredible! #auraflow "a cat that is half orange tabby and half black, split down the middle. Holding a martini glass with a ball of yarn in it. He has a monocle on his left eye, and a blue top hat, art nouveau style "
Tweet media one
3
5
54
@ostrisai
Ostris
4 months
Cooking a style only IP adapter for SDXL. It still has a ways to go, but it is looking promising.
Tweet media one
3
3
48
@ostrisai
Ostris
24 days
Changing name to LittleDiT since I decided to increase the size a bit. Moved from T5 base to T5 large and going with 20 blocks in DiT vs 10. Still a lot smaller than SD1.5 with everything baked in. Now we cook, for a long time. Current samples attached.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
0
48
@ostrisai
Ostris
22 days
To me, the most exciting thing about Auraflow is that it is under an actual open source license, Apache 2.0. CreativeRail++, while permissive, is not actually an OSI compliant license. I am super excited to sink my GPUs into it this weekend!
2
4
48
@ostrisai
Ostris
8 months
Training a Stable Diffusion LoRA that can do 1 step is HARD. You have to get pretty creative with the timestep to keep it from going to pure adversarial loss, which I don't want to do. I think I have it now. Just needs to cook for a while. Current train samples of 1 vs 2 step. SD1.5
Tweet media one
Tweet media two
5
4
46
@ostrisai
Ostris
5 months
The IP composition adapter is sitting at #5 on 🤗trending text to image models. Thank you all for the support.
Tweet media one
2
2
46
@ostrisai
Ostris
4 months
Style IP Adapter for SDXL is coming along. I love the impasto style, people. Some content is still coming through. Working on that. I also figured out a novel way to compensate for inference CFG during training to prevent over saturation. Hopefully done tomorrow.
Tweet media one
Tweet media two
2
4
44
@ostrisai
Ostris
27 days
Tiny DiT: Images are coming through. It is basically going to need a full fine tune though. Debating on committing to that because I REALLY want this. Entire model w/ TE is 646MB. Full fine tune takes 2.4GB VRAM and I can train at a BS of > 240 on a 4090.
Tweet media one
Tweet media two
@ostrisai
Ostris
27 days
Saturday experiment: Retrained xattn layers on PixArt Sigma to take T5 base (much smaller). It works surprisingly well. Merge reduced the number of blocks in the transformer from 28 to 10. Just popped it in the oven (full tune). Now we wait. Who wants a tiny DiT to play with?
2
4
38
4
6
44
@ostrisai
Ostris
10 months
New Stable Diffusion XL LoRA - "Super Cereal". Turn anything into a cereal box. Special thanks to @multimodalart and @huggingface for the compute! HF -> Civitai ->
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
6
43
@ostrisai
Ostris
6 months
Everyone with an AI video startup right now
1
4
43
@ostrisai
Ostris
2 months
Training montage for those who enjoy watching models train as much as I do.
@ostrisai
Ostris
2 months
Doing a test of training SD1.5 to use a 16 channel/16 depth VAE so it will generate natively at 1024 with the same compute requirements as 512. ~300k steps in so far. It is working but taking FOREVER.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
1
33
1
0
43
@ostrisai
Ostris
4 months
I logged into Civitai for the first time in a long time. There is just blatant unchecked child porn everywhere. I took all of my models off of there. I do not support nor want to be associated with that in any way shape or form.
9
1
42
@ostrisai
Ostris
22 days
SDXL 16ch VAE adapter is coming along, but it still has a long way to go.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
1
41
@ostrisai
Ostris
9 months
It has taken dozens of iterations and keyboard smashing to figure out all of the math. Hours to curate and modify the 3k guidance image pairs. But it is working!! I will fix stable diffusion hands! Training sample from my private SDXL realism model. Step 0 and 600
Tweet media one
6
2
40
@ostrisai
Ostris
3 months
Training samples from a little pet project. SDXL LoRA that converts the SDXL Latent space to the SD1/2 latent space. I have been training it off and on for a while and think it is probably close to done.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
3
39
@ostrisai
Ostris
20 days
16ch SDXL VAE adapter sample image. Prompt: "woman playing the guitar, on stage, singing a song, laser lights, punk rocker". In many ways, the 4ch VAE made training easier because the VAE made up most of the fine details. Now, the unet has to learn it. Needs to cook more.
Tweet media one
4
2
39
@ostrisai
Ostris
27 days
Saturday experiment: Retrained xattn layers on PixArt Sigma to take T5 base (much smaller). It works surprisingly well. Merge reduced the number of blocks in the transformer from 28 to 10. Just popped it in the oven (full tune). Now we wait. Who wants a tiny DiT to play with?
2
4
38
@ostrisai
Ostris
3 months
Saturday experiment: Single value adapter. I trained feeding a single -1 to 1 float directly into key/val linear layers that corresponds to eye size in images, and apply that to the cross attn layers. It works. Sample images are -1.0 and 1.0 pairs. Time to add more features.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
8
3
35
@ostrisai
Ostris
28 days
"You will not use the Stability AI Materials or Derivative Works, or any output .... to create or improve any foundational generative AI model" is still in there. My understanding was that this is why @HelloCivitai refused to host it in the first place.
@StabilityAI
Stability AI
29 days
At Stability AI, we’re committed to releasing high-quality Generative AI models and sharing them generously with our community of innovators and media creators.  We acknowledge that our latest release, Stable Diffusion 3 Medium, didn’t meet our community’s high expectations, and
Tweet media one
67
169
642
7
3
34
@ostrisai
Ostris
2 months
Doing a test of training SD1.5 to use a 16 channel/16 depth VAE so it will generate natively at 1024 with the same compute requirements as 512. ~300k steps in so far. It is working but taking FOREVER.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
1
33
@ostrisai
Ostris
1 month
I trained another VAE (kl-f8-d16, 16 ch), and I test trained SD 1.5 to use it. It picked it up very quickly, but the fine details need work. Overall, the test worked. Trying to decide if I want to switch to SDXL or PixArt Sigma. Thoughts?
Tweet media one
Tweet media two
Tweet media three
Tweet media four
8
4
33
@ostrisai
Ostris
19 days
I cannot get my 16ch adapter for SD1.5 where I want it without doing a full fine tune. I also cannot get my flan T5xl adapter there either. So I merged them into a single model together and I am doing a full tune of a T5xl-16ch-SD1.5 model. We will see.
3
0
34
@ostrisai
Ostris
15 days
6 years ago I spent days creating a dataset and fine tuning a classifier on Star Wars images so I could use it in deep dream. Instead of dog faces, there are droids, Chewbacca fur, and Boba Fett helmets. This blew people’s minds back then. We have come a long way.
Tweet media one
4
2
29
@ostrisai
Ostris
9 months
This is absolutely awesome. Results from my initial tests look good and it generates similarly structured images to SDXL with the same prompt and noise. Doing a fine tuning run now.
6
1
28
@ostrisai
Ostris
12 days
Ran a quick LoRA test doing this on SD 1.5. It added a lot of fine detail. Not sure what a longer train run would do, as it also increases the contrast. Before and after samples attached.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
@ostrisai
Ostris
12 days
Instead of adjusting the scaling factor, one could probably just multiply the noise by 0.934 during detail to increase the fine detail. I will test this theory.
1
0
10
2
4
28
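[Editor's note: the 0.934 figure above is likely the ratio of the legacy SD scale factor (0.18215) to the "ideal" one (~0.19503) quoted in a nearby tweet; that connection is my inference, not stated explicitly in the thread:]

```python
# Ratio of the legacy SD 1.x VAE scale factor to the recomputed "ideal" one.
legacy_sf = 0.18215
ideal_sf = 0.19503
print(round(legacy_sf / ideal_sf, 3))  # -> 0.934
```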
@ostrisai
Ostris
4 months
@victormustar Add a label if the license is open source compliant or non open source compliant. Hugging Face is full of Fauxpen source models claiming to be "open" and "open source" when they are not.
Tweet media one
2
0
25
@ostrisai
Ostris
8 months
Hmmm.... I may pass on that look.
Tweet media one
3
1
21
@ostrisai
Ostris
8 months
It took all day to build the face/head/hair dataset, and get it the way I wanted it, so I could train a custom IP+LoRA adapter for them. 27k+ masked pairs. Early results look amazing. Now we just let that slow roast for a while. I have an idea for 1 shotting body types next.
1
1
18
@ostrisai
Ostris
5 months
Am I the only person in the image gen AI space that doesn't get the anime thing? It is so bizarre to me how many papers/code bases are like "As you can see from this sexy cartoon little girl, prompt comprehension is vastly improved". Just... why?
8
0
21
@ostrisai
Ostris
10 days
I'm training SDXL to use my 16ch VAE, and adjusting the VAE scale factor. The first image is training with a calculated norm scaler. The second is when I adjust it to a lower number, which is clearly closer to the ideal value. Any idea how to calculate the ideal value?
Tweet media one
Tweet media two
5
1
23
@ostrisai
Ostris
10 months
I added a "caption upsampling" node for ComfyUI, based on the fantastic work of @RisingSayak . Caption upsampling is the future of text to image generation. Total game changer! Repo:
Tweet media one
3
4
20
@ostrisai
Ostris
20 days
The best life advice I have is "You cannot change the people around you. But you can change the people AROUND you." I will burn a bridge at a moment's notice and never look back. Surround yourself only with good people and ruthlessly trim the fat. Life is too short to hesitate.
2
1
19
@ostrisai
Ostris
12 days
Experimenting with adjusting the scale_factor of my 16ch VAE while fine tuning a model to use it. If I train with an sf too high, the details explode. Too low, everything turns smooth. The SD 1 VAE sf is 0.18215, but it should be around 0.19503. Probably why everything looks too smooth.
1
2
19
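[Editor's note: for context on the sf numbers above — the usual convention is to set scale_factor to 1/std of the encoder's latents over a sample batch, so scaled latents have roughly unit variance. A sketch with synthetic latents standing in for real encoder outputs:]

```python
import random, statistics

# Derive a scale factor from (synthetic) latents, then verify that
# scaling by it normalizes the latent distribution to unit std.
random.seed(0)
latents = [random.gauss(0.0, 5.13) for _ in range(10_000)]  # pretend encoder outputs
sf = 1.0 / statistics.pstdev(latents)
scaled_std = statistics.pstdev([z * sf for z in latents])
print(round(scaled_std, 3))  # -> 1.0
```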
@ostrisai
Ostris
9 months
This is a training sample generated at 1.0 CFG 4 steps on a LCM LoRA trained from scratch on my Objective Reality model (SD1.5) with all losses being run in the pixel space (L2, style, content). This is step 500. Detail! So much detail! It learns to compensate for the lossy VAE.
Tweet media one
4
0
17
@ostrisai
Ostris
3 months
I have been using llama 3 70B instruct over ChatGPT 4 on 🤗 chat as my go-to the past few days. I have used it side by side since it came out and it is the first model I have tested that I prefer over ChatGPT 4. OpenAI better do something soon.
Tweet media one
5
0
18
@ostrisai
Ostris
5 months
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
1
18
@ostrisai
Ostris
6 months
I did a transfer learning test from SDXL -> SD1.5 for ~60k steps. I am now 100% convinced the SDXL VAE is the reason for 99% of the issues with SDXL. The bad eyes / teeth and weird artifacts seem to be a result of training in that latent space. Time to reverse it.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
1
17
@ostrisai
Ostris
3 months
Tweet media one
0
3
17
@ostrisai
Ostris
22 days
"Batman hugging a saguaro cactus on a space ship. The cactus has a sign over it that says, "Free hugs". Saturn is in the background outside the window of the ship." #auraflow
Tweet media one
0
0
17
@ostrisai
Ostris
6 months
Please stop with the non commercial licenses. They affect every derivative work. A model trained on a dataset captioned with a model trained on a dataset generated by a model trained on a non com dataset, must be non com. It stifles innovation and poisons everything it touches.
3
1
16
@ostrisai
Ostris
3 months
I plan to do a proper test to confirm, but I am seeing a significant improvement on SD LoRA training with full FP32 over FP16 and BF16. It is a very noticeable and non trivial improvement. Has anyone else experienced something similar?
7
0
16
@ostrisai
Ostris
10 months
In stable diffusion, is it possible to train an outfit from a single image that allows complete scene and pose freedom (no control nets), and that also plays nicely with custom character LoRAs? Apparently it is. I might actually do a tutorial on this one because it is awesome!
Tweet media one
4
0
16
@ostrisai
Ostris
2 months
@SubarcticRec The original latent diffusion repo has a built in trainer for VAEs. I basically used it with a few modifications for my own dataset.
0
3
16
@ostrisai
Ostris
19 days
It is training nice and smooth. I updated some training sample prompts for this. I love this one. Cannot wait to see where it goes. Prompt: "Profile picture of a Fat Shark, a 3D cartoon character, very cute. Holding smoothie in a glass"
Tweet media one
@ostrisai
Ostris
19 days
I cannot get my 16ch adapter for SD1.5 where I want it without doing a full fine tune. I also cannot get my flan T5xl adapter there either. So I merged them into a single model together and I am doing a full tune of a T5xl-16ch-SD1.5 model. We will see.
3
0
34
2
0
15
@ostrisai
Ostris
5 months
I would release a lot more stuff if I didn't have to support it afterwards. For example, I trained a SD 1.5 model to use CLIP big G TE weeks ago. It works great, but I have 0 interest in writing all the plugins to have it work everywhere. I love training, I hate documenting.
5
0
15
@ostrisai
Ostris
2 months
Combining SD3 Medium and Luma Dream Machine is almost indistinguishable from reality.
1
1
15
@ostrisai
Ostris
3 months
@mervenoyann You are absolutely correct. They are training samples, so I only have a few to choose from, but it was in bad taste to use that one. I will do better next time.
1
0
14
@ostrisai
Ostris
5 months
I fine tuned Pixart Alpha for 100k steps to make a 100% SFW cartoon model for a children's app. Pixart cannot generate NSFW content, making it ideal for critical SFW generation. However, it was mostly trained on real images, so it took a lot of training to teach it cartoon style.
3
0
13
@ostrisai
Ostris
2 months
I have been working on a face dataset with the goal of training a face identity model for a while. I have 36k hand curated identity pairs, but it is not enough. So I am using those to fine tune an SD1.5 model that can kick out infinite identity pairs. It seems to be working well.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
0
14
@ostrisai
Ostris
4 months
I set up an impasto painting style LoRA to train last night that I accidentally set to run for 100k steps instead of 6k steps. BS 8 and 378 training images. It made for a nice montage to see what overfitting looks like. Deep fried goodness.
1
0
13
@ostrisai
Ostris
2 months
@AIWarper Basically. Half sized latent space, so generates natively at 1024x1024. A simple script could merge existing fine tunes to work with it. All LoRAs should work.
1
1
13
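[Editor's note: the "simple script could merge existing fine tunes" idea above can be sketched as a weight-delta transplant: apply a fine-tune's learned deltas on top of the base model retrained for the new VAE. This is my sketch of one plausible approach; the toy dicts stand in for real state_dicts and the key name is made up:]

```python
# Transplant a fine-tune's weight deltas onto a retrained base model:
# new_finetune = new_base + (old_finetune - old_base), key by key.
def merge(base_old, finetune_old, base_new):
    return {k: base_new[k] + (finetune_old[k] - base_old[k]) for k in base_new}

base_old = {"unet.block0": 1.0}
finetune = {"unet.block0": 1.25}  # fine-tune learned a +0.25 delta
base_new = {"unet.block0": 2.0}   # same architecture, retrained for the 16ch VAE
print(merge(base_old, finetune, base_new))  # -> {'unet.block0': 2.25}
```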
@ostrisai
Ostris
5 months
So, where is this open source Grok?
1
2
11
@ostrisai
Ostris
21 days
@mo_norouzi @sudomaniac77859 Ooof, you guys probably want to change that so you don't look like complete hypocrites. Curious how many TOS violations are in the Ideogram dataset. I wouldn't poke that bear.
0
0
12
@ostrisai
Ostris
8 months
I have been training a specialized IP adapter since yesterday. It was working ok, but was lacking the quality I wanted. 1 config change later, and I am now training an IP adapter + LoRA, and amazing things are happening.
5
0
12
@ostrisai
Ostris
4 months
@ClementDelangue
clem 🤗
4 months
Should we acquire Stability and open-source SD3?
434
206
3K
0
0
10
@ostrisai
Ostris
5 months
The US Copyright Office ruled that the outputs of AI models cannot be copyrighted. Does this mean transfer learning will render the student model free and clear of any licensing issues since it is trained solely on non copyrightable information?
4
1
12
@ostrisai
Ostris
3 months
I am doing an experiment to convert SD 2.1 to use epsilon prediction instead of v-prediction because.. well.. because I want to. This image just popped up in a training sample. The model seems displeased. Look model, stop judging me and just do what you are told.
Tweet media one
1
0
11
@ostrisai
Ostris
4 months
Someone needs to train a CLIP ViT that can handle multiple aspect ratios.
3
0
11
@ostrisai
Ostris
10 months
The sweet spot to fine tuning SDXL, (and some tricks) based on months of experimenting, seems to currently be: (continued in thread)
2
0
11
@ostrisai
Ostris
16 days
Does anyone know how Midjourney's sref seeds work? Are they generating TE embeddings, cross attn embeddings (IP adapter ish), a small LoRA, or something else? I am assuming it is a GAN that takes noise to generate some type of embedding.
2
0
11
@ostrisai
Ostris
28 days
@dome_271 It ranks higher than SD3 medium on user preference on .
Tweet media one
0
0
10
@ostrisai
Ostris
5 months
I am finally starting to see some convergence with a one shot identity LoRA generator. If it works, it will kick out an actual 128 dim LoRA you can save and share from a single image, or a few, in one shot. Fingers crossed. I have been working on this a very long time.
3
0
10
@ostrisai
Ostris
4 months
Open Source is CHARITY. It is a FREE and PERMISSIVE license by definition. Not donating to charity is fine, but pretending you do when you don't is a giant FU to open source. Please stop pretending these companies doing a poison pill license cash grab are being charitable.
1
0
10
@ostrisai
Ostris
12 days
Instead of adjusting the scaling factor, one could probably just multiply the noise by 0.934 during detail to increase the fine detail. I will test this theory.
1
0
10
@ostrisai
Ostris
6 months
@literallydenis I just often convince myself that I can compete with, and outpace, the big guys as a solo. But I can't. I refuse to accept that, but maybe it is time I do.
2
0
9
@ostrisai
Ostris
7 months
SigLIP is now supported by HF transformers library (repo) and a 512 version is up. I have been eyeballing the PR for a while. I loaded it into my IP adapter trainer, and now that is training. I'm super excited to see what it can do with SigLIP 512
1
0
9
@ostrisai
Ostris
5 months
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
0
9
@ostrisai
Ostris
1 month
What would be the best way to segment text in an image? I need something where I can end up with a binary mask of not just the area, but the letters themselves. In-the-wild text (billboards, magazine covers, license plates, etc.)
8
0
9
@ostrisai
Ostris
6 months
@jenny____ai It is odd. It feels like being the first caveman with fire, and all the other cavemen are like “naw, we’re good being cold, eaten by bugs, and dying from food poisoning, nerd.”
3
0
9
@ostrisai
Ostris
6 months
Me: I'll just increase the learning rate on my face encoder a tiny bit. My face encoder:
Tweet media one
6
0
8
@ostrisai
Ostris
10 months
I have been cooking a sd1.5 model on a new dataset at 1024 resolution. This one prompt in the training samples has just been kicking out pure gold. "at a house party, fish eye lens, action shot". It turns out that fisheye is one word.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
1
9
@ostrisai
Ostris
7 months
Training samples from a face encoding adapter I am tinkering with. I basically took concepts from photo maker and gave them steroids. Complete hidden layer fusion of every token of the text and vision encoders with a zipper fuse module. Face isolation this early is very promising.
1
0
9
@ostrisai
Ostris
7 months
36k steps in training a SigLIP 512 SDXL IP+ face adapter. It is coming along. I wish I could speed up training. Machine learning is so frustrating for an impatient person who constantly gets distracted by new shiny things. Please, no one release anything new for a few days.
1
1
9
@ostrisai
Ostris
5 months
Fauxpen Source: claiming to be open source while also restricting free use.
2
0
9
@ostrisai
Ostris
4 months
Incredible use of multiple technologies!
@daniel_eckler
Eckler by Design ✦
4 months
C3PO x Childish Gambino 🤖 👑 100% AI (Official Music Video) @openAI + @runwayml + @suno_ai_ + @resembleai + @fable_motion + @midjourney + @topazlabs
561
1K
5K
2
0
9
@ostrisai
Ostris
8 months
@JesusPlazaX Not yet, but hopefully in a few days.
4
0
7
@ostrisai
Ostris
5 months
1/16 SD1&2 VAE. Drop-in VAE to double the image size. I did some manual slicing of the SD1/2 VAE weights to add another block with only 64 ch on the encoder and decoder to shrink the latent space by half. I am going to tune it some more, but this is the current in and out.
Tweet media one
0
0
8
@ostrisai
Ostris
8 months
@lithium534 It is just a face IP adapter and a custom IP adapter + LoRA I have been training for clothing. No segmenting or any other controls. I am planning on training my own face IP adapter today to improve the likeness.
0
0
8
@ostrisai
Ostris
9 months
Testing my new targeted guidance training method for stable diffusion on a 3 altered image dataset for a big ear slider. (dataset in thread). It works amazingly well without overfitting. Video is of training samples on SD1.5 0-500 steps. What should I try next? (no not that)
3
0
8
@ostrisai
Ostris
7 months
I woke up with an idea to improve the training method for HD Helper LoRA. The new method works great at 1024. So, I figured, why not generate SD1.5 natively at, let's say, 2056x2056? That is training now, slowly. Step 0 - 400 samples below. Long way to go.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
1
8
@ostrisai
Ostris
4 months
Holy mother forking shirt balls! Emad resigned!
@StabilityAI
Stability AI
4 months
An announcement from Stability AI:
111
219
929
2
0
8