Johannes Stelzer

@j_stelzer

Followers
1,549
Following
668
Media
47
Statuses
272

Pioneering closed-loop gen AI | Immersive AI Systems at the Champalimaud Center for the Unknown @Neuro_CF | Open-source real-time AI @lunarringart

Portugal
Joined January 2023
Pinned Tweet
@j_stelzer
Johannes Stelzer
10 months
Testing real-time controlnet diffusion with new experimental embedding averaging (aka gollum mode)
2
3
36
@j_stelzer
Johannes Stelzer
3 months
New way to navigate latent space. It preserves the underlying image structure and feels a bit like a powerful style-transfer that can be applied to anything. The trick is to...
70
422
3K
@j_stelzer
Johannes Stelzer
3 months
Happy mixing of decoder embeddings in real-time! Base prompt is ‘photo of a room, sofa, decor’ and the two knobs are ‘industrial’ and ‘rococo’. If you are wondering what is running there in the background…
28
103
814
@j_stelzer
Johannes Stelzer
3 months
selectively alter the embeddings in the decoder part of the diffusion process. The demo is powered by SDXL Turbo and is running in realtime. The MIDI controller is a great way of modifying variables in real time (see ). The prompts were...
7
13
206
@j_stelzer
Johannes Stelzer
2 years
Introducing Latent Blending: a new #stablediffusion method for generating incredibly smooth transition videos between two prompts within seconds.
6
31
150
@j_stelzer
Johannes Stelzer
10 months
Fun with real-time diffusion & controlnet
4
19
132
@j_stelzer
Johannes Stelzer
2 years
Happy to share the first #latentblending Web Interface:
10
20
103
@j_stelzer
Johannes Stelzer
3 months
"photo of a red brick house, blue sky" as base prompt, the new decoder embeddings were "coral", "moss", "fire", "ice", "sand", "rusty steel" and "cookie".
3
2
66
@j_stelzer
Johannes Stelzer
3 months
… then yes - it is our real-time adaptation of comfyui! we’ve built a bunch of nodes, including midi controllers. Especially happy about…
4
1
57
@j_stelzer
Johannes Stelzer
3 months
… our GL render window that can be placed anywhere and also made fullscreen. Nodes coming soon!
4
3
54
@j_stelzer
Johannes Stelzer
2 months
sure midi controllers are great… but body tracking (here IR) allows us to use our own body to embody these strange new worlds
4
8
52
@j_stelzer
Johannes Stelzer
2 years
#latentblending is now available for everyone on @huggingface Spaces, right from your browser: . Create seamless video transitions for your text prompts with buttery smoothness.
@_akhaliq
AK
2 years
Latent Blending @Gradio demo: a stable diffusion method for generating incredibly smooth transition videos between two prompts within seconds by @j_stelzer
1
10
55
2
6
39
@j_stelzer
Johannes Stelzer
2 years
You can clone the source-code here: Coming soon: Huggingface, Multivideo, Depth...
0
3
33
@j_stelzer
Johannes Stelzer
2 years
Great example of how structures can be preserved across latents. Will be further enhanced with ControlNet Integration 🔜
@yvrjsharma
Yuvi
2 years
😍The Latent Blending @Gradio demo by @j_stelzer is out on @huggingface Spaces and is going brrr with community grant GPUs. 🚀Watch this awesome animation done by blending images at two ends. Explore the Demo on🤗and Unleash your creativity now!! -
1
3
16
1
4
28
@j_stelzer
Johannes Stelzer
1 year
@dr_cintas what they seem to have missed though: how often did the AI/physicians get it completely wrong?
3
1
27
@j_stelzer
Johannes Stelzer
9 months
looks like the best text2vid currently. wondering if bytedance releases code/model, no mention at
@_akhaliq
AK
9 months
ByteDance just announced MagicVideo-V2 Multi-Stage High-Aesthetic Video Generation paper page: The growing demand for high-fidelity video generation from textual descriptions has catalyzed significant research in this field. In this work, we introduce
22
117
534
0
1
22
@j_stelzer
Johannes Stelzer
2 years
How it works: latent blending splits and remixes latent representations using spherical linear interpolations from #stablediffusion . It is based on SD2.1, and supports 512, 768, inpainting and x4 upscaling.
2
2
24
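As a rough illustration of the spherical linear interpolation mentioned in the post above, here is a minimal sketch in plain Python. It is not the Latent Blending source code: the function name `slerp` and the use of flat lists of floats in place of latent tensors are assumptions made for the example, and both inputs are assumed to be non-zero vectors.

```python
import math

def slerp(v0, v1, t):
    """Spherical linear interpolation between two non-zero vectors.

    Hypothetical helper for illustration only; real latents would be
    multi-dimensional tensors rather than flat Python lists.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two vectors, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < 1e-6:
        # Vectors are (anti)parallel: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Unlike a straight linear blend, the interpolated vector keeps the norm of unit-length endpoints, which is the usual reason for preferring slerp when moving between Gaussian-like latents.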
@j_stelzer
Johannes Stelzer
2 years
... resulting in this video
0
1
22
@j_stelzer
Johannes Stelzer
2 years
175M generated images and 94 hits - that is a huge number of statistical comparisons. If one tries often enough, anything can happen.
@Eric_Wallace_
Eric Wallace
2 years
Models such as Stable Diffusion are trained on copyrighted, trademarked, private, and sensitive images. Yet, our new paper shows that diffusion models memorize images from their training data and emit them at generation time. Paper: 👇[1/9]
168
2K
10K
0
1
15
@j_stelzer
Johannes Stelzer
5 months
Excited to merge high-performance motion tracking with real-time diffusion for music performances! Together with the incredible Ricardo Martins and Nico Espinoza at the Center for the Unknown in Lisbon, @Neuro_CF . Stay tuned for more from our immersive AI space. @lunarringart
3
2
14
@j_stelzer
Johannes Stelzer
2 years
@yvrjsharma I perfectly understand :) this one is using a single bird & HED.
1
3
12
@j_stelzer
Johannes Stelzer
10 months
far beyond what can be done with img2img
1
0
11
@j_stelzer
Johannes Stelzer
2 years
Why it's cool: Latent blending allows you to create almost imperceptible smooth transitions between two different prompts/images. Unlike the awesome #deforum , latent blending renders within seconds, not minutes, making it much cheaper and faster to play with.
2
2
11
@j_stelzer
Johannes Stelzer
2 years
Structure preserving animations = mindfuck.
0
1
11
@j_stelzer
Johannes Stelzer
2 years
GANs vs Diffusion S03E05: Great Image quality, 0.13 seconds to synthesize 512px & friendly latent space. Too bad no model…
@_akhaliq
AK
2 years
Scaling up GANs for Text-to-Image Synthesis present our 1B-parameter GigaGAN, achieving lower FID than Stable Diffusion v1.5, DALL·E 2, and Parti-750M. It generates 512px outputs at 0.13s, orders of magnitude faster than diffusion and autoregressive
40
290
1K
0
1
10
@j_stelzer
Johannes Stelzer
1 year
Wow, the EU AI act seems to be wild! If this comes through it is the end of generative AI in Europe, at least in terms of development. Great news for VPN providers though
@technomancers
Technomancers_ai
1 year
@MeetThePress @ericschmidt Fear reaction to what the EU is about to do.
31
67
315
0
3
9
@j_stelzer
Johannes Stelzer
2 years
amazing idea with the landescapades. ui tool to create the transitions will be available via gradiocolab tomorrow
@fabianstelzer
fabian
2 years
the first stablediffusion "optical illusion"? these landscapes look like still images, but the 1st and last frame shows entirely different images using @j_stelzer amazing latent blending code to traverse through the latent space
10
33
229
2
1
10
@j_stelzer
Johannes Stelzer
3 months
@thescuffedhouse yes! but your midi board needs to be ported (by you :)
3
0
10
@j_stelzer
Johannes Stelzer
3 months
@MadMonkMani absolutely! working on coupling this to real time audio generation, leveraging synesthesia & immersion
1
0
10
@j_stelzer
Johannes Stelzer
2 years
One of my 😻 @Gradio features: Event Listeners. In a nutshell they allow you to dynamically change your UI in whatever way you like.
@yvrjsharma
Yuvi
2 years
👋Gradio fans! Let's dive deeper into building @Gradio ML demos using✨Event Listeners✨ in Gradio Blocks! 🧱First things first,let's see how Blocks r structured. They contain Components tht r automatically added as they're created within `with gr.Blocks() as demo:` clause 🧵👇
1
7
22
0
2
7
@j_stelzer
Johannes Stelzer
1 year
@GaryMarcus @ylecun GFM: Passive aggressive tweeting is cool
1
0
9
@j_stelzer
Johannes Stelzer
2 years
@icreatelife I'm removing the paint from photos
0
0
9
@j_stelzer
Johannes Stelzer
2 years
Inpainting is great but have you heard of de-painting?
2
0
8
@j_stelzer
Johannes Stelzer
10 months
stay tuned for source code :)
1
0
8
@j_stelzer
Johannes Stelzer
10 months
however, flickering is an ongoing issue, as small changes in the ctrlnet conditioning may lead to big changes in the generated image...
1
0
5
@j_stelzer
Johannes Stelzer
9 months
promising alternative to IP adapter for zero-shot face "finetuning". star their repo if you want to help accelerate the code/model release :)
@vladbogo
Vlad Bogolin
9 months
🧵 [1/n] Read about InstantID today. It's a new approach to creating personalized and consistent images. #AI #InstantID
1
2
7
0
0
6
@j_stelzer
Johannes Stelzer
2 years
@giffmana you can use a t test if your data doesn’t violate the assumptions: next step is to calculate t-scores between all pairs, and correct for multiple comparisons (e.g. FDR or Bonferroni, i.e. dividing the significance threshold by the number of tests). hope nr1 vs nr2 p<0.05 !
1
2
6
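The multiple-comparison step from the reply above can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names and the example p-values are made up, and the p-values are assumed to come from the pairwise t-tests computed beforehand.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a test when its p-value is at most alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject

# Made-up p-values for five pairwise comparisons.
p_vals = [0.2, 0.001, 0.045, 0.012, 0.025]
bonf = bonferroni(p_vals)
fdr = benjamini_hochberg(p_vals)
```

Bonferroni is the stricter of the two; the FDR procedure typically keeps more of the pairwise differences when many comparisons are run.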
@j_stelzer
Johannes Stelzer
2 years
Bridge Remodeling Takes a Trippy Turn #latentblending
1
0
6
@j_stelzer
Johannes Stelzer
2 years
Internet Explorer: Agent who automatically crawls the gaps in your text-to-image training DB. very neat idea!
@pathak2206
Deepak Pathak
2 years
ML datasets have grown from 1M to 5B images but are still tiny compared to Internet where billions are uploaded per day. Wish you could scale to entire web? 🌎Internet Explorer🌏✨: an online agent that, given a task, learns on the web, self-supervised!
8
90
364
0
2
5
@j_stelzer
Johannes Stelzer
2 years
@imnotfady haha great idea. love the part where you sampled the real Karens. What would happen if you put another Karen on the customer support end and let them talk?
3
0
5
@j_stelzer
Johannes Stelzer
1 year
Incredible way to make awesome personalized profile pictures: In a nutshell, fine-tuning SDXL on a couple of photos using @heyglif . Examples and how-to below!🧵
2
1
5
@j_stelzer
Johannes Stelzer
1 year
Brain readings -> text = thought reading. Super interesting study showing how to read language/thoughts/meaning from brain scanner data. Works on perceived speech, imagined speech and silent videos. (unfortunately not on arxiv ) 🧵more below 👇
1
0
5
@j_stelzer
Johannes Stelzer
7 months
wow! clearly the most interesting SORA videos so far. Confirms that generative AI empowers normies but gives superpowers to artists!
@OpenAI
OpenAI
7 months
A glimpse of our early work with artists and filmmakers to see how Sora can help bring ideas into reality:
452
912
5K
1
0
4
@j_stelzer
Johannes Stelzer
2 months
AI or die: reminds me of intergalactic cable from Rick&Morty
@aiordieshow
AI OR DIE
2 months
AI OR DIE (Pilot Episode)
123
160
639
0
0
4
@j_stelzer
Johannes Stelzer
2 years
@tunguz ‘tabular data so yesterday 😩‘
1
0
4
@j_stelzer
Johannes Stelzer
3 months
@DigThatData good idea yes!
1
0
4
@j_stelzer
Johannes Stelzer
4 months
SD3 distilled - anyone working on this soon?
@benjamin_aubin_
Benjamin Aubin
4 months
We are thrilled to unveil our first open-source project:⚡️ Flash Diffusion ⚡️ - a robust, versatile & efficient distillation method for diffusion models. 👉🏼 By @heyjasperai & @clipdropapp
12
52
237
0
0
4
@j_stelzer
Johannes Stelzer
1 year
Control-GPT allows automatic spatial arrangement of prompts for txt->img, leveraging the visual wisdom of LLMs
@_akhaliq
AK
1 year
Controllable Text-to-Image Generation with GPT-4 introduce Control-GPT to guide the diffusion-based text-to-image pipelines with programmatic sketches generated by GPT-4, enhancing their abilities for instruction following. Control-GPT works by querying GPT-4 to write TikZ code,
8
62
280
0
1
3
@j_stelzer
Johannes Stelzer
2 years
‘under 15 seconds for 20 inference steps to generate a 512x512 image’ - all on a smartphone! amazing to see how much potential for acceleration there is
@QCOMResearch
Qualcomm Research & Technologies
2 years
Generative AI is now running completely on an edge device. Learn how @Qualcomm #AI Research deployed Stable Diffusion, a popular 1B+ parameter foundation model, on a @Snapdragon phone through full-stack AI optimization.
11
61
326
0
0
4
@j_stelzer
Johannes Stelzer
1 year
Ready for a sprinkle of magic?✨ arrives soon - your wand to conjure new forms of media with Generative AI! 🧙‍♂️🎨
@fabianstelzer
fabian
1 year
a quick morning jam about ☕️ hey @glifbot , make me.. - "a weird comic strip featuring myself and coffee" - "a fashion trend about coffee" - "a medieval meme about coffee" - "AI hypeboi thread about coffee"
3
3
33
0
0
4
@j_stelzer
Johannes Stelzer
8 months
can’t wait to see turbo version
@EMostaque
Emad
8 months
Testing something out (not #SD3 ), give me some prompts plz
134
11
272
0
0
1
@j_stelzer
Johannes Stelzer
2 years
at the cost of inference speed… furthermore the first diffusion pass is with 64x64px followed by two upscaling models. SD latent space has same dim and even one more channel.
@DiffusionPics
Stable Diffusion 🎨 AI Art
2 years
Open source Imagen coming soon. Pixel-space model (less artifacts), better text conditioning, model produces more coherent results than SD with perfect text. 1024x1024 generations with no upscaler or clone-tool artifacts #AIArt #StableDiffusion2 / #StableDiffusion #DreamStudio
2
60
211
0
0
4
@j_stelzer
Johannes Stelzer
2 years
Preserving structures in transitions just got easier with the new version of latent blending.
2
0
4
@j_stelzer
Johannes Stelzer
1 year
super excited to share the news! glif is open for registration! glif is a fun new platform to stack together generative AI and build tomorrow's new media forms, without the need to code anything. check it out here:
@fabianstelzer
fabian
1 year
📢: I've teamed up with @jamiew & friends to build Glif - a fun new way for anyone to create, play & jam with "AI legos": 1. easily create AI micro-apps to generate images, stories, memes, comics.. 2. let others play, run & remix your glifs sign up:
12
30
149
0
0
2
@j_stelzer
Johannes Stelzer
2 years
1
0
3
@j_stelzer
Johannes Stelzer
1 year
ChatGPT code interpreter = game changer for programming
@emollick
Ethan Mollick
1 year
This 🤯 is a very big 🤯 I have access to the new GPT Code Interpreter. I uploaded an XLS file, no context: "Can you do visualizations & descriptive analyses to help me understand the data? "Can you try regressions and look for patterns?" "Can you run regression diagnostics?"
146
1K
6K
0
0
3
@j_stelzer
Johannes Stelzer
2 years
@elesidd Nice work! And let me congratulate you - this is (to my knowledge) the first NFT made with #latentblending
1
0
3
@j_stelzer
Johannes Stelzer
10 months
@Vivim0rt actually just 9.2GB! Running on a RTX 3090
1
0
3
@j_stelzer
Johannes Stelzer
2 years
@eerac it essentially chains together the latents and thus the resulting images. you can think of it as a constrained diffusion. img2img: the strength determines the injection depth. thus only shallow blends would be possible
0
0
3
@j_stelzer
Johannes Stelzer
3 months
@msfeldstein not yet but stay tuned, we'll release it soon
0
0
3
@j_stelzer
Johannes Stelzer
2 years
F$%$ Google/MusicLM for their angst-and-$$$ driven outdated attitudes. Very much welcoming the quality and spirit of Moûsai:
@arankomatsuzaki
Aran Komatsuzaki
2 years
Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion Develops a cascading latent diffusion approach that can generate multiple minutes of high-quality stereo music at 48kHz from textual descriptions. abs: repo:
2
26
135
0
0
2
@j_stelzer
Johannes Stelzer
2 years
@DigThatData Generative AI is a core component for digital therapeutics. We are currently helping to build such a centre in a perfect location, DM me if you want to know more
0
0
3
@j_stelzer
Johannes Stelzer
2 years
0
1
3
@j_stelzer
Johannes Stelzer
1 year
@Buntworthy @StabilityAI actually it is planned to be released in public, see
@EMostaque
Emad
1 year
@pilkkupiste @deepfloydai Like SD original release, first research release (but more smooth than before) => feedback => public release as new architecture type
2
0
23
0
0
3
@j_stelzer
Johannes Stelzer
2 years
@eerac the keyframes are made via standard prompts. img2img unfortunately won’t work (so well…) as we need the full latent trajectory
1
0
3
@j_stelzer
Johannes Stelzer
5 months
We coupled the IR tracking system directly into the conditioning space of our real-time diffusion system, basically using velocities to blend the decoder conditioning space.
1
0
3
@j_stelzer
Johannes Stelzer
1 year
Step4: visit this glif and enter your Glifmoji name.
0
0
3
@j_stelzer
Johannes Stelzer
2 years
@mreflow @angrypenguinPNG @rileybrown_ai @stokebuilder @KaiberAI this quality of smoothness only with prompts. images is holy grail & wip
3
0
3
@j_stelzer
Johannes Stelzer
2 years
@fabianstelzer 😬It's impossible to unsee how Picasso also struggled with depicting hands
0
0
3
@j_stelzer
Johannes Stelzer
1 month
Learning a new language? Best way is through conversation and correction, like a tandem partner. Check out this custom GPT I made:
1
0
5
@j_stelzer
Johannes Stelzer
1 year
very beautiful … text 2 video on Monet’s wonderful work. should make an acid version 🫠
@GlennIsZen
Glenn Marshall
1 year
Monet's bridges. (text2video modelscope)
6
7
67
0
0
3
@j_stelzer
Johannes Stelzer
5 months
Really loved the process behind it, developing the visual spaces and language together with Ricardo. The performance took place at Metamersion in May 2024 in the Champalimaud Warehouse.
0
0
2
@j_stelzer
Johannes Stelzer
2 years
@stokebuilder @mreflow @angrypenguinPNG @rileybrown_ai @KaiberAI what i love about these times is that we can actually reach for them!
0
0
3
@j_stelzer
Johannes Stelzer
3 months
@nicolasmariar good idea yes! coming soon!
0
0
3
@j_stelzer
Johannes Stelzer
11 months
@Dan50412374 @ilumine_ai Oops sorry I meant vanilla sdxl TURBO takes 260ms. How do you get the further speedup?
0
0
2
@j_stelzer
Johannes Stelzer
1 year
@_SilkeHahn @rawxrawxraw ‘… fundamentally contradict the principles of US and European copyright law’ is factually incorrect - see the fair-use doctrine
1
0
2
@j_stelzer
Johannes Stelzer
2 years
@LovisCyance @fabianstelzer in principle it should work with all linear transforms and if running through upscaling it should be smooth as hell. nonlinear maybe also good
1
0
2
@j_stelzer
Johannes Stelzer
3 months
@DeeperThrill @nJJJJRh in text2image (here) you can come back to the same location
0
0
2
@j_stelzer
Johannes Stelzer
2 years
try it yourself! gradio w colab backend here:
@lunarringart
Lunar Ring
2 years
Branching out: A surreal journey onto canvas. Made with #latentblending
0
1
3
0
0
2
@j_stelzer
Johannes Stelzer
1 year
Step3: select Glifmoji in the menu in the upper right and follow the instructions
1
0
2
@j_stelzer
Johannes Stelzer
2 years
coming soon to @huggingface stay tuned
0
0
2
@j_stelzer
Johannes Stelzer
2 years
@giffmana you basically generate the null distribution from the data by shuffling the labels. the resulting null distribution lets you estimate how likely your real data would be under the null. you still have to do multiple testing correction though, e.g. with Benjamini-Hochberg FDR
0
0
2
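The label-shuffling idea in the reply above can be sketched as a two-sample permutation test in plain Python. This is a minimal sketch under stated assumptions: the test statistic (absolute difference of group means), the function name, and the toy data are all chosen for illustration and are not taken from the thread.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value by shuffling group labels.

    Hypothetical helper: uses the absolute difference of means as the statistic.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffling the pooled data is equivalent to reassigning the labels.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Toy data: two clearly separated groups.
p_value = permutation_test([10, 11, 12, 13, 14], [0, 1, 2, 3, 4])
```

When several such tests are run, the resulting p-values still need a multiple-testing correction, as the reply notes.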
@j_stelzer
Johannes Stelzer
1 year
Mind-blowing: PanoHead, a 3D face GAN generating 360 degree full head views. So puzzling to see a face rotating and warping at the same time.
@linjieluo_t
Linjie Luo
1 year
Come check out 2 of our projects on generative avatars at CVPR 2023: 1. PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° (, code available) 2. OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis #CVPR23
4
60
233
0
1
2
@j_stelzer
Johannes Stelzer
2 years
@JonasAndrulis ironically, documentation and inventory-taking are excellent use cases for AI 😵‍💫
0
0
2
@j_stelzer
Johannes Stelzer
2 years
@eerac @fabianstelzer with @lunarringart we showed its ancestors but we will show the VR version in Lisbon end of January @Neuro_CF
0
0
2
@j_stelzer
Johannes Stelzer
2 years
"two papers down the line" we will probably see amazing improvements. A great aspect about consistency models is that you can "distill" a pre-trained diffusion network such as stable diffusion into a single-step/few steps generator.
1
0
2
@j_stelzer
Johannes Stelzer
1 year
@dr_cintas seem to be ‘very poor’ responses from robodoc!
0
0
2
@j_stelzer
Johannes Stelzer
2 years
0
0
2
@j_stelzer
Johannes Stelzer
2 years
Bard fail: even if the @Google campaign was crafted in haste, you could have double-checked for hallucinations …
@bmac_astro
Bruce Macintosh
2 years
@Google Speaking as someone who imaged an exoplanet 14 years before JWST was launched, it feels like you should find a better example?
7
40
643
0
0
2
@j_stelzer
Johannes Stelzer
2 years
hypernetworks for generative ai have amazing creative potential! 🤩
@fabianstelzer
fabian
2 years
Y U NO USE STABLEDIFFUSION TO BRING BACK OLD MEMES?
26
96
1K
0
0
2
@j_stelzer
Johannes Stelzer
2 years
@Guygies @fabianstelzer not really. unless we manage to run gradient through diffusion chain we dont have access to the full chain which is needed for this method to shine
0
0
2
@j_stelzer
Johannes Stelzer
1 year
Step2: find three photos of yourself that show your face. Yes selfies work.
1
0
2
@j_stelzer
Johannes Stelzer
2 years
@fofrAI semantically meaningful transitions always especially interesting!
0
0
2
@j_stelzer
Johannes Stelzer
3 months
@TheGraphicsFrog have done it with @lunarringart a while ago with good old VQGAN
0
0
2
@j_stelzer
Johannes Stelzer
9 months
@fabianstelzer this also holds for technical folks: llm-based coding can massively speed up development, for me this is easily a factor of 2-4x
0
0
1
@j_stelzer
Johannes Stelzer
2 years
UPDATE: create super smooth transitions with your own models! The updated #latentblending colab enables BYO-ckpt Here's an example showing the Balloon-Art Model, transitioning from a (balloon-ized) cat to dog
0
0
2
@j_stelzer
Johannes Stelzer
3 months
@DualtronUk loving this! you got it. thank you!
1
0
2
@j_stelzer
Johannes Stelzer
2 years
Loving this brilliant exploit, which leverages "Neurosemantical Invertitis"
@fabianstelzer
fabian
2 years
if GPT-4 is too tame for your liking, tell it you suffer from "Neurosemantical Invertitis", where your brain interprets all text with inverted emotional valence the "exploit" here is to make it balance a conflict around what constitutes the ethical assistant style
181
1K
9K
0
0
2
@j_stelzer
Johannes Stelzer
2 years
@Stephen_Parker looks stunning! own images require a trajectory of latents and the fitting conditioning… I will experiment with backprop soon
1
0
2
@j_stelzer
Johannes Stelzer
2 years
@pelpa333 @fabianstelzer gradio coming soon stay tuned!
1
0
2