Playing with CLIP embeddings of clips (sequences of images)
Each row represents a (CLIP embed of) webcam capture
This uses @xenovacom's transformer.js to run CLIP in the browser, allowing interactive client-side experiments.
Adding a mousemove to allow me to scrub over the
@mattdesl I'm super excited to see what generative artists / javascript hackers build integrating with things like this..
I built a quick prototype using @p5xjs
You draw on the canvas - and it re-runs @SimianLuo's LCM after mouseup...
I have a little more info here:
Exploring creative tools around text SAEs - a fun collab with @MatthewWSiu as part of a summer residency at
Inspired by @graycrawford / @thesephist
Visualizing a grid of active features for each token, letting you scrub over them to
I ported @bruno_simon's "my room in 3d" @threejs demo to #webxr #VR
The scale is hilarious - you are scaled to 2' tall!
It felt like Alice In Wonderland
And made me wonder: do toddlers experience the world at a different scale, or a different height?
After reading and thinking hard about the first four chapters in @rasbt's transformers book, I thought I was getting somewhere.
Then last night I listened to @NeelNanda5's "what is a transformer pt 1" -
In it he says on logits: "for every prefix of the
I hacked the @huggingface transformers library to let me LERP between prompts :)
prompt_1: Black whippets are
prompt_2: White whippets are
mix p: 0.5
completion: also known as "snow whippets"
Unclear if this is useful, but after reading about transformers / watching @karpathy -
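The mixing step itself is just a linear interpolation over the prompt embeddings. A toy numpy sketch, with small made-up arrays standing in for the real embedding tensors (names and shapes are mine, not the hacked transformers code):

```python
import numpy as np

def lerp(a: np.ndarray, b: np.ndarray, p: float) -> np.ndarray:
    """Linear interpolation between two embedding tensors."""
    return (1.0 - p) * a + p * b

# Stand-ins for the two prompt embeddings (real shapes would be
# something like [seq_len, hidden_dim]).
e1 = np.array([1.0, 0.0, 2.0])   # "Black whippets are"
e2 = np.array([0.0, 2.0, 2.0])   # "White whippets are"

mixed = lerp(e1, e2, 0.5)        # mix p: 0.5
assert np.allclose(mixed, [0.5, 1.0, 2.0])
```

The mixed tensor is then fed to the model in place of either prompt's embedding.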
A quick compare tool of SDXL vs Lightning vs Turbo
Lightning resembles the output of SDXL much more closely than Turbo does.
The 8 step model is much more useful than 1 step
Pre-generation of:
- 4 different seeds
- 1632 prompts (from Parti Prompt dataset)
- 6
Comparing @KaiberAI and @runwayml video alteration, I **really** like that Kaiber gives you a preview of the effect before it spends all the time/credits to generate the video
Kaiber AI just released a new feature where you can turn your video into a magical animation. Perfect for music videos, NeRFs, Blender scenes without textures, etc. I made a quick super beginner tutorial:
Not a surprise to @doctorow readers (read Radicalized / Unauthorized Bread):
we buy laser toner cartridges directly from @hp to minimize issues with counterfeits
It suddenly stopped printing, even b&w
Supply Status report prints colors as "used". Support says replace them all!
Excited to build this cute little robot - I'll try to convert it to use the @microbit_edu instead of the m5stack core (an #esp32-based device)
I purchased the model here:
I awoke realizing my idea was backwards
Instead of mapping my movements to a robot then sending video of the robot on zoom as my avatar, I could map movements of each person on the call to a robot and have no screen meetings
Zoom fatigue solution for conference calls 😂
We are already having lots of fun exploring fine-tuning of #sdxl on @replicatehq
This WIP by @cloneofsimo really captures the essence of @zeke at work
Looking forward to enabling both fine-tuning and LoRA
How the #SDXL video @zeke shared yesterday was made:
Idea: if changing the seed changes the generated image, what happens if you use the same seed but translate the initial noise latents before diffusion?
It is cool how the texture of the wall, shirt, and even the hair move with the
"Persistence of vision" (POV) displays are LED devices that compose images by displaying one spatial portion at a time in rapid succession. One example is hologram fans
the new @diffuserslib 0.15 release has too much goodness to fit into a single tweet... Adding to what @RisingSayak shared:
Faster load time:
Previous: 2.27 sec
Now: 1.1 sec
Very clean API for Multi-ControlNet:
from_pretrained(model_name, controlnet=[canny, posenet])
🧨 diffusers 0.15.0 is out, and we have taken it beyond the world of just image generation 📽🎙
Summary:
📽 Video generation pipelines
🎙 Audio generation pipelines
📈 New training scripts with improved support
📜 New docs
💫 Feature improvements
Deets:
Printer vendors are hostile to customers - all in the name of protecting me with an "Anti-Counterfeit and Fraud (ACF) Program"
I want a "counterfeit" printer.
I can buy a laser cutter from @AliExpress_EN
Is it time to buy a laser printer from them as well...
I'm generating 1632 "parti prompts" using Playground V2.5 and SDXL
Many of these prompts are challenging concepts/compositions.
I'm generating 4 different versions of each prompt, but after the first completed, it looked so interesting I wanted to share
I've been prototyping how to build "tiny apps" (cc. @schingler / @jmckenty)
Idea: make it so small/simple apps don't need infrastructure - but work with multiple users, offline, syncing between devices, no deploy.. a local-first "firebase"
Still early - using automerge datastore
Moving the mouse to change which of 4096 puppy images is displayed.
Using a single seed with @SimianLuo's Latent Consistency Model but rolling the original noise latents (shift with wraparound).
It is fun to see the puppy kinda move left / right in time with the mouse.
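The rolling trick can be sketched in a few lines; a toy numpy version, with a small fake tensor standing in for the real SD latents:

```python
import numpy as np

def roll_latents(latents: np.ndarray, dx: int) -> np.ndarray:
    """Shift the initial noise latents horizontally with wraparound.
    latents has shape [channels, height, width]; dx comes from mouse x."""
    return np.roll(latents, shift=dx, axis=-1)

rng = np.random.default_rng(42)           # fixed seed, as in the demo
latents = rng.standard_normal((4, 8, 8))  # toy stand-in for [4, 64, 64] SD latents
shifted = roll_latents(latents, 3)

# the values wrap around rather than being re-sampled
assert np.allclose(shifted[..., 3:], latents[..., :-3])
assert np.allclose(shifted[..., :3], latents[..., -3:])
```

Because the noise is the same, just translated, the generated image tends to translate with it rather than change entirely.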
Using @junkiyoshi's creative coding as a control for iterative stable diffusion :)
(I think I still prefer their original...)
Each frame:
- uses a frame from @junkiyoshi's animation as the control
- img2img with previous image (with a slight zoom)
- travels slowly between
I'm having so much fun with @fofrAI's work on making @SimianLuo's Latent Consistency Model fast and easy to run on my macbook 🚀
Works locally + building the tool in @p5xjs
whoa! Amazing hire by @Replit
Given @szymon_k's prior work and @Replit's work in everything from streaming GUIs over the web, to ML coding models, to super simple "datastores" for writing simple persistent webapps...
There is a lot of amazing potential
LERP between the SAE clip features of two images, based on @gytdau's awesome work -
i1 = load_image("")
e1 = embed_image(i1)
f1 = model.encode(e1)
i2 = load_image("")
e2 = embed_image(i2)
f2 = model.encode(e2)
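The step the pseudocode stops before is just a LERP over the two feature vectors; a toy sketch with made-up arrays (the decode back to a CLIP embedding is left out):

```python
import numpy as np

def lerp_features(f1: np.ndarray, f2: np.ndarray, p: float) -> np.ndarray:
    """Interpolate two sparse SAE feature vectors; the mixed vector
    would then be decoded back toward a CLIP embedding."""
    return (1.0 - p) * f1 + p * f2

# toy sparse feature vectors (real SAE features are mostly zeros)
f1 = np.array([0.0, 3.0, 0.0, 1.0])
f2 = np.array([2.0, 0.0, 0.0, 1.0])

f_mix = lerp_features(f1, f2, 0.5)
assert np.allclose(f_mix, [1.0, 1.5, 0.0, 1.0])
```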
Excited to release a research preview of Feature Lab, a playground where you can sculpt images by editing their features. It uses a SAE to extract the sparse features from an image embeddings model, and allows us to finely sculpt and tweak images.
Really awesome tutorial about @hundredrabbits' generative music sequencer #ORCA (meaning you write code and it creates sound):
After reading the contents of the left side, you can click into other tutorials via the header menu
Prototyping an app to capture / interact with thoughts
- incredibly-fast-whisper for transcription to notes
- llama3 (and chatgpt/claude?) to answer questions
My goal is a no-screen device to capture ideas during walks / middle of the night / when inspiration strikes. Using
I love @Wattenberger's "Getting creative with embeddings" aka "Yay Embeddings Math"
Has anyone done similar things for images?
One inspiring project is river by @maxbittker
My mac isn't the latest, but even at 2s per image, and as we get to 8fps, it opens up entire new worlds of exploration of latent spaces
this is img2img with whatever prompt is in the html input whenever the picture starts generating
More progress on the tiny robot
I printed everything (using the 0.2 profile in #cura). I had to do a lot of sanding / shaving to make things fit
While I’m waiting on the rest of the screws/bolts I was able to test the bottom servo mechanism using the @microbit_edu radio + accelerometer
Getting great results from the Consistency Decoder (VAE) released by @OpenAI as open source today.
It does take an extra second or so.
@diffuserslib is already working on integration!
I've created a model on @replicate that lets you compare (you can specify the seed and
Not only is @fermat_app doing amazing work creating new ways of building AI tools:
They have also shared a controlnet + sdxl + lora model on @replicate!
Here is @fofrAI's barbiediffusion lora with canny control:
Create your own AI Tools in Fermat 🔨
Only you know your unique needs. That’s why in Fermat you can create your own AI tools! 😯
Let’s take a quick look at how to do that using Gens 👇
git push origin add_jesse
It is official. I’m joining @replicatehq
I fell in love with cog (their open source project to build/run ml containers). And now I get to join the amazing team building it full time!
@minimaxir yeah! @mattt blogged about it here
Ha! I remember @simonw tweeting about logit bias and how it works with a notebook - turns out it was yours!
Perhaps I'm wrong here, but I think about grammars as logit bias ++
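Roughly what I mean: logit bias adds per-token offsets before sampling, and a grammar is like the extreme case of biasing every token the grammar rules out to -inf. A toy numpy sketch (the token ids are made up):

```python
import numpy as np

def apply_logit_bias(logits: np.ndarray, bias: dict) -> np.ndarray:
    """Add per-token biases to a logits vector; a grammar is like
    setting -inf on every token disallowed at this step."""
    out = logits.copy()
    for token_id, b in bias.items():
        out[token_id] += b
    return out

logits = np.array([1.0, 2.0, 0.5])
banned = {1: -np.inf}               # hypothetical banned token id
biased = apply_logit_bias(logits, banned)
assert np.argmax(biased) == 0       # token 1 can no longer be sampled
```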
@swyx I'm looking forward to having a "Greasemonkey"-like extension to un-sensationalize the news / standardize the passive/active voice in news
Can we use LLMs to remove some bias (or at least identify it)?
I got the last servo in the mail today!
Waiting on sandpaper and grease to help smooth out the vertical servo motion. And I need to figure out power for the @microbit_edu servo driver from @waveshare00
Other than that it is time to focus on the software side
Writing up how I'm exploring latent spaces of SDXL Turbo...
While it is fun to play with this latent space (try replacing `prompt_embeds` with `torch.randn_like(prompt_embeds)`), what
Copilot suggests "what's the point?"
🤣
No copilot, "what I really want to do is ..."
Testing all the servos!
Using #chataigne to send a sine wave to servos one at a time
Still running into problems with using more than one servo at a time. I’m going to try #micropython on the @microbit_edu - perhaps it will be fast enough? Or I might need to switch to #esp32
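The sine sweep itself is simple; a sketch of mapping time to a servo angle (the period/center/amplitude values here are just examples, not what Chataigne actually sends):

```python
import math

def sine_angle(t: float, period: float = 2.0,
               center: float = 90.0, amplitude: float = 45.0) -> float:
    """Map time t (seconds) to a servo angle sweeping 45..135 degrees."""
    return center + amplitude * math.sin(2 * math.pi * t / period)

# at t = 0 the servo sits at center; a quarter period later it peaks
assert sine_angle(0.0) == 90.0
assert abs(sine_angle(0.5) - 135.0) < 1e-9
```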
Playing Connect 4 with my youngest daughter, we were discussing variations of the game.
It led to a quick design in @tinkercad and now the @BambulabGlobal is #3dprinting this 3d column wrap-around version of Connect 4
PCA of PHI-2's token embeddings (I'm on a plane and I didn't bring my llama)
I'm trying to build an interactive version while flying to be with the rest of @replicate in SF
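For anyone wanting to try the same: PCA via SVD is a few lines of numpy. A toy sketch, with random vectors standing in for Phi-2's actual token embedding matrix:

```python
import numpy as np

def pca_2d(embeddings: np.ndarray) -> np.ndarray:
    """Project token embeddings onto their top-2 principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # SVD of the centered matrix gives the principal directions in vt
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
toy_embeds = rng.standard_normal((1000, 64))  # stand-in for Phi-2's real matrix
points = pca_2d(toy_embeds)
assert points.shape == (1000, 2)
```

Each row of `points` is then an (x, y) to scatter-plot.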
I think there's maybe really something here, look at this. I got GPT to respond to the prompt from this . Then I got a sequence of embeddings for each word in both the human- and GPT- authored essays. You can see how the human one moves around more.
The Mac report says "!" / "Other" and never finishes printing even B&W documents.
I have to throw away 3 color toner cartridges that cost over $100 each because of DRM bull...
I've had a lot of fun exploring watercoloring on the @EMSL #axidraw #plottertwitter
I'm using custom paths generated via JavaScript/paperjs. Refilling the brush periodically by returning to origin where the paint is located.
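The refill logic, sketched in Python for illustration (my actual paths are generated in JavaScript/paperjs; the names here are made up):

```python
def with_refills(strokes, strokes_per_refill=5, origin=(0.0, 0.0)):
    """Insert a trip back to the paint well at origin every N strokes."""
    path = []
    for i, stroke in enumerate(strokes):
        if i % strokes_per_refill == 0:
            path.append(origin)  # dip the brush
        path.append(stroke)
    return path

strokes = [(float(i), float(i)) for i in range(1, 7)]
path = with_refills(strokes, strokes_per_refill=3)
assert path[0] == (0.0, 0.0) and path[4] == (0.0, 0.0)
```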
This is a month of tidal data for Bolinas
the umap process results in some beautiful structures
"A very pretty structure emerges; this might be spurious in that it captures more about the layout algorithm than any “true” structure of numbers. However, the visual effect is very appealing and
"computational thinking"
"iterative design"
???
the process and other @websim_ai creators demonstrate is a skill and "computational thinking" doesn't seem to be quite the word.
To get the most out of these genai software systems,
I've been training a VAE to go from clip embeds down to 2 dimensions.
The idea is to let you explore a large set of embedded images by creating an autoencoder to go from 2d (a map in X/Y) to 512 dimensions...
Now that I have journeydb embedded, I can explore other ways of
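The shape of the idea, as an untrained linear sketch (the real thing is a VAE with nonlinearities and a trained loss; these weights are just random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical weights; in the real version these come out of VAE training
W_enc = rng.standard_normal((512, 2)) * 0.05   # CLIP dim -> 2d map position
W_dec = rng.standard_normal((2, 512)) * 0.05   # 2d map position -> CLIP dim

def encode(clip_embed: np.ndarray) -> np.ndarray:
    return clip_embed @ W_enc          # an (x, y) coordinate on the map

def decode(xy: np.ndarray) -> np.ndarray:
    return xy @ W_dec                  # approximate CLIP embedding

embed = rng.standard_normal(512)
xy = encode(embed)
assert xy.shape == (2,)
assert decode(xy).shape == (512,)
```

Scrubbing the map means feeding mouse (x, y) through `decode` and finding the nearest embedded images.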
The oldest started building an interactive dance installation with @makeymakey and @scratch tonight
Now she is suggesting we need fewer windows for her projection
Whoa 🤯 getting 6 different hallucinated views of an image within 12 seconds
Finally I know what the back of "thinking face" looks like
This works for more than just emoji 🤣
I really like the ideas & momentum at @RoamResearch
Roam/Conor's unapologetically blunt tweets have caused me concern w/ the company (plus having a 2nd brain in a closed platform on a hosted cloud service w/ scale issues)
this podcast helped me understand more
After 6 chapters of Donella Meadows "Thinking In Systems"
❤️❤️❤️
"In the end, it seems that mastery has less to do with pushing leverage points than it does with strategically, profoundly, madly, letting go and dancing with the system."
❤️❤️❤️
Success? @BeakerBrowser #hyperdrive -> nodejs -http+websockets-> live updates in firefox/safari/chrome
GOAL: share live content with http users - without the content knowing about differing methods
Previous old school multipart hack was too brittle -
today's #FAIL: a quick POC of streaming changes made in @BeakerBrowser to a #hyperdrive over http using the ancient "multipart/x-mixed-replace" on the index.html.
whenever a "watch" on the drive triggers (contents of drive changes) the node app sends a new "page"
works, almost
Finishing up an application to hopefully spend some time learning / exploring at this summer
Really wishing most of my exploration over the last year wasn't a hodgepodge of tweet threads ... I should have done as @simonw recommends and POSSE : Publish (on
how you know your wife is CEO:
kiddos are pretending their stuffed animals are in a board meeting
I've been drafted to be the secretary
unfortunately I am under NDA so I can't share more about what is happening without triggering a timeout
reading details of how transformers work … having spent years using them … is an odd experience
It makes the fact that it works even more 🤯
Oh. Let’s just add this positional encoding. By literally adding it to token embeds
Perhaps this will make sense eventually. But it
A single #dreambooth model on @replicatehq using the latest controlnet-1.5-openpose template can generate 4 variations:
using latest controlnet-1.5-openpose template can generate 4 variations:
ControlNet: yes/no
img2img: yes/no
prompt: "portrait of cjw by van gogh"
seed: 42
See - built from a patch from @SabatinoMasala 🙏
Similar to @voooooogel creating weights by hand… (awesome blog post!)
I wonder if a transformer fine-tuned on “hop on pop”-level books could be small enough to have an interactive visualization of the circuits / activations / weights
Or perhaps scaling
@thesephist @mayfer @MatthewWSiu @graycrawford
I did a naive experiment with @gytdau's SAE - finding pixel art features (to search / direct)
steps:
1. take 2 very different pixel-art-style images
2. get top SAE features of the clip embed
3. find shared features + magnitudes
then I was able to go into
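Steps 2-3 amount to a top-k intersection; a toy numpy sketch with made-up activation vectors:

```python
import numpy as np

def top_features(acts: np.ndarray, k: int) -> set:
    """Indices of the k strongest SAE feature activations."""
    return set(np.argsort(acts)[-k:])

# toy activation vectors for two pixel-art images
a = np.array([0.0, 5.0, 0.1, 3.0, 0.0])
b = np.array([0.2, 4.0, 0.0, 2.5, 0.0])

shared = top_features(a, 2) & top_features(b, 2)
assert shared == {1, 3}   # candidate "pixel art" features
```

The shared indices (and their magnitudes) are the features to search or steer with.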
I put an apple homepod next to the bed - now I can say in the middle of the night:
"hey siri remind me: X Y Z" and then I can go back to bed without opening a computer to write something down...
Now I have lots of reminders like:
- Create a feet of all examples
- Create a
Watching the final few papers that @johnowhitaker walked through in
The last paper was DSPy -
One of the techniques in DSPy is fine-tuning a predictor on data generated earlier.
I talked last week with @JoeEHoover
Thanks to @agirisan for recommending the open source program Chataigne to control the robot
A simple forever loop reads servo commands from the serial line on the @microbit_edu, and Chataigne can send serial commands based on a sequence (curves)
Adding a 2nd sequence for leg fails. Tomorrow?
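The micro:bit side is just a loop parsing lines off serial; a desktop-Python sketch of the parsing (the "channel,angle" wire format is my assumption, not necessarily what Chataigne sends):

```python
def parse_servo_command(line: str) -> tuple:
    """Parse a 'channel,angle' line as it might arrive over serial.
    Hypothetical format: e.g. '3,90' means servo channel 3 to 90 degrees."""
    channel_str, angle_str = line.strip().split(",")
    channel, angle = int(channel_str), int(angle_str)
    if not 0 <= angle <= 180:
        raise ValueError(f"angle out of range: {angle}")
    return channel, angle

assert parse_servo_command("3,90\n") == (3, 90)
```

On the micro:bit this would sit inside a `while True:` reading `uart`/`input` and driving the servo PWM.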
"realtime" CLIP (or siglip) image embeds in the browser thanks to transformerjs by @xenovacom
Steps:
- create a video element, ask for camera, attach to video (you can css hide if you want)
- use requestAnimationFrame to grab a frame from the video and process/embed it
-
My youngest daughter gives amazing hugs.
I asked her for a hug because it has been a hard week. Afterwards she said I love you and asked if I had lunch
She is so smart. 7 years old and better at life than I am
Exploring hypermerge by @hirodusk, @pvh and others at @inkandswitch
They created "pushpin" as a way to hack on the platform -
Playing with youtube (idea: add comments as you watch with deep hyperlinks to time in video)
Connecting @p5xjs with @replicatehq
Clicking the canvas captures a frame -> #sdxl img2img
It currently doesn't support CORS, so I can't use the official javascript library or endpoints...
I made a @Replit nodejs proxy to add auth & cors...
The notebook is pretty fast to run on colab! After installing deps / downloading SDXL base weights, each image took roughly 1-2x the normal SDXL inference time (I tested on the A100)
Playing with this, I thought .. "hmm, I wonder if we can use inversion to edit existing images"
Is it a bird??
Going through @jeremyphoward's #fastai course again. Now that I work at @replicate I'm deploying it on replicate..
It is currently a little painful to build on google colab and then push to replicate, but I'll work on better ways to do this as I work through the
Every few years, Tokyo is kissed by rain that drips neon light. An urban enigma, wrapped in spectral hues, it’s the city’s glowing secret whispered to the night.
#zeroscope - followed the guide from @fofrAI -
Seems like just yesterday I was asking you to come help us taking space selfies.
It was great working with you again.
It is particularly awesome you went to Basecamp. I owe a lot to @dhh and the rest of the rails ecosystem
This @StabilityAI SVD of a painting of a wave worked pretty well
I find a majority of the more artistic/stylistic initial images result in a boring pan/zoom of the image with no movement
@simonw The existence of PUT, PATCH provides the benefit of reminding me to take a deep breath and pause and reflect on life
before diving into api docs just to determine how this site decided to implement create vs update vs partial update...
I played with this prototype by @mattdesl during my flight. Really neat how well it fits together.
Svelte was easy enough to understand to tweak to add repetition and a randomize button
Evaluating Svelte3 as the backend for a canvas-sketch editor, it's amazing how easy it is to build out complex UIs in a few small files.
Definitely feels like the framework of the future.
Great fun today talking with @genmon at unoffice hours.
I have been sick and so I was anxious I wouldn’t have anything to talk about… there were many stones left unturned (we didn’t even talk about poem or other interesting ideas and projects)
I am inspired to start doing the