Gytis Daujotas

@gytdau

Followers
1,164
Following
245
Media
61
Statuses
449

making computers do my bidding

San Francisco
Joined April 2015
Pinned Tweet
@gytdau
Gytis Daujotas
18 days
Excited to release a research preview of Feature Lab, a playground where you can sculpt images by editing their features. It uses a SAE to extract the sparse features from an image embeddings model, and allows us to finely sculpt and tweak images.
15
37
271
@gytdau
Gytis Daujotas
2 months
Surprised at how stunning the illustrations in mech interp papers are. It must take so much sweat and toil to not only do incredible and novel research, but to choose to go beyond, and to make it beautiful too.
Tweet media one
Tweet media two
Tweet media three
6
27
323
@gytdau
Gytis Daujotas
3 years
We built a better way to read ebooks in browsers. Just finished processing every Project Gutenberg book on it! Please check it out and let me know what ya think:
Tweet media one
16
66
293
@gytdau
Gytis Daujotas
1 month
By training sparse autoencoders on generative image models, we discover an interpretable palette of world concepts
6
17
230
@gytdau
Gytis Daujotas
2 months
How does self-correction affect problem solving? In a toy transformer model that was trained to solve mazes, I found that performance reliably improved (!) by inserting mistakes and self-corrections into the training data.
14
34
204
@gytdau
Gytis Daujotas
28 days
Model interpretability techniques allow us to understand and control what models do. Here's how Sparse Autoencoders give us general, fine-grained control on image models:
9
20
191
@gytdau
Gytis Daujotas
1 month
Sculpting an image by mixing and tweaking features - a research preview I'm testing in private beta:
10
14
127
@gytdau
Gytis Daujotas
28 days
In my upcoming work, I trained a Sparse Autoencoder (SAE) on CLIP embeddings and found that they were highly steerable and interpretable. One of the best tests of feature interpretability is to change the feature activations and see what comes out the other end. Beyond labelling
3
2
44
@gytdau
Gytis Daujotas
2 years
Impressive how fast OpenAI APIs are becoming undifferentiated commodities
@scale_AI
Scale AI
2 years
Today at TransformX, we announced a huge step forward for the open source ML community: we are partnering with @StabilityAI to release the first large language model trained with human feedback. 1/4
5
79
492
2
3
36
@gytdau
Gytis Daujotas
28 days
This is exciting in images, because unlike language, images tend to be much easier for humans to parse and understand. Also unlike language models, many edits can be stacked in a way that avoids failure modes like gibberish or noise. These edits can be surprisingly deep, working
1
3
35
@gytdau
Gytis Daujotas
18 days
It’s still very early days. Not all features make immediate sense or are meaningful to humans, e.g. features to do with camera configuration, or the hundreds of features that represent text in the image. For this to work in a product, we’ll need to find ways to sort and filter
Tweet media one
1
5
35
@gytdau
Gytis Daujotas
4 months
Making two Claudes compose a piano piece together:
Tweet media one
4
3
31
@gytdau
Gytis Daujotas
2 months
Building an IDE for interpreting SAE feature activations. Really enjoying the near real-time streaming of activations while editing samples:
1
0
30
@gytdau
Gytis Daujotas
28 days
Blog post and a deep dive coming later this week!
3
0
30
@gytdau
Gytis Daujotas
18 days
Many of the features I found were surprisingly interpretable, and could be added or removed from images while maintaining generation quality and coherence. I found features for tabby cats and men with hats. There’s features for compression artifacts, background settings, and
Tweet media one
1
2
35
@gytdau
Gytis Daujotas
18 days
You can try this research preview today at this link. For more details, there's also an associated deep dive post. (And of course, the weights are open source too!)
Tweet media one
1
1
31
@gytdau
Gytis Daujotas
1 month
@horseracedpast Working on a post now! Will keep you updated :)
3
0
22
@gytdau
Gytis Daujotas
10 months
Moving to San Francisco has so far been excellent. Everyone is kind and open - and I would recommend it, even if only for a while.
1
0
22
@gytdau
Gytis Daujotas
18 days
Compared with standard prompt-based control, this allows for exciting user interfaces where a palette of concepts can be tweaked and added. It allows us to make very fine adjustments to images at varying levels of granularity. And even more importantly, we get to understand
1
1
22
@gytdau
Gytis Daujotas
2 months
Daily gratitude for being in a world where scrolls that look like this can actually be read through the sheer will of human ingenuity
Tweet media one
Tweet media two
0
1
17
@gytdau
Gytis Daujotas
17 days
Wonderful writeup! Really enjoyed playing with the Prism demo too - had a lot of fun doing precise semantic editing of language. Very intriguing possibilities!
@thesephist
Linus
17 days
🌈 Research blog: Text embeddings contain semantic and structural features that we can now automatically discover with SAEs, and use to build rich, interactive new information interfaces. In this write-up on my SAE experiments and a prototype called
Tweet media one
Tweet media two
Tweet media three
Tweet media four
10
63
473
1
1
16
@gytdau
Gytis Daujotas
18 days
In no particular order, thanks @DavidMcsharry3 , @MatthewWSiu , @thesephist , @mehranhydary , @maxkriegers , @NoaNabeshima , and many others for the feedback, advice, and support :)
1
0
17
@gytdau
Gytis Daujotas
28 days
There's a couple of advantages compared to interpretability on language models. In SAEs trained on transformers, the sparseness penalty discourages features that code for the overall shape and structure of the work. In this domain, image embeddings contain all the information in
Tweet media one
2
1
14
@gytdau
Gytis Daujotas
28 days
This prompt-free steering can help us discover more about what's inside our models, and have industrial applications in providing new forms of user interfaces for humans to use. We can also use features to apply human-composed rules from a few examples, like this guardrail that
1
1
13
@gytdau
Gytis Daujotas
2 months
You can read more of the writeup on my blog (in my bio)! Thanks @DavidMcsharry3 , @NeelNanda5 , and others for early reviews of this work.
2
0
10
@gytdau
Gytis Daujotas
2 months
This IDE streams from a cloud GPU using a SAE that I've trained on a NanoGPT instance that learned to solve mazes. I've started by working on transformers that play simple games - to figure out what techniques might transfer over to games that we understand imperfectly, like
1
0
7
@gytdau
Gytis Daujotas
2 months
I trained probes on the model and found that the residual stream has some representation of mistakes, but not a linear one. Future work could focus on other ways of levering down the mistake-making "feature" to make a model which is better all-around.
Tweet media one
2
0
8
@gytdau
Gytis Daujotas
4 years
@BrendanEich @ivanbuncic @runhappylife @getify That's Javascript Weekly, though it probably should instead be called JS Weekly.
0
1
8
@gytdau
Gytis Daujotas
17 days
Wonderful documentary on Robert Caro and Robert Gottlieb, his editor, and how they have worked together, quarreling and waging wars against one another, for over 50 years:
0
2
8
@gytdau
Gytis Daujotas
3 years
Over 200k words in my journal and counting. Next step is to train a language model, and start with a prompt like "Journal, today I discovered something incredible..."
1
0
7
@gytdau
Gytis Daujotas
17 days
@sksq96 Try visiting :)
0
0
7
@gytdau
Gytis Daujotas
2 years
working on generated interfaces!
Tweet media one
1
0
7
@gytdau
Gytis Daujotas
4 months
Enjoyed this post from Daniel Dennett: "These days I almost always outsource the hard work of comprehension when I encounter difficulties, and the policy works wonders—for me. Distributed understanding is a real phenomenon, but you have to get yourself into a community of
1
0
6
@gytdau
Gytis Daujotas
2 years
Working on better ways for people to use language models. Here's an interface that allows you to backtrack and wander through the space of creative possibilities:
1
0
6
@gytdau
Gytis Daujotas
7 years
Thanks @adeoressi for your invaluable advice at @SWDub #SWDub !
0
0
6
@gytdau
Gytis Daujotas
5 months
The previous Fitzwilliam meetup in Dublin was wonderful, and now -- look! -- an opportunity to attend the next in the series:
@Sam__Enright
Sam Enright
5 months
I am hosting a meetup next Friday the 23rd at 7:30pm in Parnell Heritage Pub in Dublin – hope you can make it!
Tweet media one
1
6
21
0
1
6
@gytdau
Gytis Daujotas
27 days
@thesephist Agreed! I am curious how we can get back some of the high-level features in autoregressive transformer LLMs. I am pleasantly surprised how easy embeddings are to manipulate, e.g. quite easily supporting hundreds of feature edits. I have a hunch that important features (like
0
0
5
@gytdau
Gytis Daujotas
3 years
Bored of having to say 'social media platforms' every time. Cooler, piratey sounding name: the shallows. Has mystique, sense of danger, lawlessness
1
0
5
@gytdau
Gytis Daujotas
4 months
@dogmadeath Here's a hilarious example of this, where two Claudes were composing music, and Claude 2 got a little bit too into it for Claude 1's taste:
Tweet media one
1
0
3
@gytdau
Gytis Daujotas
2 years
This is a 1-meter sized mirror used to make chips. If you were to blow this mirror up to the size of the Earth, the largest aberration would be the width of a human hair. (!!!)
Tweet media one
1
1
5
@gytdau
Gytis Daujotas
3 years
Wow: "The world’s largest, most valuable tech companies are dependent either directly or indirectly on the steady output of TSMC’s fabs. If [they] went offline [...] it would immediately devastate the global economy. "
2
5
5
@gytdau
Gytis Daujotas
5 months
@andy_matuschak A good friend recommended this high CRI floodlight, which I much prefer to the regular cheaper ones on Amazon:
Tweet media one
0
0
5
@gytdau
Gytis Daujotas
2 months
One small thing so far: top activations can be surprisingly deceiving! One example: this feature seems like it represents first moves, but it actually encodes location information too. Maybe still in superposition?
Tweet media one
Tweet media two
1
0
4
@gytdau
Gytis Daujotas
8 years
Well there goes my diet plan. Pizza at #StuSWDub @StuSWDub
Tweet media one
0
0
4
@gytdau
Gytis Daujotas
3 years
@atroyn @ArtirKel Three Body Problem is stunning, but I'm not sure if optimistic immediately comes to mind
2
0
4
@gytdau
Gytis Daujotas
3 months
Two instances of Claude collaborating on a Chopin-inspired barbershop tag:
1
1
4
@gytdau
Gytis Daujotas
2 years
@minney_cat @thesophiaxu @lisawehden +1 to this, Lisa and Minn are great!
0
0
4
@gytdau
Gytis Daujotas
10 months
@mark_cummins If so, I wonder if world models are a related area of research...
1
0
1
@gytdau
Gytis Daujotas
3 years
Something that has been unexpectedly good: getting a very cheap thinkpad, disabling the wifi drivers, and using it exclusively to journal. Makes getting into the flow of writing mysteriously far easier than using any other computer!
0
0
3
@gytdau
Gytis Daujotas
4 years
Keep burning myself by switching to promising but immature software. I'm going to just switch back to text files for note taking and see if I can make a more stable workflow work.
0
0
2
@gytdau
Gytis Daujotas
3 years
@tszzl @nickcammarata Maybe the visual system is going crazy spamming every primitive (edge, corner, colour) at every position, which is then interpreted by the higher order face detectors, explaining why the faces look so weird and impossible. But why faces out of all the other objects...
3
0
3
@gytdau
Gytis Daujotas
9 years
My @livecodingtv T-shirt just arrived. Pleasantly surprised, I thought it was just going to have their logo. Neat! http://t.co/bi1Xczm94G
Tweet media one
1
2
2
@gytdau
Gytis Daujotas
1 month
@kliu128 They're everywhere! Some easy tells, from previous research on this problem:
Tweet media one
0
0
3
@gytdau
Gytis Daujotas
2 months
This counterintuitive performance gain remains even at varying model sizes, and remains even though the model has to pay a penalty for learning to make mistakes.
Tweet media one
1
0
3
@gytdau
Gytis Daujotas
3 years
woo, go teams! Can't wait to see what they've built
@joinpatch_
Patch
3 years
🚨 Announcing: Patch Demo Day Join us over lunch on Wednesday, August 18th to see what our latest batch of teams has been building! Register here:
Tweet media one
3
5
22
0
0
3
@gytdau
Gytis Daujotas
2 years
Nobody (not even your doctors) tells you that you can pay $20 to get access to the forbidden medical diagnostic knowledge to actually figure out what's going on with you for all known ailments and problems:
0
0
3
@gytdau
Gytis Daujotas
8 years
Oh, neat! I'll try my best to come!
@SWDubUNI
SW Dublin University
8 years
Calling all students, #StuSWDub is back! 54hrs of startups, pizza & great craic ~ 4-6 Nov 2016. More info -> [ RT! ]
Tweet media one
0
10
12
1
0
3
@gytdau
Gytis Daujotas
3 years
Reading translations of the classics side-by-side makes the experience feel far more rich - we should have far more tools like this:
1
1
3
@gytdau
Gytis Daujotas
7 years
I just published “5 starting steps to clean code”
0
2
3
@gytdau
Gytis Daujotas
7 years
@CaseyGames Hey Jordan, do you want a logo for that podcast? I can help with that 😛
2
0
2
@gytdau
Gytis Daujotas
8 years
Excited to do the final pitch! Hopefully the laptop won't catch fire in the middle of the presentation 😁 #StuSWDub @StuSWDub
Tweet media one
0
3
2
@gytdau
Gytis Daujotas
9 years
@mscccc @ProductHunt @jacqvon Sweet! How did you choose the emoji?
1
0
2
@gytdau
Gytis Daujotas
1 year
When thinking about reliability, demos live in the 1st percentile, deployments live in the 99th percentile. Usually not much in between is useful, and the difference between the two can be surprisingly large
1
0
2
@gytdau
Gytis Daujotas
7 months
Colab is an excellent product. Something like Colab could be an entry point to cloud IDEs for the masses - with all the perks that brings (less env setup, faster artefact downloads, customizable machines, etc)
0
0
2
@gytdau
Gytis Daujotas
5 years
@bmann @vgr I'm more a Dynalist guy myself, but I am finding it's not great at representing a knowledge graph. I don't think transclusion is the main selling point for me as much as the wiki style linking.
0
0
2
@gytdau
Gytis Daujotas
10 months
@mark_cummins Profound question! Could it be that in the effortful reading style you are testing your 'world model'? i.e. when reading maths with effort, you have a notepad by your side and test your understanding via examples, counter-examples, answering your own Qs?
1
0
1
@gytdau
Gytis Daujotas
8 years
Just back from @CoderDojo 's brilliant community hackathon at @dogpatchlabs . Had a great time! 😀
0
2
2
@gytdau
Gytis Daujotas
9 years
I completely forgot about these! Thanks @notifuse :)
Tweet media one
1
1
2
@gytdau
Gytis Daujotas
8 years
The three food groups of an entrepreneur at @StuSWDub #StuSWDub
Tweet media one
0
2
2
@gytdau
Gytis Daujotas
4 years
@adamkelly2201 You tell em Adam. Let's just do away with the entire concept
1
0
2
@gytdau
Gytis Daujotas
2 months
@thesephist @DavidMcsharry3 @NeelNanda5 That’s a cool idea! We’ll need to label all the mistake moves in the training data, which is possible in a toy model, but difficult to apply in large internet scraped datasets. It would be so cool if you only needed a handful of examples of mistakes to train a probe or extract
0
0
1
@gytdau
Gytis Daujotas
7 years
@LittleJazzHands We're building a pomodoro time tracker for creatives. We'd love to have you as one of the first beta testers 😁
1
0
1
@gytdau
Gytis Daujotas
3 years
Looks like machine learning is going well
Tweet media one
1
1
2
@gytdau
Gytis Daujotas
8 years
Currently at #DojoCon nabbing all the free merchandise 😂 with @ChungWung @OisinODuibhir @ciaranpflanagan
Tweet media one
0
1
2
@gytdau
Gytis Daujotas
4 years
Even if you had no other information, knowing that McDonald's, an organisation which cares deeply about staying open and making more money, has chosen voluntarily to close down for an indefinite period of time should concern you greatly.
0
0
2
@gytdau
Gytis Daujotas
4 years
DCU is going to switch all of our exams to alternative assessments. This will be entertaining to see for sure.
0
0
2
@gytdau
Gytis Daujotas
1 year
Any examples of ML researcher + product builder + CEO being all the same person? Maybe a unicorn, but have heard this life plan from more and more friends!
0
0
2
@gytdau
Gytis Daujotas
3 years
@TheOisinMoran Good luck Oisin! What's the motivating factor for dropping the coffee, may I ask?
1
0
2
@gytdau
Gytis Daujotas
4 years
This is an excellent opportunity to get very valuable creative work done with little physical distraction for 6-18 months.
0
0
2
@gytdau
Gytis Daujotas
3 years
If the next few decades come with fast AI progress, then we can expect text/image generation to get really good, which will probably do something really weird to the way we experience social media
1
0
2
@gytdau
Gytis Daujotas
3 years
In ebooks vs printed books, a big plus for printed books is that you can write notes on the paper which keep the context of your notes. Seems like a trivial issue to solve in software, but have not yet seen anywhere near a universal and adequate solution.
0
0
2
@gytdau
Gytis Daujotas
2 months
@coffee_pot Great question! A previous paper serialized mazes using spans, but I'm not entirely sure why they did that. I found that serializing it with ASCII in a human interpretable way, just one character per tile, worked well enough for my purpose. Maybe it's not the best maze solver
0
0
2
@gytdau
Gytis Daujotas
6 months
The Socratic Method (2021, Farnsworth) is, nominally, a defense of and guide to a well-respected yet rarely-practiced technique. But it is also a capable defense of serious thinking and intellect. One of the better books of the genre - highly recommended.
0
0
2
@gytdau
Gytis Daujotas
10 months
@Sam__Enright @harsehajd I just moved, so we can be clueless together!!
0
0
2
@gytdau
Gytis Daujotas
4 years
If aliens came down to earth to design user interfaces they'd add interaction sound effects. All of our apps are unnecessarily quiet
0
1
2
@gytdau
Gytis Daujotas
3 years
Tweet media one
0
0
2
@gytdau
Gytis Daujotas
8 years
In the near future...
Tweet media one
0
0
2
@gytdau
Gytis Daujotas
8 years
@CaseyGames Hey, what's after happening with @kidscodegame ? I was excited to test it out 😄
1
0
2
@gytdau
Gytis Daujotas
8 years
Tweet media one
0
0
2
@gytdau
Gytis Daujotas
1 year
People can avoid fatal crashes in 99.999999% of miles driven -- really sets a high bar!
0
0
2
@gytdau
Gytis Daujotas
4 years
They should reduce capacity on public transport so that a whole bunch of people aren't standing huddled together, even if it means many will have to wait longer for another bus to arrive.
0
0
2
@gytdau
Gytis Daujotas
7 years
We're getting hints on how to win the competition through intense memery at @SWDub #SWDub
Tweet media one
0
0
2
@gytdau
Gytis Daujotas
4 years
@FusorFusion On further calculations, it's maybe only like two or three bags. Still, a substantial portion.
0
0
1
@gytdau
Gytis Daujotas
9 years
@Abhi__Khatri @travisneilson Damn it, I just signed up but I guess I won't be able to read it. Pretty disappointing. :P
1
0
1
@gytdau
Gytis Daujotas
8 years
Making random #nodejs garbage again. #wednesday
Tweet media one
0
0
1
@gytdau
Gytis Daujotas
2 months
@michaeljelly Agree, very Deutschian! In my blog post I explore how the “fallible” model learns to use the information locally to its position on the board to figure out if it just made a mistake or not. In a way, this form of error correction seems to approximate search, by trying stuff out
0
0
1
@gytdau
Gytis Daujotas
3 years
@nickcammarata @gwern Tried getting myself addicted to meditation by using nicotine at the same time, and had some but limited success - did using nicotine and working out help you?
1
0
1