![neuronpedia Profile](https://pbs.twimg.com/profile_images/1686887330987245577/-w1ftuZ3_x96.png)
neuronpedia
@neuronpedia
Followers
377
Following
13
Statuses
14
e/interpretability 🧠 🧠
sparse autoencoders
Joined July 2023
🧠 Sparse Autoencoders (SAEs) are a popular way of discovering what an AI model knows. But how do you measure how "good" an SAE is? Introducing SAEBench and an interactive explorer: The SAEBench project by @a_karvonen and @can_rager is a suite of evals that compares SAEs across classes, widths, sparsities, and more. Go check it out!
0
2
10
RT @a_karvonen: Our suite enables researchers to rigorously evaluate SAEs across multiple dimensions. We discuss a few below. For more det…
0
2
0
RT @a_karvonen: Sparse Autoencoders (SAEs) are popular, with 10+ new approaches proposed in the last year. How do we know if we are making…
0
22
0
check out our awesome feature in @techreview, then go talk to cat gemma:
0
1
5
RT @JBloomAus: 0/8 I’m super excited about work done by my LASR scholars @chanindav, @TomasDulka, @hrdkbhatnagar and James Wilken-Smith. Th…
0
12
0
RT @lieberum_t: Extremely excited to finally get this into people's hands! Huge achievement by the whole mechinterp team @GoogleDeepMind!…
0
3
0
RT @NeelNanda5: In our Gemma Scope release of open Sparse Autoencoders, I LOVED the interactive demo SAEs are like a microscope, breaking…
0
11
0
@YeshuaGod22 @NeelNanda5 thanks for reporting these. looks like you were searching RES-JB in GPT2-Small, not Gemma Scope features. if you click into a feature, it will tell you the LLM that generated that explanation. eg the first feature in the first screenshot was gpt-3.5-turbo.
1
0
2
steering ai is an imperfect art. that's what makes it fun.
Gemma Scope allows us to study how features evolve throughout the model and interact to create more complex ones. Want to learn more? Here’s an interactive demo made by @neuronpedia - no coding necessary
0
2
8
RT @johnnylin: exciting new research from @apolloaisafety and @jordantensor: E2E SAEs (w/ ~700k features) are now live on @neuronpedia - th…
0
3
0
RT @johnnylin: Terrific work by @saprmarks and team! 🥳 We really enjoyed working with them to get their Sparse Autoencoders onto @neuronped…
0
1
0
RT @johnnylin: 1/ Introducing Neuronpedia: an open platform for interpretability research with hosting, visualizations, and tooling for Spa…
0
29
0