![Evan Anders Profile](https://pbs.twimg.com/profile_images/1736806591809335296/axLMKQ6__x96.jpg)
Evan Anders
@evanhanders
Followers: 74
Following: 4K
Statuses: 29
AI Safety / Mech Interp postdoctoral scholar @KITPUCSB. Former astrophysical fluid dynamicist @Northwestern (CIERA) and @CUBoulder.
Santa Barbara, CA
Joined November 2015
RT @OpenAI: We're sharing progress toward understanding the neural activity of language models. We improved methods for training sparse aut…
RT @AnthropicAI: New Anthropic research paper: Scaling Monosemanticity. The first ever detailed look inside a leading large language model…
@aidanprattewart @AdamSJermyn @_clementneo @JasonObermaier But yeah, that (improvements based on ground-truth features) isn't general/scalable. My hope is that I/we can come up with some experiments to test updates to SAE training, see them get things "more right", then see what happens when they're ported to actual LMs. (3/3)
Nice post by @apartresearch (who are giving me mentorship during my skilling-up in AI safety!). The bar plot is 😬.
AI safety needs to scale urgently and @EsbenKC has good suggestions for commercial opportunities with a lot of public benefit potential: These are some of the things that we at @apartresearch like to help make happen. Check it out!