![Shreyas Kapur Profile](https://pbs.twimg.com/profile_images/1717979753318420480/7GPxtAFS.jpg)
Shreyas Kapur
@shreyaskapur
2K Followers · 424 Following · 8 Media · 42 Statuses
PhD student @berkeley_ai. Previously undergrad at MIT.
Berkeley, CA
Joined June 2012
We managed to get part of our project running in the browser. Website 🌎: Paper 📄: Code 🖥️: Thanks to my wonderful collaborator @jenner_erik, and advisor Stuart Russell! n/n 🧵
I had a lot of fun working on this. I didn't believe that a chess-playing neural net could learn to do look-ahead just in its weights, so I was definitely the non-believer in this project.
♟️Do chess-playing neural nets rely purely on simple heuristics? Or do they implement algorithms involving *look-ahead* in a single forward pass? We find clear evidence of 2-turn look-ahead in a chess-playing network, using techniques from mechanistic interpretability! 🧵
@sdtoyer 😂 I'm glad you asked, Sam! We've been working on a modern, functional, and performant library for graphics and diagrams in Python called iceberg.
@anwesh_bh Yes, absolutely, that's a problem. The tree diffusion approach we propose lets us collect a very rich dataset of "edits" to train a model. Here you can click on "add noise".
@realmrfakename Yes! It can start from a randomly initialized program and edit its way to a target image, guided by search.
@chiaralalalah You should really check out @aaron_lou's excellent work on language modeling.
Announcing Score Entropy Discrete Diffusion (SEDD) w/ @chenlin_meng @StefanoErmon. SEDD challenges the autoregressive language paradigm, beating GPT-2 on perplexity and quality! Arxiv: Code: Blog: 🧵 1/n
@EmilevanKrieken In our current mutation scheme, the expression can get longer or shorter with roughly equal probability, so I'm not sure about the limiting distribution. Anecdotally, we noticed that if we noise the program some number of times, the results resemble just random programs.
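The mutation scheme described above can be sketched in code. This is a minimal toy sketch, not the paper's actual implementation: the list-of-rectangles program representation and the names `noise_step`, `apply_edit`, and `random_rect` are all hypothetical. It shows the two ideas from the replies above: each noise step grows or shrinks the program with equal probability, and every step records the reverse "edit" that undoes it, which is what makes the noised programs usable as training data.

```python
import random

# Toy "program": a list of rectangle tuples (x, y, w, h).
# Each noise step mutates the program AND returns the reverse edit
# that undoes the mutation — the (noised program, reverse edit) pair
# is the kind of training example the tree diffusion replies describe.

def random_rect(rng):
    """Sample a hypothetical random rectangle (x, y, w, h)."""
    return tuple(rng.randrange(0, 16) for _ in range(4))

def noise_step(program, rng):
    """Apply one random mutation; return (new_program, reverse_edit)."""
    if program and rng.random() < 0.5:
        # Shrink: delete a random element; reverse edit re-inserts it.
        i = rng.randrange(len(program))
        removed = program[i]
        return program[:i] + program[i + 1:], ("insert", i, removed)
    else:
        # Grow: insert a random element; reverse edit deletes it.
        i = rng.randrange(len(program) + 1)
        added = random_rect(rng)
        return program[:i] + [added] + program[i:], ("delete", i, None)

def apply_edit(program, edit):
    """Apply an edit of the form produced by noise_step."""
    op, i, payload = edit
    if op == "insert":
        return program[:i] + [payload] + program[i:]
    return program[:i] + program[i + 1:]
```

Because every noise step is paired with its inverse, applying the reverse edit to the noised program recovers the original exactly, and repeated noising drifts the program toward a random one, matching the anecdote above.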
@dxwu_ More parameters is the new norm in DL; it's so cool to see theory that pins this down!
@EmilevanKrieken I think it has a lot of synergies with GFlowNets (which we mention in the paper), and one of our baseline methods (REPL Flow) is essentially Ellis et al. reimagined as a GFlowNet.
@hackpert Considering the studies comparing convnets with brains, I think convnets do emulate a very small but crucial part of the brain.
@InglfurAri As mentioned, the DSLs used are small. The x, y, w, h values snap to a pretty coarse grid. We also limit the max number of objects that can be placed.