Nils Eckstein

@BobQubit

Followers: 351
Following: 3K
Statuses: 404

AI, Art & Immortality. ML @ https://t.co/4uDpBT0kue, HHMI Janelia | Physics @ ETHZ.

Zürich
Joined November 2013
@BobQubit
Nils Eckstein
5 hours
It seems to me that there just isn't a lot of incentive for compression in LLMs (sure, you have some low-norm regularizers, but it's not clear to me how that translates to representation compression / MDL of the content after decoding), which seems required for, or equivalent to, finding novel abstractions. Model scale is actually counterproductive here (assuming the classical transformer architecture). This is also clear from the many papers showing failures/inefficiencies in the algorithms transformers learn from raw data (e.g., the edge-of-chaos paper). Many directions here are under-explored imo.
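A minimal sketch of the distinction above, assuming a PyTorch-style training loss; the function and tensor names are hypothetical, and the activation-sparsity term is only a crude stand-in for representation compression/MDL, not a method described in the tweet:

```python
import torch

def regularized_loss(model, hidden_states, task_loss,
                     weight_decay=1e-2, repr_coef=1e-4):
    # Standard low-norm regularizer: penalizes parameter norms,
    # but says nothing about how compressed the representations are.
    param_penalty = sum(p.pow(2).sum() for p in model.parameters())
    # Hypothetical representation-level penalty: L1 sparsity of hidden
    # activations as a rough proxy for compressing the decoded content.
    repr_penalty = hidden_states.abs().mean()
    return task_loss + weight_decay * param_penalty + repr_coef * repr_penalty
```

The first term is what standard training setups typically include; anything like the second is rarely part of ordinary pretraining objectives.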
@dwarkesh_sp
Dwarkesh Patel
1 year
I still haven't heard a good answer to this question, on or off the podcast. AI researchers often tell me, "Don't worry about it, scale solves this." But what is the rebuttal to someone who argues that this indicates a fundamental limitation?
[image]
@BobQubit
Nils Eckstein
2 days
RT @HHMIJanelia: 📢 Submissions are now open for the CellMap Segmentation Challenge. Build the best method for segmenting cellular organel…
@BobQubit
Nils Eckstein
3 days
RT @JohanWinn: 🚀 We are looking for Image Data Scientists to join our mission to advance connectomics towards whole brain scale for humans…
@BobQubit
Nils Eckstein
4 days
@cppape Good stuff, congrats!
@BobQubit
Nils Eckstein
7 days
If anyone had bothered to read the extensive prior work on unsupervised disentangled representation learning, this crime could have been avoided.
@kzSlider
KZ is in London
8 days
Damn, triple-homicide in one day. SAEs really taking a beating recently
[image]
@BobQubit
Nils Eckstein
11 days
@doomslide Vision still doesn’t actually work & we are lacking strong base models that can do efficient RL for everything but math and coding.
@BobQubit
Nils Eckstein
13 days
@doomslide RL scaling is pure compute though. Not even data-constrained anymore; feels like this isn't factored in properly.
@BobQubit
Nils Eckstein
17 days
RT @Miles_Brundage: Stargate + related efforts could help the US stay ahead of China, but China will still have their own superintelligence…
@BobQubit
Nils Eckstein
17 days
@VictorTaelin In case it helps, my unfiltered thoughts were: amazing stuff, but I can't integrate it into my current research without significant work, so the risk/reward ratio is off. I'd have to choose between replicating R1 and playing around with this. Gotta de-risk it imo, too much going on.
@BobQubit
Nils Eckstein
19 days
RT @Dorialexander: My main takeaway from the DeepSeek paper is not scientific but organizational: we need a European industrial plan in AI…
@BobQubit
Nils Eckstein
1 month
@somewheresy Sounds similar to the general problem of not diverging too far from the base distribution, which post-training (e.g., RLHF) addresses by staying close to the base model. So you could try regularizing against a reference from a base output/model.
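A minimal sketch of that kind of reference regularizer, assuming next-token logits from a tuned model and a frozen base model (PyTorch-style; the names and the coefficient are hypothetical, not taken from the reply):

```python
import torch.nn.functional as F

def kl_to_base(policy_logits, base_logits, beta=0.1):
    # Penalize divergence of the tuned model's next-token distribution
    # from the frozen base model's, as in RLHF-style post-training.
    policy_logp = F.log_softmax(policy_logits, dim=-1)
    base_logp = F.log_softmax(base_logits, dim=-1)
    kl = (policy_logp.exp() * (policy_logp - base_logp)).sum(dim=-1)
    return beta * kl.mean()  # added to the main training loss
```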
@BobQubit
Nils Eckstein
2 months
@JustinLin610 A QwQ paper? 🙏
@BobQubit
Nils Eckstein
2 months
@GarrettPetersen Would recommend at least 1B parameters for realistic wailing.
@BobQubit
Nils Eckstein
2 months
@cloneofsimo Is this really a good way to look at this? Solving a trivial QA in token embedding space also gets you ~zero human accuracy.
@BobQubit
Nils Eckstein
2 months
"as we know it" carries all the weight here.
@_jasonwei
Jason Wei
2 months
Y'all heard it from the man himself
[image]
@BobQubit
Nils Eckstein
2 months
@far__el Some don’t, e.g. those who recycle this take in perpetuity.
@BobQubit
Nils Eckstein
2 months
@CRSegerie WDYM? Have you used Llama 3? There is no shot that model can do any of the things you are afraid of. This kind of fear-mongering with zero technical backup is pretty destructive, not good.
@BobQubit
Nils Eckstein
2 months
@signulll The inverse may actually be true. Long-term strategic thinking and managing large groups of agents that have short-term planning capabilities is plausibly the highest-impact position in the new world.