Zack Ankner Profile
Zack Ankner

@ZackAnkner

Followers
1K
Following
985
Statuses
343

Senior @MIT. President of AI@MIT. Research Scientist Intern @DbrxMosaicAI.

Joined September 2019
@ZackAnkner
Zack Ankner
6 months
Excited to announce our new work: Critique-out-Loud (CLoud) reward models. CLoud reward models first produce a chain of thought critique of the input before predicting a scalar reward, allowing reward models to reason explicitly instead of implicitly!
13
57
249
@ZackAnkner
Zack Ankner
13 days
If I got aspartame poisoning, I wouldn’t tell anyone but there would be signs
@ZackAnkner
Zack Ankner
17 days
2 days, 18 cokes, 216 fluid ounces later, the second layer has been assembled.
0
0
14
@ZackAnkner
Zack Ankner
16 days
If a Turing machine writes to tape and no one is around to read the tape, is it really Turing complete?
1
0
6
@ZackAnkner
Zack Ankner
19 days
Aspartame maxxing
0
1
22
@ZackAnkner
Zack Ankner
18 days
RT @rajammanabrolu: V cool to see that Kimi has taken and scaled our CLoud paper to do better reward modeling through extra inference time…
0
3
0
@ZackAnkner
Zack Ankner
18 days
If you're also sick of quiet reward models, you can check out the code here:
0
0
2
@ZackAnkner
Zack Ankner
19 days
Aspartame maxxing
0
0
13
@ZackAnkner
Zack Ankner
20 days
@code_star Line go up, line go down, doesn't matter. I'm sitting there staring.
0
0
3
@ZackAnkner
Zack Ankner
1 month
Are there any good writings on how people will purposefully misuse AGI? Something along the lines of: the first thing a government that has AGI would do is use it against a foreign power. I'm trying to think about safety cases when models will purposely be used for misaligned tasks.
1
0
2
@ZackAnkner
Zack Ankner
1 month
An interesting question is how much sequential prediction is needed vs. parallel prediction like Meta. There are also some interesting schemes for very-far-future token prediction, like predicting the average hidden states over some chunk of tokens very far in the future.
0
0
3
@ZackAnkner
Zack Ankner
2 months
@bilaltwovec Maybe the real imagination was the next tokens we predicted along the way
1
0
2
@ZackAnkner
Zack Ankner
2 months
I'm at NeurIPS, so if you want to talk about reward models with language feedback, scalable oversight, control schemes, scaling laws for precision, or anything else, send me a DM 😀 I should also add that I'm on the job market next year.
0
0
24
@ZackAnkner
Zack Ankner
2 months
RT @rajammanabrolu: I'm not at #NeurIPS2024 this week but my students and collaborators @isadorcw @ZackAnkner @seungonekim will all be ther…
0
3
0
@ZackAnkner
Zack Ankner
2 months
Gonna start nap maxxing. Aiming for > 1 RCPS (REM cycles per second)
0
0
8
@ZackAnkner
Zack Ankner
2 months
Cool to see someone extending CLoud to not require oracle critiques! It's awesome that the idea of critique-based reward models is spreading, and learning to self-generate the critiques is a really important step. Congrats Yue Yu, @magpie_rayhou, et al.!
1
0
9
@ZackAnkner
Zack Ankner
3 months
RT @SkyLi0n: I am on the job market for industry and academic roles. My research focuses identifying, designing, and building efficient, s…
0
12
0