Zack Ankner @ZackAnkner profile

Zack Ankner

@ZackAnkner

Followers

1K

Following

985

Statuses

343

Senior @MIT. President of AI@MIT. Research Scientist Intern @DbrxMosaicAI.

Joined September 2019

Zack Ankner

@ZackAnkner

6 months

Excited to announce our new work: Critique-out-Loud (CLoud) reward models. CLoud reward models first produce a chain of thought critique of the input before predicting a scalar reward, allowing reward models to reason explicitly instead of implicitly!

13

57

249

Zack Ankner

@ZackAnkner

13 days

If I got aspartame poisoning, I wouldn’t tell anyone but there would be signs

Zack Ankner

@ZackAnkner

17 days

2 days, 18 cokes, 216 fluid ounces later, the second layer has been assembled.

0

14

Zack Ankner

@ZackAnkner

16 days

If a Turing machine writes to tape and no one is around to read the tape, is it really Turing complete

1

0

6

Zack Ankner

@ZackAnkner

19 days

Aspartame maxxing

0

1

22

Zack Ankner

@ZackAnkner

18 days

RT @rajammanabrolu: V cool to see that Kimi has taken and scaled our CLoud paper to do better reward modeling through extra inference time…

0

3

0

Zack Ankner

@ZackAnkner

18 days

If you're also sick of quiet reward models you can check out the code here:

0

2

Zack Ankner

@ZackAnkner

20 days

@code_star Line go up, line go down, doesn't matter. I'm sitting there staring.

0

3

Zack Ankner

@ZackAnkner

1 month

Are there any good writings on how people will purposefully misuse AGI? Something along the lines of the first thing a government that has AGI would do is use it against a foreign power. Trying to think about safety cases when models will purposely be used for misaligned tasks.

1

0

2

Zack Ankner

@ZackAnkner

1 month

Interesting question is how much is sequential prediction needed vs parallel prediction like Meta. Also some interesting schemes for very future token prediction, like predicting the average hidden states over some chunk of tokens very far in the future.

0

3

Zack Ankner

@ZackAnkner

2 months

@bilaltwovec Maybe the real imagination was the next tokens we predicted along the way

1

0

2

Zack Ankner

@ZackAnkner

2 months

I'm at NeurIPS so if you want to talk about reward models with language feedback, scalable oversight, control schemes, scaling laws for precision, or anything else send me a dm 😀 Should also add I'm on the job market next year

0

24

Zack Ankner

@ZackAnkner

2 months

RT @rajammanabrolu: I'm not at #NeurIPS2024 this week but my students and collaborators @isadorcw @ZackAnkner @seungonekim will all be ther…

0

3

0

Zack Ankner

@ZackAnkner

2 months

Gonna start nap maxxing. Aiming for > 1 RCPS (rem cycles per second)

0

8

Zack Ankner

@ZackAnkner

2 months

Cool to see someone extending CLoud to not requiring oracle critiques! It's awesome that the idea of critique-based reward models is spreading, and learning to self-generate the critiques is a really important step. Congrats Yue Yu, @magpie_rayhou, et al!

1

0

9

Zack Ankner

@ZackAnkner

3 months

RT @SkyLi0n: I am on the job market for industry and academic roles. My research focuses identifying, designing, and building efficient, s…

0

12

0