Rosmine @rosmine_b profile

Rosmine

@rosmine_b

Followers

1K

Following

2K

Statuses

697

Senior ML Scientist @ FAANG working on LLMs DM me your ML questions

Joined October 2023

Don't wanna be here? Send us removal request.

Rosmine

@rosmine_b

8 months

I'm starting a "research in public" project on solving 20 Questions Games with LLMs and have some cool stuff to share! 🧵

2

1

28

Rosmine

@rosmine_b

20 hours

Cool prompt idea: "Teach me something about <topic> that I don't already know. Something niche and obscure." But it turns out this is basically saying "hello llm, please hallucinate for me" Even when I asked it to cite a paper and include a quotation, it would make up a paper.

0

3

Rosmine

@rosmine_b

1 day

It's going to be so funny if we get AGI, but the only way to do it is test time compute scaling that makes it more expensive than humans

1

0

6

Rosmine

@rosmine_b

4 days

@Kylec1215 @TheXeophon It took 8 minutes

2

0

2

Rosmine

@rosmine_b

4 days

@Kylec1215 @TheXeophon Here you go!

1

0

Rosmine

@rosmine_b

4 days

@Kylec1215 @TheXeophon I like to use it for lit reviews and checking if projects are already done. Anything that requires searching through large amounts of information. Here's an example lit review I've also used it for health topics, e.g. research on the best way to exercise

5

0

Rosmine

@rosmine_b

5 days

@Kylec1215 @TheXeophon Sure!

1

0

1

Rosmine

@rosmine_b

5 days

@cccntu For comparison, curriculum learning (SFT with samples ordered by difficulty) also often leads to better generalization

0

7

Rosmine

@rosmine_b

5 days

@tensorqt Fun fact, the probability of that is 10^{-3.6x10^8}. It's so small you need exponents within the exponential notation. Anthropic really does have the mandate of heaven.

Nora Belrose

@norabelrose

8 days

What are the chances you'd get a fully functional language model by randomly guessing the weights? We crunched the numbers and here's the answer:

0

5

Rosmine

@rosmine_b

5 days

@iScienceLuvr From Karpathy himself: "It took me last ~6 weeks to get a from-scratch policy gradients implementation to work 50% of the time on a bunch of RL problems." source:

0

10

Rosmine

@rosmine_b

6 days

@OpenAI Feature request: In the sidebar, can we have a way to filter Deep Research chats vs. other chats? I want to return to Deep Research queries, but sometimes have trouble finding them in all the other chats

2

0

7

Rosmine

@rosmine_b

7 days

@nrehiew_ Just posted this in response to another rl post:

Rosmine

@rosmine_b

7 days

@andersonbcdefg RL is the the automation of reward hacking

0

1

Rosmine

@rosmine_b

7 days

@nrehiew_ accuracy weighted too low so the output is just <answer> random int </answer> ?

1

0

6

Rosmine

@rosmine_b

7 days

@andersonbcdefg RL is the the automation of reward hacking

0

Rosmine

@rosmine_b

7 days

@TheXeophon @andrew_n_carr The key to AGI is giving the models imposter syndrome 😂

0

1

Rosmine

@rosmine_b

7 days

@I_loves_deep_nn Unhappy people send hate because they want other people to be on their level, and it's easier to make other people sad than make themselves happier. Fight back by emitting positivity!

0

1

Rosmine

@rosmine_b

7 days

@speedymachine1 Yep, I'm in the US, most of the ones here require a prescription

1

0

1

Rosmine

@rosmine_b

8 days

@IlyasHairline @abacaj This paper got test time scaling just by adding "wait" repeatedly

1

0

46

Rosmine

@rosmine_b

8 days

@abacaj Basically every LLM reasoning result looks so obvious in retrospect. Prompt engineering, CoT, STaR, Self-Consistency. Every paper your read is "why didn't I think of that"

8

178

Rosmine

@rosmine_b

8 days

@Aizkmusic I wanted to try and bought a pro subscription. It's really useful. lmk if you have a query you want to try

0

1