(寻找实习 // Seeking internship) 萨尔曼 // Salman

@ForBo7_

Followers: 91
Following: 298
Statuses: 639

• Studying BSc (Hons) AI & EdTech • Researching Embodied AI @ EdUHK • fastai student • 自学中文 // Self-learning Chinese • Sharing what I learn • Been to 15 countries

China
Joined September 2022
@ForBo7_
(寻找实习) 萨尔曼 // Salman
5 months
Doing lesson 15 of the @fastdotai course; deducing how to rearrange convolutions as a matrix product
[image attached]
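A quick illustrative sketch of the idea in that lesson (my addition, assuming NumPy; the function name conv2d_as_matmul and the toy inputs are mine, not from fast.ai): unroll every kernel-sized patch of the image into a row of a matrix (im2col), and the convolution becomes a single matrix product with the flattened kernel.

```python
import numpy as np

def conv2d_as_matmul(image, kernel):
    """Valid cross-correlation of a 2D image with a 2D kernel, via one matmul."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1

    # im2col: one flattened kh*kw patch per output position.
    patches = np.empty((oh * ow, kh * kw))
    for i in range(oh):
        for j in range(ow):
            patches[i * ow + j] = image[i:i + kh, j:j + kw].ravel()

    # The convolution is now a single matrix-vector product.
    return (patches @ kernel.ravel()).reshape(oh, ow)

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])

# Check against a direct sliding-window computation.
direct = np.array([[np.sum(image[i:i + 2, j:j + 2] * kernel) for j in range(4)]
                   for i in range(4)])
assert np.allclose(conv2d_as_matmul(image, kernel), direct)
```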
@ForBo7_
(寻找实习) 萨尔曼 // Salman
7 hours
Correction, I think: Whilst R1-Zero is pure RL, R1 is not: SFT is involved.
@ForBo7_
(寻找实习) 萨尔曼 // Salman
1 day
Reading through the R1 paper. From my understanding, in a nutshell:
- R1-Zero is pure RL, with GRPO as the policy-optimization algorithm
- R1 is pure RL with GRPO as well, but with some cold start data and further refinement stages
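For context on the GRPO mentioned above, here is a minimal sketch (my illustration, not code from the R1 paper or the tweet; grpo_advantages is a made-up name) of its core trick: sample a group of completions per prompt, score them, and use the group-standardized rewards as advantages, so no learned critic/value model is needed.

```python
import numpy as np

def grpo_advantages(group_rewards, eps=1e-8):
    """Group-relative advantages: standardize rewards within one prompt's group."""
    r = np.asarray(group_rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# e.g. four sampled answers to the same prompt, scored by a rule-based reward:
# above-mean answers get a positive advantage, below-mean answers a negative one.
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))
```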
@ForBo7_
(寻找实习) 萨尔曼 // Salman
1 day
@chris_j_paxton @ChongZitaZhang It does make sense when you really think about it
@ForBo7_
(寻找实习) 萨尔曼 // Salman
1 day
RT @ChongZitaZhang: Motions so efficient, no human reference needed
@ForBo7_
(寻找实习) 萨尔曼 // Salman
1 day
For generating novel output, that is
@ForBo7_
(寻找实习) 萨尔曼 // Salman
1 day
Has there been any work done in introducing some sort of entropy/variation/randomness in the process of generating an LLM output?
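Some context for the question above (my addition, not an answer from the thread): the most common way randomness/entropy enters LLM output generation today is sampling from a temperature-scaled next-token distribution rather than taking the greedy argmax. The helper name sample_next_token below is hypothetical.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Sample a token id from a temperature-scaled softmax over the logits."""
    rng = rng if rng is not None else np.random.default_rng()
    z = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    z -= z.max()                              # numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return rng.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.1]
print(sample_next_token(logits, temperature=0.7))  # lower temperature -> less randomness
```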
@ForBo7_
(寻找实习) 萨尔曼 // Salman
2 days
@abhi1thakur
- OpenRouter
- TogetherAI
- LambdaLabs
- FireworksAI
A few quick ones off the top of my head.
@ForBo7_
(寻找实习) 萨尔曼 // Salman
4 days
I'm thinking that perhaps it would be the case for novel/extrapolated tasks, rather than the opposite.
@ForBo7_
(寻找实习) 萨尔曼 // Salman
4 days
@jeremyphoward What ideas do you have in mind for alternative approaches?
@ForBo7_
(寻找实习) 萨尔曼 // Salman
5 days
RT @WevolverApp: Researchers at the University of Science and Technology of China have designed the octopus-inspired SpiRobs robotic arm. D…
@ForBo7_
(寻找实习) 萨尔曼 // Salman
7 days
RT @UnslothAI: You can now reproduce DeepSeek-R1's reasoning on your own local device! Experience the "Aha" moment with just 7GB VRAM. Un…
@ForBo7_
(寻找实习) 萨尔曼 // Salman
8 days
I think the best way to understand the 把 particle is the following example:
- 我放书在桌子上 - I put the book on the table
- 我把书放在桌子上。 - The book has been put on the table by me

Following from the previous example...
- 把书放桌子上 - The book is placed on the table
@ForBo7_
(寻找实习) 萨尔曼 // Salman
9 days
Putonghua abilities I need to improve on:
- Emphasizing the 4th tone
- Differentiating between the 2nd and 3rd tones
- Paying closer attention to tones in various audio
@ForBo7_
(寻找实习) 萨尔曼 // Salman
9 days
和 can be used as a drop-in replacement for 跟. The only difference is that in certain contexts, 跟 has a sequential connotation to it. You can't say 和我读, but you can say 跟我读.
@ForBo7_
(寻找实习) 萨尔曼 // Salman
10 days
Then on top of that, the benchmarks have very, very few tasks: like 10–20. It's not representative.
@ForBo7_
(寻找实习) 萨尔曼 // Salman
10 days
RT @iScienceLuvr: I have been thinking about what a DeepSeek R1 for medicine would look like... But this paper already kinda did it? plus code…