![Ning Ding Profile](https://pbs.twimg.com/profile_images/1864897938482434048/DtdFRbQD_x96.jpg)

**Ning Ding** (@stingning) · 1K Followers · 660 Following · 221 Statuses
RT @JiaLi52524397: 🚀 NuminaMath 1.5 is here! 🚀 900k+ high-quality competition math problems with CoT solutions, new problem metadata, manua…
RT @yang_zonghan: Timely reminder for me to be grateful of how fortunate I am to have started my research journey from NLP, and what an hon…
RT @lifan__yuan: 1/ PRIME is alive on arXiv💡! Building on our blog, we've added extensive experiments exploring: - Implicit PRM design ch…
RT @stingning: 📜 We are releasing the PRIME paper: Let's be clear at first, dense rewards are not dead. And they…
RT @PhysInHistory: Euler's identity combines these five numbers in a simple and elegant equation. Despite each of the constants representin…
RT @lindsayttsq: We improve clinical relevance through ⭐️Medical specialty coverage: MedXpertQA includes questions from 20+ exams of medica…
"Reasoning" encompasses much more than just mathematics and coding.
📈 How far are leading models from mastering realistic medical tasks? MedXpertQA, our new text & multimodal medical benchmark, reveals existing gaps in model abilities. Compared with rapidly saturating benchmarks like MedQA, we raise the bar with harder questions and a sharper focus on medical reasoning.

📌 Percentage scores on our Text subset:
- o3-mini: 37.30
- R1: 37.76 - the clear frontrunner among open-source models
- o1: 44.67 - highest performance, but still much room for improvement!

Preprint: Data files will be released shortly at: Key insights in 🧵
RT @lindsayttsq: 📈How far are leading models from mastering realistic medical tasks? MedXpertQA, our new text & multimodal medical benchmar…
In 2023, I noticed that DeepSeek posted job ads recruiting a group of "Data Virtuosos." The requirements said they were looking for people with broad knowledge, proficient in literature, history, culture, science, anime, film, and more, quick-witted and full of imagination, to help DeepSeek build its own data moat. The recruitment targeted people in mainland China. I don't know how this recruitment ultimately turned out, but it likely played a part. Incidentally, I also think DeepSeek's general reasoning ability in English is excellent.