Shichao Song
@Ki_Seki_here
Followers
91
Following
619
Statuses
231
Focused on LLMs | CS PhD Student @ RUC | Research Intern @ IAAR | Volunteer @ AI TIME
Beijing, China
Joined August 2021
Why is inference-time scaling, as @OpenAI o1 demonstrates, so crucial? LLMs learn world knowledge during training, but naive prompting reduces them to high-level QA databases that lose consistency with that learned knowledge. We need models to incorporate Self-Feedback, like o1! Let's dive in! 1/11
2
17
66
It's crazy, everyone, and it's even during the Spring Festival.
The burst of DeepSeek V3 has drawn the whole AI community's attention to large-scale MoE models. Concurrently, we have been building Qwen2.5-Max, a large MoE LLM pretrained on massive data and post-trained with curated SFT and RLHF recipes. It achieves competitive performance against top-tier models and outperforms DeepSeek V3 on benchmarks such as Arena-Hard, LiveBench, LiveCodeBench, and GPQA-Diamond.
📖 Blog:
💬 Qwen Chat: (choose Qwen2.5-Max as the model)
⚙️ API: (check the code snippet in the blog)
💻 HF Demo:
Going forward, we will not only continue scaling pretraining but also invest in scaling RL. We hope Qwen will be able to explore the unknown in the near future! 🔥
💗 Thank you for your support over the past year. See you next year!
0
0
0
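The Qwen2.5-Max announcement above points to an API code snippet in its blog post. As a rough, unofficial sketch of what calling the model through an OpenAI-compatible endpoint might look like, the base URL, API key name, and model identifier below are assumptions and should be replaced with the values given in the official blog.

# Minimal sketch, not the official snippet from the Qwen blog.
# Assumed: an OpenAI-compatible endpoint and a DashScope-style API key.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",  # assumption: key issued by the provider named in the blog
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # assumed endpoint
)

response = client.chat.completions.create(
    model="qwen-max-2025-01-25",  # assumed identifier for Qwen2.5-Max
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Briefly introduce Qwen2.5-Max."},
    ],
)
print(response.choices[0].message.content)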
RT @RongshengWang: Recommended Interesting Work: "HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs" Paper link:
0
1
0
RT @AdinaYakup: YuLan-Mini 💐 a 2.4B model delivering good performance with just 1.08T tokens by @RenminUniv And they shared all the traini…
0
36
0
Thank you to all my colleagues. @aakas888 @Hanyu_Wang419 @zhgyqc_duguce @fan2goa1 @RucDany @immazzystar 🤗
0
0
1
RT @gm8xx8: YuLan-Mini: A 2.42B-parameter model that punches above its weight. > Data Pipelines: Combines data cleaning and scheduling for…
0
16
0
RT @shelwin_: 🚀I'm releasing Jules, a proof of concept, open-source AI LaTeX Editor. Jules comes with cursor-like ⌘ K for AI Edits and La…
0
25
0