Zengyi Qin

@qinzytech

3K Followers · 164 Following · 103 Posts

MIT PhD @MIT | Hardcore GenAI Researcher | MyShell | Homepage: https://t.co/bwtUBzigZD

Boston, MA, USA
Joined December 2023
Zengyi Qin (@qinzytech) · 10 months
Training LLMs can be much cheaper than previously thought: 0.1 million USD is sufficient for training LLaMA2-level LLMs 🤯 While @OpenAI and @Meta spend billions of dollars training theirs, you can train yours with much less money. Introducing our open-source project JetMoE. A thread 🧵
[attached image]
53 replies · 170 reposts · 897 likes
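For readers who want to try the released model: a minimal loading sketch, assuming the checkpoint is published on the Hugging Face hub under the id "jetmoe/jetmoe-8b" and works with the standard transformers generation API (both are assumptions; check the JetMoE project page for the exact checkpoint name).

```python
# Hedged sketch: load JetMoE with Hugging Face transformers and generate text.
# "jetmoe/jetmoe-8b" is the assumed hub id for the released checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jetmoe/jetmoe-8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Training LLMs can be much cheaper than", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```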
Zengyi Qin (@qinzytech) · 10 days
@krishnakaasyap The source is Huawei employees. BTW, in terms of FLOPS they have already caught up, but the communication is still a little behind NVIDIA
1 reply · 0 reposts · 1 like
Zengyi Qin (@qinzytech) · 17 days
@simone_m_romeo Yes, after all this is a research preview
1 reply · 0 reposts · 8 likes
Zengyi Qin (@qinzytech) · 18 days
@TsingYoga And the DPO version's weights seem broken. It outputs random words
0 replies · 0 reposts · 1 like
Zengyi Qin (@qinzytech) · 18 days
@TsingYoga I see. That makes sense
0 replies · 0 reposts · 0 likes
Zengyi Qin (@qinzytech) · 24 days
@JoJrobotics @luisbrasroque They can reason, but they don't generalize well enough
1 reply · 0 reposts · 0 likes
Zengyi Qin (@qinzytech) · 24 days
@Caffeinix_alche The tasks won't be too long, but they are sufficient to give o1 a score of 0
2 replies · 0 reposts · 64 likes
Zengyi Qin (@qinzytech) · 24 days
@srivatsamath We will release and open-source a model that significantly outperforms o1 on computer-use agent tasks, and release the benchmark at the same time. Stay tuned
4 replies · 2 reposts · 135 likes
Zengyi Qin (@qinzytech) · 24 days
@gauranshsoni Also almost 0%, because their pre-training data does not contain enough long-horizon, interactive computer-use decision-making data
1 reply · 1 repost · 49 likes
Zengyi Qin (@qinzytech) · 29 days
RT @tom_doerr: MeloTTS: A text-to-speech library supporting English, Spanish, French, Chinese, Japanese, and Korean, with various accents an…
0 replies · 78 reposts · 0 likes
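For anyone curious how the library is driven: a minimal usage sketch following the pattern shown in the MeloTTS README; the TTS(language=..., device=...) signature and the 'EN-US' speaker key are taken from that README and should be double-checked against the release you install.

```python
# Hedged sketch of English synthesis with MeloTTS, per the project README.
from melo.api import TTS

model = TTS(language='EN', device='auto')  # 'auto' uses CUDA when available
speaker_ids = model.hps.data.spk2id        # maps accent names to speaker ids

model.tts_to_file(
    "Text-to-speech in six languages, with various accents.",
    speaker_ids['EN-US'],                  # assumed key; other accents exist
    'en-us.wav',
    speed=1.0,
)
```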
Zengyi Qin (@qinzytech) · 1 month
0 replies · 0 reposts · 2 likes
Zengyi Qin (@qinzytech) · 1 month
@davidbau Consider this one, which democratizes large model training and makes it accessible to many research labs. Website: Paper:
0 replies · 0 reposts · 3 likes
Zengyi Qin (@qinzytech) · 1 month
Many people think @xai's 100K-GPU cluster is no longer necessary given @deepseek_ai's success with only 2K GPUs. That is not true. Compute is always limited: with 100K GPUs you can run a lot of LARGE experiments very QUICKLY, and iterate on the model very fast.
1 reply · 0 reposts · 18 likes
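A back-of-envelope sketch of that point: at a fixed FLOP budget per experiment, ideal wall-clock time scales inversely with GPU count, so a 100K-GPU cluster turns a weeks-long experiment into hours. Every number below is an illustrative assumption, not a real training figure.

```python
# Illustrative arithmetic only: cluster size vs. iteration speed at a fixed
# compute budget per experiment. Every figure here is an assumption.
GPU_FLOPS = 1e15        # ~1 PFLOP/s per GPU at low precision (rough)
UTILIZATION = 0.4       # assumed model-FLOPs utilization
EXP_FLOPS = 1e24        # assumed cost of one large training experiment

def days_per_experiment(num_gpus: int) -> float:
    """Ideal wall-clock days per experiment, ignoring communication overhead."""
    return EXP_FLOPS / (num_gpus * GPU_FLOPS * UTILIZATION) / 86400

for gpus in (2_000, 100_000):
    print(f"{gpus:>7} GPUs -> {days_per_experiment(gpus):5.1f} days/experiment")
# 2K GPUs: ~14.5 days; 100K GPUs: ~0.3 days, i.e. a ~50x faster iteration loop.
```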
Zengyi Qin (@qinzytech) · 1 month
@Alibaba_Qwen Will definitely grab a coffee with you when I’m back in Hangzhou!
0 replies · 0 reposts · 3 likes
Zengyi Qin (@qinzytech) · 1 month
BTW, here is a comparison between DeepSeekMoE and JetMoE at a similar parameter count. See Table 3 in this screenshot
[screenshot: Table 3, DeepSeekMoE vs. JetMoE comparison]
0 replies · 0 reposts · 4 likes
Zengyi Qin (@qinzytech) · 1 month
@ZiqiPang Neither one. We should instead train an agentic one - it should do some bold/risky stuff that big companies like OpenAI won't release due to safety issues
3 replies · 0 reposts · 15 likes