Dong Zhang Profile
Dong Zhang

@dongzha35524835

Followers
521
Following
612
Statuses
64

MS Student at FudanNLP Lab @FudanUniv | Developing SpeechGPT-Series

Joined September 2022
@dongzha35524835
Dong Zhang
16 days
💥 Introducing SpeechGPT 2.0-preview: a GPT-4o-level, real-time spoken dialogue system! (Currently supports Chinese only; English support is coming soon.) 🎆 Highlights: real-time speech-to-speech dialogue with latency under 200ms; rich in emotion and diverse in style, with strong speech style generalization; strong role-playing capabilities. 🤖️ Try it out: Online system: Github: More demos:
6
29
134
@dongzha35524835
Dong Zhang
15 days
RT @Open_MOSS: 🥳 Introducing SpeechGPT 2.0-preview: A GPT-4o-level, real-time spoken dialogue system! (Only Chinese for now) 🎆 Highlights:…
0
10
0
@dongzha35524835
Dong Zhang
16 days
We introduce an ultra-low-bitrate streaming speech codec with semantic-acoustic joint modeling and a Codec-Patchify-based speech-text LLM architecture, which proved effective at reducing the modality gap between speech and text sequences. During our experiments, we also observed many interesting phenomena. For example, after extensive pre-training on speech-text alignment, we found that the ability to generalize across speech styles could "emerge" in the model. This includes controlling speech rate even without training on dialogue data with explicit speech rate adjustments, and adopting the tones and styles of characters that the model had never seen before.
0
0
14
@dongzha35524835
Dong Zhang
5 months
Happy to share that our SpeechAlign, which applies RLHF to speech LLM, has been accepted by #NeurIPS2024. Many voice agents have emerged recently, but almost none of them consider SpeechLLM post-training. Let's explore more in this direction! arXiv:
6
15
115
@dongzha35524835
Dong Zhang
5 months
@hingeloss @andersonbcdefg We conducted some analysis in our SpeechTokenizer and SpeechGPT-Gen papers. You can also refer to
0
0
1
@dongzha35524835
Dong Zhang
5 months
Thrilled to see Moshi draw inspiration from our SpeechTokenizer and SpeechGPT! 😀 Honored to contribute to advancing the spoken dialogue field! 😊 Check out more of our work on end-to-end spoken dialogue on
@kyutai_labs
kyutai
5 months
Today, we release several Moshi artifacts: a long technical report with all the details behind our model, weights for Moshi and its Mimi codec, along with streaming inference code in PyTorch, Rust and MLX. More details below 🧵 ⬇️ Paper: Repo: HuggingFace:
0
0
34
@dongzha35524835
Dong Zhang
6 months
Off to Bangkok to attend #ACL2024. Happy to chat w/ folks interested in end-to-end speech-to-speech dialogue chatbots.
0
0
26
@dongzha35524835
Dong Zhang
6 months
@jeon_haesung Hopefully in Sept. or Oct.
0
0
3
@dongzha35524835
Dong Zhang
6 months
@cs_lisheng Thanks! I'll not attend Interspeech, but I'll attend ACL in Thailand.
0
0
1
@dongzha35524835
Dong Zhang
7 months
Funny
0
0
0