Dong Zhang @dongzha35524835 profile

Dong Zhang

@dongzha35524835

Followers

521

Following

612

Statuses

64

MS Student at FudanNLP Lab @FudanUniv | Developing SpeechGPT-Series

Joined September 2022

Dong Zhang

@dongzha35524835

16 days

💥 Introducing SpeechGPT 2.0-preview: A GPT-4o-level, real-time spoken dialogue system! (Currently supporting Chinese only; English is coming soon.) 🎆 Highlights: real-time speech-to-speech dialogue with latency under 200 ms; rich in emotion and diverse in style, with strong speech style generalization; strong role-playing capabilities. 🤖️ Try it out: Online system: GitHub: More demos:

6

29

134

Dong Zhang

@dongzha35524835

15 days

RT @Open_MOSS: 🥳 Introducing SpeechGPT 2.0-preview: A GPT-4o-level, real-time spoken dialogue system! (Only Chinese for now) 🎆 Highlights:…

0

10

0

Dong Zhang

@dongzha35524835

16 days

We introduce an ultra-low-bitrate streaming speech codec with joint semantic-acoustic modeling and a Codec-Patchify based speech-text LLM architecture, which proved effective at reducing the modality gap between speech and text sequences. During our experiments, we also observed many interesting phenomena. For example, after extensive pre-training on speech-text alignment, we found that the ability to generalize across speech styles could "emerge" in the model. This includes controlling speech rate even without training on dialogue data with explicit speech rate adjustments, and adopting the tones and styles of characters that the model had never seen before.

0

14

Dong Zhang

@dongzha35524835

5 months

Happy to share that our SpeechAlign, which applies RLHF to speech LLMs, has been accepted by #NeurIPS2024. Many voice agents have emerged recently, but almost none of them consider SpeechLLM post-training. Let’s explore more in this direction! arXiv:

6

15

115

Dong Zhang

@dongzha35524835

5 months

@hingeloss @andersonbcdefg We conducted some analysis in our SpeechTokenizer and SpeechGPT-Gen papers. You can also refer to

0

1

Dong Zhang

@dongzha35524835

5 months

Thrilled to see Moshi draw inspiration from our SpeechTokenizer and SpeechGPT!😀 Honored to contribute to advancing the spoken dialogue field!😊 Check out more of our work on end2end spoken dialogue on

kyutai

@kyutai_labs

5 months

Today, we release several Moshi artifacts: a long technical report with all the details behind our model, weights for Moshi and its Mimi codec, along with streaming inference code in Pytorch, Rust and MLX. More details below 🧵 ⬇️ Paper: Repo: HuggingFace:

0

34

Dong Zhang

@dongzha35524835

6 months

Off to Bangkok to attend #ACL2024. Glad to chat w/ folks interested in end2end speech2speech dialogue chatbots.

0

26

Dong Zhang

@dongzha35524835

6 months

@jeon_haesung Hopefully in Sept. or Oct.

0

3

Dong Zhang

@dongzha35524835

6 months

@cs_lisheng Thanks! I won’t be attending Interspeech, but I’ll be at ACL in Thailand.

0

1

Dong Zhang

@dongzha35524835

7 months

Funny

0