Haohe Liu

@LiuHaohe

Followers: 2K · Following: 1K · Statuses: 234

PhD student at @cvssp_research, @UniOfSurrey, UK, working with @markplumbley. Ex-intern at @Meta @MicrosoftASIA & @BytedanceTalk. https://t.co/qmZe2lvGHX

Joined October 2021
@LiuHaohe
Haohe Liu
2 years
AudioLDM (ICML 2023) has officially joined the 🧨 Diffusers library! With up to 5x faster speeds and 3 different model sizes. Huge thanks to @sanchitgandhi99 for making it happen! 🥳 HF space has also gotten a significant speed boost! Get started now!
7
36
237
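For anyone who wants to try the Diffusers integration mentioned above, generation follows the library's standard text-to-audio pattern. A minimal sketch, assuming the "cvssp/audioldm-s-full-v2" checkpoint (the other model sizes use different IDs) and a CUDA device:

```python
# Minimal sketch of AudioLDM in 🧨 Diffusers; checkpoint ID and step count
# follow the library's documented example and may differ for other sizes.
import torch
import scipy.io.wavfile
from diffusers import AudioLDMPipeline

pipe = AudioLDMPipeline.from_pretrained(
    "cvssp/audioldm-s-full-v2", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

prompt = "Techno music with a strong, upbeat tempo and high melodic riffs"
audio = pipe(prompt, num_inference_steps=10, audio_length_in_s=5.0).audios[0]

# AudioLDM generates mono audio at 16 kHz.
scipy.io.wavfile.write("techno.wav", rate=16000, data=audio)
```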
@LiuHaohe
Haohe Liu
5 days
RT @Tu7uruu: Pushing Llasa to the Italian world! 🍕 Llasagna v0.1 1b is an Italian text-to-speech model based on Llasa-1b! Not perfect but still wor…
0
3
0
@LiuHaohe
Haohe Liu
5 days
RT @ArxivSound: ``Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis,'' Zhen Ye, Xinfa Zhu, Chi-Min Chan…
0
8
0
@LiuHaohe
Haohe Liu
5 days
RT @_akhaliq: Llasa Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis
0
22
0
@LiuHaohe
Haohe Liu
14 days
RT @LiuXub: Excited to share that our paper 'Separate Anything You Describe' has been published in IEEE Transactions on Audio, Speech, and…
0
5
0
@LiuHaohe
Haohe Liu
15 days
RT @reach_vb: HOLY SHITT: Open Suno is here! You can generate full songs with a 7B parameter model! 🔥 You can decide on background musics,…
0
131
0
@LiuHaohe
Haohe Liu
15 days
@m4zas24 @reach_vb Hi @m4zas24, everything is available here. It is a model dedicated to music composition.
0
0
0
@LiuHaohe
Haohe Liu
15 days
One-step diffusion-based audio super-resolution! FlashSR would make scaling up much easier. Excited to see this as a great follow-up to AudioSR.
@osalooloo
Jaekwon Im
16 days
🌟 Excited to announce the release of the code and model weights for FlashSR: One-step Versatile Audio Super-resolution via Diffusion Distillation, accepted at ICASSP 2025! 🎉 🔗 Check out the demo, code, and paper here:
0
2
9
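The point of the FlashSR endorsement above is that a distillation-trained student collapses the iterative diffusion sampler into a single forward pass, which is what makes scaling cheap. A toy illustration of that difference (dummy model and update rule, not the FlashSR code or API):

```python
# Toy contrast between multi-step diffusion sampling and a one-step
# (distilled) call. The "denoiser" is a stand-in nn.Module, NOT FlashSR.
import torch
import torch.nn as nn

class DummyDenoiser(nn.Module):
    """Stand-in for a diffusion denoiser over audio latents."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Linear(dim + 1, dim)

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Condition on the timestep by simple concatenation.
        t_feat = t.expand(x.shape[0], 1)
        return self.net(torch.cat([x, t_feat], dim=-1))

model = DummyDenoiser()
x = torch.randn(1, 64)  # noisy low-resolution latent

# Teacher-style sampling: many iterative refinement steps.
steps = 50
x_multi = x.clone()
for i in reversed(range(steps)):
    t = torch.tensor([[i / steps]])
    x_multi = x_multi - model(x_multi, t) / steps  # crude Euler-style update

# Distilled student: the same mapping approximated in one forward pass.
x_single = model(x, torch.tensor([[1.0]]))

print(x_multi.shape, x_single.shape)  # 50 network calls vs. 1
```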
@LiuHaohe
Haohe Liu
18 days
RT @reach_vb: HOLY SHITT! Llasa TTS - Llama 3.2 fine-tune with ultra realistic audio 🔥 > supports voice cloning in English + Chinese > tr…
0
285
0
@LiuHaohe
Haohe Liu
23 days
RT @QiuqiangK: A codebase for learning TTS/music_generation with LLM in 1 hour:
0
5
0
@LiuHaohe
Haohe Liu
2 months
RT @LiuXub: Code and model weights have been released! 🔥 GitHub:
0
5
0
@LiuHaohe
Haohe Liu
2 months
New 6M Audio-Caption Paired Dataset! Great work led by @JishengBai
@JishengBai
白吉生
2 months
Excited to share our latest work: AudioSetCaps — the largest audio-caption dataset to date, with 6 million audio-caption pairs (fully open-sourced)! 🔗 ArXiv Paper: 🔗 GitHub: 🔗 HF:
0
0
3
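If the dataset is hosted on the Hugging Face Hub as the quoted tweet suggests, browsing the pairs should look roughly like the sketch below; the repository ID and field names are assumptions (the tweet's links are elided), not confirmed details.

```python
# Sketch of streaming a large audio-caption dataset from the HF Hub.
# "JishengBai/AudioSetCaps" is a placeholder repository ID, not verified.
from datasets import load_dataset

ds = load_dataset("JishengBai/AudioSetCaps", split="train", streaming=True)

# Each record is expected to pair an audio clip with a machine-generated caption.
for example in ds.take(3):
    print(example.keys())
```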
@LiuHaohe
Haohe Liu
2 months
RT @JishengBai: Excited to share our latest work: AudioSetCaps — the largest audio-caption dataset to date, with 6 million audio-caption pa…
0
2
0
@LiuHaohe
Haohe Liu
2 months
RT @JishengBai: This paper is an extension of our NeurIPS 2024 Audio Imagination Workshop paper (, where we discuss…
0
2
0
@LiuHaohe
Haohe Liu
3 months
The SemantiCodec paper has been accepted by the IEEE Journal of Selected Topics in Signal Processing. Huge thanks to all the contributors and reviewers!
@LiuHaohe
Haohe Liu
10 months
Excited to introduce SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound. 🎉 SemantiCodec (50 tokens/second or 0.71 kbps) ≈ previous methods (200 tokens/second or 2.0 kbps). 🎉 Our study also reveals that SemantiCodec tokens hold richer semantic information.
0
1
21
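For context on the token-rate numbers quoted above, encoding and decoding with SemantiCodec follows a simple encode/decode interface. A minimal sketch in the style of the SemantiCodec inference repository; the constructor arguments and the returned waveform shape are assumptions based on that README and may differ from the released API.

```python
# Minimal SemantiCodec usage sketch (parameter names are assumptions).
import soundfile as sf
from semanticodec import SemantiCodec

# ~50 tokens/s corresponds to the 0.71 kbps setting mentioned in the tweet;
# the overall bitrate is roughly tokens-per-second times bits-per-token.
codec = SemantiCodec(token_rate=50, semantic_vocab_size=16384)

tokens = codec.encode("input.wav")   # waveform -> discrete tokens
waveform = codec.decode(tokens)      # tokens -> reconstructed waveform

sf.write("reconstruction.wav", waveform[0, 0], 16000)
```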
@LiuHaohe
Haohe Liu
4 months
Had an awesome day at Télécom Paris @tp_adasp! Great discussions with many amazing researchers!
@michelolzam
Michel Olvera
4 months
Great talk today by @LiuHaohe at the @tp_adasp group on Latent Diffusion Models (LDMs) as a versatile audio decoder! Walked us through diffusion basics, AudioLDM for text-to-audio, audio quality enhancement, and neural codecs!
2
1
30
@LiuHaohe
Haohe Liu
5 months
It's nice to see SemantiCodec taking the lead in audio codec semantic information benchmarking, even at a much lower bitrate. Paper result cross-validated! Great work! @realHaibinWu et al.
@ArxivSound
arXiv Sound
5 months
``Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models,'' Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kaiwei Chang, Jiawei Du, Ke-Han Lu, Alexander H. Liu, Ho-Lam Chung, Yuan-Kuei Wu, Dongchao Yang, Songxiang Liu, Yi-Chiao Wu, Xu Tan…
0
3
18
@LiuHaohe
Haohe Liu
5 months
RT @honualx: Meet Moshiko and Moshika, the open source Moshi models 📖🟢. Moshi is a 7B text-audio model, capable of doing full-duplex conver…
0
136
0
@LiuHaohe
Haohe Liu
5 months
RT @LiuXub: I'm excited to introduce the Source-Disentangled Neural Audio Codec (SD-Codec), a new codec model that can disentangle arbitrar…
0
24
0
@LiuHaohe
Haohe Liu
5 months
RT @LiuXub: Excited to introduce FlowSep: a Rectified Flow Matching (RFM) based generative model for Language-Queried Sound Separation! L…
0
7
0