Haohe Liu

@LiuHaohe

Followers: 2K · Following: 1K · Statuses: 234

PhD student at @cvssp_research, @UniOfSurrey, UK, working with @markplumbley. Ex-intern at @Meta @MicrosoftASIA & @BytedanceTalk. https://t.co/qmZe2lvGHX

Joined October 2021
@LiuHaohe
Haohe Liu
2 years
AudioLDM (ICML 2023) has officially joined the 🧨 Diffusers library! With up to 5x faster speeds and 3 different model sizes. Huge thanks to @sanchitgandhi99 for making it happen! 🥳 HF space has also gotten a significant speed boost! Get started now!
7
36
237
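For anyone who wants to try the Diffusers integration mentioned above, generation follows the library's standard text-to-audio pattern. A minimal sketch, assuming the "cvssp/audioldm-s-full-v2" checkpoint (the other model sizes use different IDs) and a CUDA device:

```python
# Minimal sketch of AudioLDM in 🧨 Diffusers; checkpoint ID and step count
# follow the library's documented example and may differ for other sizes.
import torch
import scipy.io.wavfile
from diffusers import AudioLDMPipeline

pipe = AudioLDMPipeline.from_pretrained(
    "cvssp/audioldm-s-full-v2", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

prompt = "Techno music with a strong, upbeat tempo and high melodic riffs"
audio = pipe(prompt, num_inference_steps=10, audio_length_in_s=5.0).audios[0]

# AudioLDM generates mono audio at 16 kHz.
scipy.io.wavfile.write("techno.wav", rate=16000, data=audio)
```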
@LiuHaohe
Haohe Liu
5 days
RT @Tu7uruu: Pushing Llasa to the Italian world! 🍕 Llasagna v0.1 1b is an Italian text-to-speech model based on Llasa-1b! Not perfect but still wor…
0
3
0
@LiuHaohe
Haohe Liu
5 days
RT @ArxivSound: ``Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis,'' Zhen Ye, Xinfa Zhu, Chi-Min Chan…
0
8
0
@LiuHaohe
Haohe Liu
5 days
RT @_akhaliq: Llasa Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis
0
22
0
@LiuHaohe
Haohe Liu
14 days
RT @LiuXub: Excited to share that our paper 'Separate Anything You Describe' has been published in IEEE Transactions on Audio, Speech, and…
0
5
0
@LiuHaohe
Haohe Liu
15 days
RT @reach_vb: HOLY SHITT: Open Suno is here! You can generate full songs with a 7B parameter model! 🔥 You can decide on background musics,…
0
131
0
@LiuHaohe
Haohe Liu
15 days
@m4zas24 @reach_vb Hi @m4zas24, everything is available here. It is a model dedicated to music composition.
0
0
0
@LiuHaohe
Haohe Liu
15 days
One-step diffusion-based audio super-resolution! FlashSR would make scaling up much easier. Excited to see this as a great follow-up to AudioSR.
@osalooloo
Jaekwon Im
16 days
🌟 Excited to announce the release of the code and model weights for FlashSR: One-step Versatile Audio Super-resolution via Diffusion Distillation, accepted at ICASSP 2025! 🎉 🔗 Check out the demo, code, and paper here:
0
2
9
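The point of the FlashSR endorsement above is that a distillation-trained student collapses the iterative diffusion sampler into a single forward pass, which is what makes scaling cheap. A toy illustration of that difference (dummy model and update rule, not the FlashSR code or API):

```python
# Toy contrast between multi-step diffusion sampling and a one-step
# (distilled) call. The "denoiser" is a stand-in nn.Module, NOT FlashSR.
import torch
import torch.nn as nn

class DummyDenoiser(nn.Module):
    """Stand-in for a diffusion denoiser over audio latents."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Linear(dim + 1, dim)

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Condition on the timestep by simple concatenation.
        t_feat = t.expand(x.shape[0], 1)
        return self.net(torch.cat([x, t_feat], dim=-1))

model = DummyDenoiser()
x = torch.randn(1, 64)  # noisy low-resolution latent

# Teacher-style sampling: many iterative refinement steps.
steps = 50
x_multi = x.clone()
for i in reversed(range(steps)):
    t = torch.tensor([[i / steps]])
    x_multi = x_multi - model(x_multi, t) / steps  # crude Euler-style update

# Distilled student: the same mapping approximated in one forward pass.
x_single = model(x, torch.tensor([[1.0]]))

print(x_multi.shape, x_single.shape)  # 50 network calls vs. 1
```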
@LiuHaohe
Haohe Liu
18 days
RT @reach_vb: HOLY SHITT! Llasa TTS - Llama 3.2 fine-tune with ultra realistic audio 🔥 > supports voice cloning in English + Chinese > tr…
0
285
0
@LiuHaohe
Haohe Liu
23 days
RT @QiuqiangK: A codebase for learning TTS/music_generation with LLM in 1 hour:
0
5
0
@LiuHaohe
Haohe Liu
2 months
RT @LiuXub: Code and model weights have been released! 🔥 GitHub:
0
5
0
@LiuHaohe
Haohe Liu
2 months
New 6M Audio-Caption Paired Dataset! Great work led by @JishengBai
@JishengBai
白吉生
2 months
Excited to share our latest work: AudioSetCaps — the largest audio-caption dataset to date, with 6 million audio-caption pairs (fully open-sourced)! 🔗 ArXiv Paper: 🔗 GitHub: 🔗 HF:
0
0
3
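If the dataset is hosted on the Hugging Face Hub as the quoted tweet suggests, browsing the pairs should look roughly like the sketch below; the repository ID and field names are assumptions (the tweet's links are elided), not confirmed details.

```python
# Sketch of streaming a large audio-caption dataset from the HF Hub.
# "JishengBai/AudioSetCaps" is a placeholder repository ID, not verified.
from datasets import load_dataset

ds = load_dataset("JishengBai/AudioSetCaps", split="train", streaming=True)

# Each record is expected to pair an audio clip with a machine-generated caption.
for example in ds.take(3):
    print(example.keys())
```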
@LiuHaohe
Haohe Liu
2 months
RT @JishengBai: Excited to share our latest work: AudioSetCaps — the largest audio-caption dataset to date, with 6 million audio-caption pa…
0
2
0
@LiuHaohe
Haohe Liu
2 months
RT @JishengBai: This paper is an extension of our NeurIPS 2024 Audio Imagination Workshop paper (, where we discuss…
0
2
0
@LiuHaohe
Haohe Liu
3 months
The SemantiCodec paper has been accepted by the IEEE Journal of Selected Topics in Signal Processing. Huge thanks to all the contributors and reviewers!
@LiuHaohe
Haohe Liu
10 months
Excited to introduce SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound. 🎉 SemantiCodec (50 tokens/second or 0.71 kbps) ≈ previous methods (200 tokens/second or 2.0 kbps). 🎉 Our study also reveals that SemantiCodec tokens hold richer semantic information.
0
1
21
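For context on the token-rate numbers quoted above, encoding and decoding with SemantiCodec follows a simple encode/decode interface. A minimal sketch in the style of the SemantiCodec inference repository; the constructor arguments and the returned waveform shape are assumptions based on that README and may differ from the released API.

```python
# Minimal SemantiCodec usage sketch (parameter names are assumptions).
import soundfile as sf
from semanticodec import SemantiCodec

# ~50 tokens/s corresponds to the 0.71 kbps setting mentioned in the tweet;
# the overall bitrate is roughly tokens-per-second times bits-per-token.
codec = SemantiCodec(token_rate=50, semantic_vocab_size=16384)

tokens = codec.encode("input.wav")   # waveform -> discrete tokens
waveform = codec.decode(tokens)      # tokens -> reconstructed waveform

sf.write("reconstruction.wav", waveform[0, 0], 16000)
```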
@LiuHaohe
Haohe Liu
4 months
Had an awesome day at Télécom Paris @tp_adasp! Great discussions with many amazing researchers!
@michelolzam
Michel Olvera
4 months
Great talk today by @LiuHaohe at the @tp_adasp group on Latent Diffusion Models (LDMs) as a versatile audio decoder! Walked us through diffusion basics, AudioLDM for text-to-audio, audio quality enhancement, and neural codecs!
2
1
30
@LiuHaohe
Haohe Liu
5 months
It's nice to see SemantiCodec taking the lead in audio codec semantic information benchmarking, even at a much lower bitrate. Paper result cross-validated! Great work! @realHaibinWu et al.
@ArxivSound
arXiv Sound
5 months
``Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models,'' Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kaiwei Chang, Jiawei Du, Ke-Han Lu, Alexander H. Liu, Ho-Lam Chung, Yuan-Kuei Wu, Dongchao Yang, Songxiang Liu, Yi-Chiao Wu, Xu Tan…
0
3
18
@LiuHaohe
Haohe Liu
5 months
RT @honualx: Meet Moshiko and Moshika, the open source Moshi models 📖🟢. Moshi is a 7B text-audio model, capable of doing full-duplex conver…
0
136
0
@LiuHaohe
Haohe Liu
5 months
RT @LiuXub: I'm excited to introduce the Source-Disentangled Neural Audio Codec (SD-Codec), a new codec model that can disentangle arbitrar…
0
24
0
@LiuHaohe
Haohe Liu
5 months
RT @LiuXub: Excited to introduce FlowSep: a Rectified Flow Matching (RFM) based generative model for Language-Queried Sound Separation! L…
0
7
0