[IMPORTANT] arXiv sound does not post some papers submitted to arXiv or … because they do not appear in arXiv's RSS feed. We apologize for the inconvenience.
``Analyzing Musical Characteristics of National Anthems in Relation to Global Indices,'' S M Rakib Hasan, Aakar Dhakal, Ms. Ayesha Siddiqua, Mohammad Mominur Rahman, Md Maidul Islam, Mohammed Arfat Raihan Chowdhury, S M Masfequier Rahman Swapno, SM Nuruz…
``WaveGrad: Estimating Gradients for Waveform Generation. (arXiv:2009.00713v1),'' Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan,
``A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis. (arXiv:2308.15422v1),'' Ben Hayes, Jordie Shier, György Fazekas, Andrew McPherson, Charalampos Saitis,
``OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification,'' Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe,
``LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning,'' Masaya Kawamura, Ryuichi Yamamoto, Yuma Shirahata, Takuya Hasumi, Kentaro Tachibana,
``Style Transfer of Audio Effects with Differentiable Signal Processing. (arXiv:2207.08759v1),'' Christian J. Steinmetz, Nicholas J. Bryan, Joshua D. Reiss,
``Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data,'' Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov,
``VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers,'' Sanyuan Chen, Shujie Liu, Long Zhou, Yanqing Liu, Xu Tan, Jinyu Li, Sheng Zhao, Yao Qian, Furu Wei,
``Masked Audio Generation using a Single Non-Autoregressive Transformer. (arXiv:2401.04577v1),'' Alon Ziv, Itai Gat, Gael Le Lan, Tal Remez, Felix Kreuk, Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi,
``Music ControlNet: Multiple Time-varying Controls for Music Generation. (arXiv:2311.07069v1),'' Shih-Lun Wu, Chris Donahue, Shinji Watanabe, Nicholas J. Bryan,
``Decoder-only Architecture for Speech Recognition with CTC Prompts and Text Data Augmentation. (arXiv:2309.08876v1),'' Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe,
``An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis,'' Yifan Peng, Ilia Kulikov, Yilin Yang, Sravya Popuri, Hui Lu, Changhan Wang, Hongyu Gong,
``SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound,'' Haohe Liu, Xuenan Xu, Yi Yuan, Mengyue Wu, Wenwu Wang, Mark D. Plumbley,
``Benchmarking Representations for Speech, Music, and Acoustic Events,'' Moreno La Quatra, Alkis Koudounas, Lorenzo Vaiani, Elena Baralis, Luca Cagliero, Paolo Garza, Sabato Marco Siniscalchi,
``WavCraft: Audio Editing and Generation with Large Language Models,'' Jinhua Liang, Huan Zhang, Haohe Liu, Yin Cao, Qiuqiang Kong, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy Phan, Emmanouil Benetos,
``How Should We Extract Discrete Audio Tokens from Self-Supervised Models?,'' Pooneh Mousavi, Jarod Duret, Salah Zaiem, Luca Della Libera, Artem Ploujnikov, Cem Subakan, Mirco Ravanelli,
``Multi-instrument Music Synthesis with Spectrogram Diffusion. (arXiv:2206.05408v2 UPDATED),'' Curtis Hawthorne, Ian Simon, Adam Roberts, Neil Zeghidour, Josh Gardner, Ethan Manilow, Jesse Engel,
``SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification. (arXiv:2103.16858v1),'' Helin Wang, Yuexian Zou, Wenwu Wang,
``Speech Enhancement and Dereverberation with Diffusion-based Generative Models. (arXiv:2208.05830v1),'' Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann,
``Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers. (arXiv:2307.03183v1),'' Yuan Gong, Sameer Khurana, Leonid Karlinsky, James Glass,
``DDSP-based Neural Waveform Synthesis of Polyphonic Guitar Performance from String-wise MIDI Input. (arXiv:2309.07658v1),'' Nicolas Jonason, Xin Wang, Erica Cooper, Lauri Juvela, Bob L. T. Sturm, Junichi Yamagishi,
``The NeurIPS 2023 Machine Learning for Audio Workshop: Affective Audio Benchmarks and Novel Data,'' Alice Baird, Rachel Manzelli, Panagiotis Tzirakis, Chris Gagne, Haoqi Li, Sadie Allen, Sander Dieleman, Brian Kulis, Shrikanth S. Narayanan, Alan Cowen,
``Continual-wav2vec2: an Application of Continual Learning for Self-Supervised Automatic Speech Recognition. (arXiv:2107.13530v1),'' Samuel Kessler, Bethan Thomas, Salah Karout,
``Less is More: Accurate Speech Recognition & Translation without Web-Scale Data,'' Krishna C. Puvvada, Piotr Żelasko, He Huang, Oleksii Hrinchuk, Nithin Rao Koluguri, Kunal Dhawan, Somshubra Majumdar, Elena Rastorgueva, Zhehuai Chen, Vitaly Lavrukhin,…
``Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning,'' Yixiao Zhang, Yukara Ikemiya, Woosung Choi, Naoki Murata, Marco A. Martínez-Ramírez, Liwei Lin, Gus Xia, Wei-Hsiang Liao, Yuki Mitsufuji, Simon D…
``Audio Self-supervised Learning: A Survey. (arXiv:2203.01205v1),'' Shuo Liu, Adria Mallol-Ragolta, Emilia Parada-Cabeleiro, Kun Qian, Xin Jing, Alexander Kathan, Bin Hu, Bjoern W. Schuller,
``Neural Vocoder is All You Need for Speech Super-resolution. (arXiv:2203.14941v1),'' Haohe Liu, Woosung Choi, Xubo Liu, Qiuqiang Kong, Qiao Tian, DeLiang Wang,
``Neural HMMs are all you need (for high-quality attention-free TTS). (arXiv:2108.13320v3 UPDATED),'' Shivam Mehta, Éva Székely, Jonas Beskow, Gustav Eje Henter,
``Tiny Transformers for Environmental Sound Classification at the Edge. (arXiv:2103.12157v1),'' David Elliott, Carlos E. Otero, Steven Wyatt, Evan Martino,
``BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data,'' Mateusz Łajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, Fan Yang, Arnaud J…
``OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer,'' Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee …
``One model to rule them all ? Towards End-to-End Joint Speaker Diarization and Speech Recognition. (arXiv:2310.01688v1),'' Samuele Cornell, Jee-weon Jung, Shinji Watanabe, Stefano Squartini,
``LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes,'' Trung Dang, David Aponte, Dung Tran, Kazuhito Koishida,
``DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation,'' Zachary Novack, Julian McAuley, Taylor Berg-Kirkpatrick, Nicholas Bryan,
``Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech,'' Shivam Mehta, Harm Lameris, Rajiv Punmiya, Jonas Beskow, Éva Székely, Gustav Eje Henter,
``Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling. (arXiv:2103.14574v1),'' Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Jia Ye, RJ Ryan, Yonghui Wu,
``MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation,'' Yifan Peng, Ilia Kulikov, Yilin Yang, Sravya Popuri, Hui Lu, Changhan Wang, Hongyu Gong,