Robin San Roman Profile
Robin San Roman

@RobinSanroman

Followers
155
Following
983
Statuses
100

PhD student @Inria and @MetaAi - Working on AI for Audio.

Joined September 2014
@RobinSanroman
Robin San Roman
1 month
Excited to share that our two papers were accepted at #ICASSP2025, both focusing on Audio Language Models! 🗣️🎶 1/10
1
4
30
@RobinSanroman
Robin San Roman
8 days
RT @p_bojanowski: Another amazing use of DINOv2 out there. It is both humbling and heartwarming to see such hard problems being tackled in…
0
4
0
@RobinSanroman
Robin San Roman
1 month
Thanks to all coauthors @simonrouard @adiyossLC @pierrefdz @RSerizel @AntoineDelefor1 and Axel Roebel. See you in Hyderabad! 10/10
0
0
3
@RobinSanroman
Robin San Roman
1 month
This model performs text-to-music generation at a level comparable to the standard MusicGen model. It also supports text-guided music editing, such as generating stems that complement existing ones. 8/10
1
0
0
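For reference, the baseline mentioned in the tweet above is the standard MusicGen model from audiocraft. A minimal sketch of its text-to-music API, assuming audiocraft is installed and using the public 'facebook/musicgen-small' checkpoint (the MusicGen-Stem weights themselves are not referenced in the thread):

```python
import torch
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Standard MusicGen text-to-music generation, used only as the point of
# comparison from the tweet above; this is not the MusicGen-Stem model.
model = MusicGen.get_pretrained("facebook/musicgen-small")
model.set_generation_params(duration=8)  # generate 8 seconds of audio

descriptions = ["warm acoustic guitar ballad", "upbeat electronic track with heavy bass"]
with torch.no_grad():
    wav = model.generate(descriptions)  # tensor of shape [B, C, T] at 32 kHz

for idx, one_wav in enumerate(wav):
    # Writes musicgen_sample_{idx}.wav with loudness normalization.
    audio_write(f"musicgen_sample_{idx}", one_wav.cpu(), model.sample_rate, strategy="loudness")
```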
@RobinSanroman
Robin San Roman
1 month
Our second paper, MusicGen-Stem: Multi-stem Music Generation and Edition through Autoregressive Modeling, enhances MusicGen by enabling music generation at the stem level. It utilizes three codecs for the supported stems: bass, drums, and other. 7/10
1
0
5
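A rough sketch of the stem-level tokenization described in the tweet above: each stem (bass, drums, other) is compressed into discrete tokens by a codec, and an autoregressive LM models the resulting streams. This reuses audiocraft's public 32 kHz EnCodec as a stand-in tokenizer for all three stems; the actual MusicGen-Stem codecs and interleaving scheme are not detailed in the thread.

```python
import torch
from audiocraft.models import CompressionModel

# Stand-in tokenizer: the public 32 kHz EnCodec from audiocraft. MusicGen-Stem
# uses one codec per stem; here the same codec is reused for illustration.
codec = CompressionModel.get_pretrained("facebook/encodec_32khz")
codec.eval()

# Dummy 5-second mono stems at 32 kHz (replace with real separated stems).
stems = {name: torch.randn(1, 1, 32000 * 5) for name in ("bass", "drums", "other")}

token_streams = {}
with torch.no_grad():
    for name, wav in stems.items():
        codes, _ = codec.encode(wav)  # [B, n_codebooks, T] discrete tokens
        token_streams[name] = codes

# An autoregressive LM over the three parallel token streams is what enables
# generating or regenerating one stem conditioned on the others.
for name, codes in token_streams.items():
    print(name, tuple(codes.shape))
```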
@RobinSanroman
Robin San Roman
1 month
This approach is crucial for scenarios where you want to open-source model weights without compromising on safety. Full paper: 6/10
1
0
2
@RobinSanroman
Robin San Roman
1 month
However, the watermark can be fine-tuned out of the model by training it with non-watermarked data. We show that this comes at the cost of reduced performance, akin to a model trained without the original (watermarked) dataset. 5/10
1
0
1
@RobinSanroman
Robin San Roman
1 month
Moreover, the outputs of the LM are detectable with high confidence, even when using different codec decoders. This ensures the traceability of generated content even for advanced users. 4/10
1
0
1
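A minimal sketch of watermark detection with the released AudioSeal detector, which the method builds on. The checkpoint name and 16 kHz operating rate below are those of the public AudioSeal release, not necessarily the exact detector trained for the paper:

```python
import torch
from audioseal import AudioSeal

# Public AudioSeal detector checkpoint; the paper's detector is additionally
# trained for robustness to EnCodec at 32 kHz, which this release does not cover.
detector = AudioSeal.load_detector("audioseal_detector_16bits")

# Dummy 1-second mono clip at 16 kHz; replace with audio decoded from the LM's
# tokens (possibly through a different codec decoder, as in the tweet above).
audio = torch.randn(1, 1, 16000)
sample_rate = 16000

result, message = detector.detect_watermark(audio, sample_rate)
print(f"probability of being watermarked: {result:.3f}")
print("decoded 16-bit message:", message)
```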
@RobinSanroman
Robin San Roman
1 month
We build on AudioSeal to develop a watermarking method robust to EnCodec at 32 kHz. We demonstrate that a language model trained on watermarked data achieves performance comparable to standard training. 3/10
1
0
1
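A sketch of the training-data watermarking step using the released AudioSeal generator, assuming the public 16 kHz checkpoint. The paper adapts the watermarker to be robust to the 32 kHz EnCodec used by the audio language model, which is not what this released model was trained for:

```python
import torch
from audioseal import AudioSeal

# Released AudioSeal generator (16 kHz, 16-bit message); the paper's variant is
# made robust to the 32 kHz EnCodec tokenizer of the audio language model.
wm_model = AudioSeal.load_generator("audioseal_wm_16bits")

def watermark_clip(wav: torch.Tensor, sample_rate: int = 16000) -> torch.Tensor:
    """Add an imperceptible watermark to a [B, 1, T] waveform before it enters
    the LM training pipeline (codec tokenization, then language modeling)."""
    with torch.no_grad():
        watermark = wm_model.get_watermark(wav, sample_rate)
    return wav + watermark

# Dummy 1-second clip standing in for a training example.
clip = torch.randn(1, 1, 16000)
watermarked_clip = watermark_clip(clip)
```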
@RobinSanroman
Robin San Roman
1 month
In our first paper, "Latent Watermarking of Audio Generative Models", we develop a method that enables in-weights watermarking for audio language models, resulting in detectable outputs. 2/10
1
1
4
@RobinSanroman
Robin San Roman
3 months
RT @howariou: I'll present Stem-JEPA this afternoon at @ISMIRConf! Hope to see you there!
0
4
0
@RobinSanroman
Robin San Roman
4 months
@BilouteFUT @albertonger @ActuFoot_ @FabriceHawkins @marca Look at the winners; apparently there can be a blocking factor for some people...
1
0
2
@RobinSanroman
Robin San Roman
4 months
@ActuFoot_ @FabriceHawkins @marca 1. An ideology positing a hierarchy of races. 2. Discrimination, violent hostility toward a human group.
0
0
0
@RobinSanroman
Robin San Roman
4 months
RT @_Vassim: Want to know if a ML model was trained on your dataset? Introducing ✨Data Taggants✨! We use data poisoning to leave a harmless…
0
20
0
@RobinSanroman
Robin San Roman
4 months
RT @JoaoMJaneiro: We have released the training code and weights for our MEXMA model ( 🎉. Training code and model:…
0
1
0
@RobinSanroman
Robin San Roman
5 months
RT @b_alastruey: 🚨New #EMNLP Main paper🚨 What is the impact of ASR pretraining in Direct Speech Translation models?🤔 In our work we use…
0
4
0
@RobinSanroman
Robin San Roman
5 months
RT @JoaoMJaneiro: Last week we released the first paper of my PhD, "MEXMA: Token-level objectives improve sentence representations". We…
0
8
0
@RobinSanroman
Robin San Roman
7 months
RT @howariou: Glad to announce that my paper has been accepted to @ISMIRConf !!! 🥳🥳🥳 More info in this listening test that the reviewers a…
0
5
0