Desh Raj

@rdesh26

Followers
3K
Following
7K
Statuses
2K

Research Scientist @Meta GenAI | Previously: @jhuclsp, @IITGuwahati

New York, NY
Joined September 2009
@rdesh26
Desh Raj
1 year
**Dissertation now available** 📜: 📽️: ⏯️: It's a 332-page tome, but I have summarized it in this thread 👇 1/n
@ArxivSound
arXiv Sound
1 year
``Listening to Multi-talker Conversations: Modular and End-to-end Perspectives,'' Desh Raj,
2
9
106
@rdesh26
Desh Raj
3 months
@shinjiw_at_cmu @chenwanch1 Congratulations! Very comprehensive paper and well deserved recognition :)
0
0
3
@rdesh26
Desh Raj
3 months
@giffmana Diwali bonanza
0
0
0
@rdesh26
Desh Raj
4 months
@lorenlugosch Nice! 💪
0
0
1
@rdesh26
Desh Raj
4 months
@huckiyang @ieeeICASSP Thanks for confirming! I was just confused because my submissions on CMT showed "under review"
0
0
1
@rdesh26
Desh Raj
4 months
@_josh_meyer_ I tried a few of these myself, and I must say Google has really nailed this one. One thing I noticed, however, is that the "male voice" is always the expert in the generated podcasts (or at least in the ones I have seen so far). Would be nice if this could be randomized!
0
0
2
@rdesh26
Desh Raj
4 months
@janekm @awnihannun @kyutai_labs @dongzha35524835 Thanks for the pointer; I was not aware of this work. Yeah, small POCs are easy. But as these folks found: "as we scale to bigger datasets or models, scaling laws may imply greater loss or degeneration." This is the harder problem to solve I think.
1
0
1
@rdesh26
Desh Raj
4 months
@janekm @awnihannun @kyutai_labs I can only comment on the research :) Even academic labs have shown POCs for the idea (e.g., SpeechGPT by @dongzha35524835). It was well known that speech instruct tuning works with discrete tokens. OAI probably also invested in RLHF with speech, which IMO is the next frontier.
1
0
1
@rdesh26
Desh Raj
5 months
RT @Ahmad_Al_Dahle: 📣 You can now have a conversation with Meta AI using voice. It’s super fast, connected to the web, natural and conversa…
0
142
0
@rdesh26
Desh Raj
5 months
RT @ArxivSound: ``M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses,'' Yufeng Yang, Desh Raj, Ju Lin, Niko Moritz, Junte…
0
6
0
@rdesh26
Desh Raj
5 months
@girlknowstech Xitter is dead. Only bots roam this space now.
0
0
0
@rdesh26
Desh Raj
5 months
@JonathanLeRoux @ieeeICASSP @IEEE_AASP ICASSP subject areas always baffle me. Why is there only 1 blanket area for "ASR" and why are there no new areas for LLMs? It seems these haven't been updated in 10+ years!
0
0
1
@rdesh26
Desh Raj
5 months
RT @ieeeICASSP: The #ICASSP2025 paper submission deadline is 9 September 2024. No new submissions will be accepted after this deadline. How…
0
23
0
@rdesh26
Desh Raj
6 months
@jxmnop The compute actually happens on super-fast SRAM, so the actual movement of tensors is between the HBM and the SRAM. Unfortunately the SRAM is small and fusing operations is complicated in general, as others have pointed out.
0
0
2
@rdesh26
Desh Raj
6 months
@JonathanLeRoux @ieeeICASSP @IEEEsps Oh this is news to me! Need to update all the Overleaf projects now 😅
1
0
3
@rdesh26
Desh Raj
6 months
Every once in a while I find a thread on this website which makes all the spam bearable. Couldn't help sharing!
@izs
isaacs
6 months
Y'all. Seriously. Hardly anyone even understands the unbelievable depth of Tolkien's linguistic genius. If ever there was an artist just straight up making jokes that only he would ever get, omg. So get this, it's epic. You know the "Brandywine River" in Lord of the Rings?
0
0
3
@rdesh26
Desh Raj
6 months
@ziqiao_ma I think it's more about ignorance than privilege.
0
0
0