Vinh Q. Tran

@vqctran

Followers: 1,263
Following: 290
Media: 2
Statuses: 115

research scientist @GoogleDeepMind, all thoughts my own, he/him

Brooklyn, NY
Joined March 2018
@vqctran
Vinh Q. Tran
6 months
Officially became a Research Scientist this week!! IMO a title doesn't mean much, but never would I have thought I'd be in this position when I first joined Google to visualize distributed systems ~8 years ago. (1/2)
17
1
194
@vqctran
Vinh Q. Tran
1 year
This was a phenomenon we were well aware of when building DSI, but sadly, I suspect this doesn't work in the general case. This paper only looks at datasets answerable with Wikipedia pages, where URLs are generally reformatted versions of their answers / the entity. 1/2
@arankomatsuzaki
Aran Komatsuzaki
1 year
Large Language Models are Built-in Autoregressive Search Engines: When providing a few Query-URL pairs as in-context demonstrations, LLMs can generate Web URLs where ~90% of the corresponding documents contain correct answers to open-domain questions.
2
31
187
2
9
58
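For context, here is a minimal sketch of the few-shot Query→URL prompting setup the quoted paper describes. The demo pairs and prompt format are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch (not the paper's code): few-shot Query -> URL prompting,
# where an LLM is asked to emit a Wikipedia URL likely to contain the answer.
# The demo pairs and prompt format here are assumptions for illustration.

FEW_SHOT_DEMOS = [
    ("who wrote the novel dune", "https://en.wikipedia.org/wiki/Dune_(novel)"),
    ("what is the capital of australia", "https://en.wikipedia.org/wiki/Canberra"),
]

def build_url_prompt(question: str) -> str:
    """Build a few-shot prompt that asks the model to answer with a URL."""
    lines = []
    for q, url in FEW_SHOT_DEMOS:
        lines.append(f"Query: {q}\nURL: {url}\n")
    lines.append(f"Query: {question}\nURL:")
    return "\n".join(lines)

# The model's completion is then checked for whether the linked page contains
# the answer -- easy for entity-centric Wikipedia questions, much harder for
# arbitrary corpora, which is the caveat raised above.
print(build_url_prompt("who painted the starry night"))
```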
@vqctran
Vinh Q. Tran
1 year
Led by our student researcher @rpradeep42, our new preprint answers the #1 question we receive about DSI & Generative Retrieval, and represents what we currently understand about the paradigm. IMO it's a must-read for those looking to do research or application in this area. 1/n
@arankomatsuzaki
Aran Komatsuzaki
1 year
How Does Generative Retrieval Scale to Millions of Passages? Finds that the use of synthetic queries as a document representation strategy is the only approach that remained effective as they scaled up the corpus size using MS MARCO passages.
1
49
192
1
4
39
@vqctran
Vinh Q. Tran
3 years
A few months ago, we launched the first production Charformer model for Jigsaw's Perspective API. Today, I'm happy to share our latest technical report outlining our techniques and experiments that made this release possible. Paper:
2
3
31
@vqctran
Vinh Q. Tran
4 months
vinhsys evals now open for business
@YiTayML
Yi Tay
4 months
Reka Core (apr version) is approx 7th place on @lmsysorg . 🚀 After ignoring duplicated models, it's the 5th best model in the world. 🔥
7
8
116
2
1
25
@vqctran
Vinh Q. Tran
2 years
excited to see our hard work on this go out soon!! so happy to be a part of the Bard team
@sundarpichai
Sundar Pichai
2 years
1/ In 2021, we shared next-gen language + conversation capabilities powered by our Language Model for Dialogue Applications (LaMDA). Coming soon: Bard, a new experimental conversational #GoogleAI service powered by LaMDA.
739
3K
15K
0
1
23
@vqctran
Vinh Q. Tran
7 months
Why is simple next-token prediction so effective for pretraining LLMs? It’s more than just learning word statistics! In this work, we show that language has a fractal structure and so predicting the next word actually requires anticipating the patterns at all granularities. (!!)
@ibomohsin
Ibrahim Alabdulmohsin | إبراهيم العبدالمحسن
7 months
How is next-token prediction capable of such intelligent behavior? I’m very excited to share our work, where we study the fractal structure of language. TLDR: thinking of next-token prediction in language as “word statistics” is a big oversimplification!
14
108
523
1
3
22
@vqctran
Vinh Q. Tran
11 months
Congratulations Yi and @RekaAILabs on the launch!! Going from zero to the most multimodal model to date in 6 months is insane 😮
@YiTayML
Yi Tay
11 months
It’s been a short 6 months since I left Google Brain and it has been a uniquely challenging yet interesting experience to build everything from the ground up in an entirely new environment (e.g., the wilderness) Today, we’re excited to announce the first version of the
84
140
1K
1
3
21
@vqctran
Vinh Q. Tran
6 months
I mostly just wanted to take a moment to give a big thanks to all the friends I made and colleagues that supported me along the way. Especially @YiTayML and @metzlerd who somehow believed in my ideas and taught me The Way of the researcher :D (2/2)
0
1
20
@vqctran
Vinh Q. Tran
1 year
the most powerful truly open source model right now, brb using it in all the papers
@YiTayML
Yi Tay
1 year
New open source Flan-UL2 20B checkpoints :)
- Truly open source 😎 No forms! 🤭 Apache license 🔥
- Best OS model on MMLU/Big-Bench hard 🤩
- Better than Flan-T5 XXL & competitive to Flan-PaLM 62B.
- Size ceiling of Flan family just got higher!
Blog:
51
346
2K
0
2
18
@vqctran
Vinh Q. Tran
2 years
2023: take care of new kidney, bake a perfect canelé, grow a Vietnamese herb garden, make one new close friend, first author one great paper
0
0
17
@vqctran
Vinh Q. Tran
9 months
fun fact: I was working on DSI right after being released from the hospital for losing all remaining kidney function -- semantic ids were thought of and implemented from a dialysis chair!
@agihippo
yi 🦛
9 months
i don't remember which neurips it was but once there was a copycat DSI paper that made minor modifications on DSI and won a best paper award at neurips. the interesting thing was DSI was accepted at the same neurips so it was surely funny.
1
1
9
0
1
10
@vqctran
Vinh Q. Tran
3 years
Super excited to share this work coming out of our group at @GoogleAI . We show that by training a single Transformer model to map from doc content to doc id, we can parameterize an entire retrieval system E2E -- without the need for dual encoders and/or external (MIPS) indices!
@YiTayML
Yi Tay
3 years
Excited to share our latest work at @GoogleAI on "Transformer Memory as a Differentiable Search Index"! TL;DR? We parameterize a search system with only a single Transformer model 😎. Everything in the corpus is encoded in the model! 🙌 Paper:
10
153
727
1
0
10
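A minimal sketch of the DSI idea described above (document text → docid treated as plain seq2seq), assuming a small off-the-shelf T5 checkpoint and a toy two-document corpus; the real recipe also trains on query→docid pairs and uses much larger models and structured docids.

```python
# Sketch of DSI-style indexing: the model learns to emit a docid string given
# document content, so the model weights themselves act as the index.
# Model choice, docids, and corpus are illustrative, not from the paper.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Toy corpus: every document gets a string identifier the model learns to emit.
corpus = {
    "37": "The Differentiable Search Index encodes a corpus in model weights.",
    "42": "Dual encoders retrieve with nearest-neighbor search over embeddings.",
}

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
for step in range(100):  # indexing phase: memorize content -> docid
    for docid, text in corpus.items():
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        labels = tokenizer(docid, return_tensors="pt").input_ids
        loss = model(**inputs, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
# (The full recipe also trains on (query -> docid) pairs; omitted here.)

# "Retrieval" is then just generation: no dual encoder, no external MIPS index.
query = tokenizer("what encodes a corpus in model weights?", return_tensors="pt")
pred = model.generate(**query, max_new_tokens=4)
print(tokenizer.decode(pred[0], skip_special_tokens=True))
```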
@vqctran
Vinh Q. Tran
2 years
Super exciting result from Yi, Mostafa and colleagues. As hyper-optimized as LLMs are these days, UL2 is really showing how much more there is still to be had beyond purely scaling.
@YiTayML
Yi Tay
2 years
Introducing U-PaLM 540B! @GoogleAI Training PaLM w UL2's mixture-of-denoisers with only 0.1% more compute unlocks:
- Much better scaling 📈
- Emergent abilities on BIGBench 😎
- Saving 2x compute (4.4 million TPU hours!) 🔥
- New prompting ability
link:
8
87
509
1
3
10
@vqctran
Vinh Q. Tran
1 year
This is such exciting work from Mahesh's group; it really demonstrates the potential of the techniques developed in DSI in other domains!
@madiator
Mahesh Sathiamoorthy
1 year
Happy to share our recent work "Recommender Systems with Generative Retrieval"! Joint work with @shashank_r12 , @_nikhilmehta , @YiTayML , @vqctran and other awesome colleagues at Google Brain, Research, and YouTube. Preprint: #GenerativeAI 🧵 (1/n)
13
72
476
0
1
9
@vqctran
Vinh Q. Tran
2 years
btw check out these sweet zero-shot haiku-ish poems written by U-PaLM :D
0
1
9
@vqctran
Vinh Q. Tran
2 years
Awesome video by @ykilcher covering our recent paper!
@ykilcher
Yannic Kilcher 🇸🇨
2 years
🔥New Video on Differentiable Search Index🔥 This paper trains a Transformer to *directly predict docIDs* given a search query. This means the model weights themselves are the index. What does it mean? And how viable is this for real applications? Watch:
2
39
223
0
0
9
@vqctran
Vinh Q. Tran
6 months
Been wild to hear from Yi how things in the wild are, definitely worth a read!
@YiTayML
Yi Tay
6 months
Long overdue but here's a new blogpost on training LLMs in the wilderness from the ground up 😄🧐 In this blog post, I discuss: 1. Experiences in procuring compute & variance in different compute providers. Our biggest finding/surprise is that variance is super high and it's
44
254
2K
1
0
8
@vqctran
Vinh Q. Tran
1 year
excited to see the culmination of everyone’s hard work go out and happy to have been along for the ride! this is only the beginning
@sundarpichai
Sundar Pichai
1 year
We're expanding access to Bard in US + UK with more countries ahead, it's an early experiment that lets you collaborate with generative AI. Hope Bard sparks more creativity and curiosity, and will get better with feedback. Sign up:
831
2K
9K
0
2
8
@vqctran
Vinh Q. Tran
9 months
at emnlp if anyone wants to hang!
3
0
8
@vqctran
Vinh Q. Tran
1 year
Passages from Wikipedia are also highly entity-centric, making this mapping easier, and less about memorization. In comparison, generative retrieval works like DSI intend to index arbitrary corpuses of documents. 2/2
1
0
8
@vqctran
Vinh Q. Tran
1 year
to live, eat, and breathe the new york summer with friends and loved ones, what more could you ask for really
0
0
6
@vqctran
Vinh Q. Tran
7 months
@YiTayML Congrats Yi! Got a chance to play with this model, and it's truly very impressive!!
1
0
6
@vqctran
Vinh Q. Tran
1 year
@YiTayML Yes but I will leave it as an exercise for the reader
1
1
6
@vqctran
Vinh Q. Tran
4 months
@YiTayML @RekaAILabs once every 4 years Yi Tay writes a game changing eval paper
1
0
6
@vqctran
Vinh Q. Tran
1 year
@agihippo I overheard the same thing how strange 😅
0
0
5
@vqctran
Vinh Q. Tran
1 year
Woah congrats Yi and @RekaAILabs !
@YiTayML
Yi Tay
1 year
We’re coming out of stealth with $58M in funding to build generative models and advance AI research at @RekaAILabs 🔥🚀 Language models and their multimodal counterparts are already ubiquitous and massively impactful everywhere. That said, we are still at the beginning of this
94
75
925
1
0
4
@vqctran
Vinh Q. Tran
7 months
Not only this, but small fluctuations in Hurst between models turn out to be stronger predictors of downstream performance on challenging zero- and few-shot tasks (BBH, MMLU, GSM8K) than perplexity alone! (Ppl reported in terms of bits-per-byte, BPB.)
@ibomohsin
Ibrahim Alabdulmohsin | إبراهيم العبدالمحسن
7 months
So, can we combine H with BPB to predict downstream performance? Yes: take the average H + 1/BPB (we invert BPB so that higher values are better). This simple average predicts downstream performance much better than perplexity-based BPB alone; especially in Big Bench Hard (BBH)!
1
1
25
1
1
5
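A toy illustration of the combined upstream metric from the quoted thread, i.e. averaging the Hurst exponent H with inverted bits-per-byte so that higher is better for both; the numbers below are placeholders, not reported results.

```python
# Combined metric sketch: average of H and 1/BPB (BPB inverted so that
# higher values are better). Inputs are hypothetical, for illustration only.
def combined_score(hurst: float, bpb: float) -> float:
    return 0.5 * (hurst + 1.0 / bpb)

print(combined_score(hurst=0.70, bpb=0.85))  # hypothetical model A
print(combined_score(hurst=0.68, bpb=0.80))  # hypothetical model B
```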
@vqctran
Vinh Q. Tran
2 years
!!!
@YiTayML
Yi Tay
2 years
New UL2 model/paper from @GoogleAI! "Unifying Language Learning Paradigms"
✅ SOTA on 50-ish diverse NLP tasks
✅ Outperforms GPT-3 175B on 0-shot SGLUE
✅ 3x perf vs T5 XXL (+LM) on 1-Shot XSUM
✅ Open code & 20B Flax checkpoints.
Paper:
5
70
293
0
0
5
@vqctran
Vinh Q. Tran
1 year
@hwchung27 💯 15in macbook w ssh tmux vim is all you need
0
0
4
@vqctran
Vinh Q. Tran
7 months
Specifically, we use a strong LM (PaLM2-L) to derive fractal parameters for language in various domains, and establish that language is (1) self-similar across timescales and (2) long-range dependent (Hurst ≈ 0.7). We show that this value is robust across 12 different LLMs.
1
0
4
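For intuition, a rough sketch of estimating a Hurst exponent via rescaled-range (R/S) analysis. The choice of input series (per-token surprisals from an LM) and the fitting procedure here are assumptions for illustration and may differ from the paper's methodology.

```python
# Rescaled-range (R/S) estimate of the Hurst exponent H for a 1-D series,
# e.g. per-token surprisals (negative log-probs) from an LM. Illustrative only.
import numpy as np

def hurst_rs(series: np.ndarray, min_window: int = 16) -> float:
    series = np.asarray(series, dtype=float)
    window_sizes, rs_values = [], []
    n = min_window
    while n <= len(series) // 2:
        rs_per_window = []
        for start in range(0, len(series) - n + 1, n):
            chunk = series[start:start + n]
            dev = np.cumsum(chunk - chunk.mean())
            r = dev.max() - dev.min()  # range of cumulative deviations
            s = chunk.std()            # standard deviation of the window
            if s > 0:
                rs_per_window.append(r / s)
        if rs_per_window:
            window_sizes.append(n)
            rs_values.append(np.mean(rs_per_window))
        n *= 2
    # H is the slope of log(R/S) against log(window size)
    slope, _ = np.polyfit(np.log(window_sizes), np.log(rs_values), 1)
    return slope

# Long-range dependent series (like language surprisals) tend to give H > 0.5.
rng = np.random.default_rng(0)
print(hurst_rs(rng.standard_normal(10_000)))  # white noise gives roughly 0.5
```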
@vqctran
Vinh Q. Tran
1 year
"I would never wish to incorporate this technology into my work at all. I strongly feel that this is an insult to life itself." -miyazaki
@hardmaru
hardmaru
1 year
“Why Studio Ghibli movies can’t be made with AI.” Source: YouTube () Excellent video by Dami Lee. An excerpt:
11
34
200
0
0
4
@vqctran
Vinh Q. Tran
1 year
@YiTayML @dofelite @moyix @m__dehghani Should be updated now, thanks for pointing this out!
1
0
4
@vqctran
Vinh Q. Tran
1 year
@srchvrs Generative Retrieval is a subset of Generative IR, covering only first-phase retrieval (not reranking, etc.). The term that makes this distinction is "Generative Document Retrieval"; we drop "Document" in recent works as it can apply to other modalities.
1
0
4
@vqctran
Vinh Q. Tran
4 months
in all seriousness congrats Yi!!
0
0
4
@vqctran
Vinh Q. Tran
7 months
Beyond fractals, the possibility of more sophisticated upstream metrics than NLL/PPL is super interesting. Be sure to check out Ibrahim’s thread for a closer look, and the paper for the full details! Truly and deeply enjoyed collaborating on this with @ibomohsin and @m__dehghani .
0
0
4
@vqctran
Vinh Q. Tran
1 year
@aidangomezzz fellow death grips x research enjoyers unite
0
0
4
@vqctran
Vinh Q. Tran
4 months
1
0
4
@vqctran
Vinh Q. Tran
1 year
@YiTayML @dofelite @moyix @m__dehghani Oh sure, will update the readme in the next couple days!
1
0
4
@vqctran
Vinh Q. Tran
2 years
@YiTayML gotta follow the Tay tradition (not you lol)
1
0
4
@vqctran
Vinh Q. Tran
3 years
@GoogleAI The result is a proof-of-concept that performs surprisingly well -- sometimes outperforming dual encoders on a retrieval task over the NQ corpus of 300K docs. Lots of work left to do, but IMO this is a promising first result for an exciting new paradigm.
0
0
3
@vqctran
Vinh Q. Tran
2 years
AI will continue to improve, become more useful for more people, and might even create beautiful things, but people will always cherish the productions of the human spirit
0
0
2
@vqctran
Vinh Q. Tran
1 year
@madmaxbr5 @YiTayML @GoogleAI This is like the original DSI paper where we directly train a Transformer on inputs->docid, but we find that the best inputs are synthetic queries generated from the contents of the document/passage itself (among other things).
0
0
3
@vqctran
Vinh Q. Tran
2 years
@arankomatsuzaki @YiTayML they perfected the paper template
1
0
3
@vqctran
Vinh Q. Tran
9 months
google
@sanketvmehta
Sanket Vaibhav Mehta, Ph.D.
9 months
Curious about DSI++? Join us in West 2 at 10:00 AM SGT to find out more! #EMNLP2023
0
1
21
0
1
3
@vqctran
Vinh Q. Tran
1 year
Some of the most fun I've had was doing research with Yi at Google. Excited to see how your next adventure plays out, and congrats! 😄
@YiTayML
Yi Tay
1 year
Over the past 3.3 years at Google, I have been blessed with so many wonderful friendships and experiences. I have grown so much. However, it’s time to move on to a new adventure! I wrote a blogpost about my wonderful experience here:
65
63
990
1
0
3
@vqctran
Vinh Q. Tran
3 years
@LukaszBorchmann @AiParticles @arankomatsuzaki @ytay017 @seb_ruder @GoogleAI Thanks for pointing this out! This is indeed not true anymore for the current version of the model; we've updated the preprint and it should be up on Monday.
1
0
2
@vqctran
Vinh Q. Tran
8 months
@seb_ruder @cohere Congrats Sebastian!
0
0
0
@vqctran
Vinh Q. Tran
8 months
@YiTayML Congrats Yi!! I’m so happy for you 🥲
1
0
1
@vqctran
Vinh Q. Tran
9 months
@Swarooprm7 @YiTayML So glad you randomly reached out to chat!
0
0
2
@vqctran
Vinh Q. Tran
1 year
4. Is this because the corpus is too big to encode in the parameters of the model? Surprisingly, performance maxes out at 3B parameters, and does slightly worse at 11B. More research is needed to unlock the power of larger language models. 5/n
1
0
2
@vqctran
Vinh Q. Tran
1 year
@agihippo all the talk of LLMs distilling from collective human experience has been giving me human instrumentality project vibes (evangelion)
0
0
2
@vqctran
Vinh Q. Tran
1 year
@srchvrs @YiTayML FWIW generative retrieval is more inclusive of previous and concurrent related works that do not adopt "DSI" (GENRE, SEAL, NCI, etc.). It also makes more sense outside the context of IR, e.g. recommender systems, vision.
0
0
2
@vqctran
Vinh Q. Tran
1 year
@vqctran
Vinh Q. Tran
2 years
AI will continue to improve, become more useful for more people, and might even create beautiful things, but people will always cherish the productions of the human spirit
0
0
2
0
0
1
@vqctran
Vinh Q. Tran
9 months
@agihippo It’s so great to finally meet you in person after all we’ve done together!!
0
0
2
@vqctran
Vinh Q. Tran
1 year
Please see the paper for more intuitions, analysis, and nuance. Shout out to great collaborators @rpradeep42, @kaihuibj, @_j_ai, Adam D. Lelkes, @HongleiZhuang, @lintool, & @metzlerd! n/n
0
0
2
@vqctran
Vinh Q. Tran
6 months
@agihippo Yes please
1
0
2
@vqctran
Vinh Q. Tran
6 months
@Matchstickcode @YiTayML @YiTayML 's next era as a Twitch Streamer
1
0
1
@vqctran
Vinh Q. Tran
1 year
3. After establishing these best practices, we scale up corpus size and observe that the paradigm struggles significantly past 1M passages. Previously proposed techniques that seem to work do so only because they add more parameters, and they lose to simply scaling up naive methods. 4/n
1
0
1
@vqctran
Vinh Q. Tran
2 years
@blizzstud sounds gr8
0
0
1
@vqctran
Vinh Q. Tran
1 year
@_badabui frog on the floor
1
0
1
@vqctran
Vinh Q. Tran
1 year
@hwchung27 the one case where shallow MacBook keys are actually a feature!
0
0
1
@vqctran
Vinh Q. Tran
1 year
1. We ablate several techniques in the literature and find that document representation is the most important factor; with the right one we happen to set a new SOTA result for small-corpus retrieval (NQ, 100k docs), without the need for any fancy methods (lol) 2/n
1
0
1
@vqctran
Vinh Q. Tran
5 months
@agihippo is it the research or is it the people?
0
0
1
@vqctran
Vinh Q. Tran
4 months
@agihippo nylon coffee in Everton park, the community coffee in orchard, dayglow in LA, cocoa cinnamon in North Carolina
0
0
1
@vqctran
Vinh Q. Tran
9 months
0
0
1
@vqctran
Vinh Q. Tran
6 months
@YiTayML Thank you Yi 😊
1
0
1
@vqctran
Vinh Q. Tran
1 year
2. On MS MARCO passages, using synthetic queries generated from the passages as the document representation is the *only* technique necessary for retrieval performance (+ appropriate model scale). This is competitive with SOTA dual encoders (i.e. GTR) on the 100k subset. 3/n
1
0
1
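A minimal sketch of the "synthetic queries as document representation" recipe described above: generate queries from each passage with a doc2query-style T5 model and pair them with that passage's docid as indexing data for generative retrieval. The checkpoint name and docid format below are assumptions for illustration, not details from the paper.

```python
# Sketch: synthetic queries as the document representation for indexing.
# The doc2query checkpoint name is an assumption; any query-generation model
# that maps passage -> plausible queries would play the same role.
from transformers import AutoTokenizer, T5ForConditionalGeneration

qgen_name = "castorini/doc2query-t5-base-msmarco"  # assumed doc2query checkpoint
tok = AutoTokenizer.from_pretrained(qgen_name)
qgen = T5ForConditionalGeneration.from_pretrained(qgen_name)

def synthetic_queries(passage: str, n: int = 5) -> list[str]:
    """Sample n synthetic queries for one passage."""
    inputs = tok(passage, return_tensors="pt", truncation=True)
    outputs = qgen.generate(
        **inputs, do_sample=True, top_k=10, num_return_sequences=n, max_new_tokens=32
    )
    return [tok.decode(o, skip_special_tokens=True) for o in outputs]

# Each (synthetic query -> docid) pair becomes an indexing example for the
# generative retriever, replacing raw passage text as the model's input.
passage = "Canberra was selected as Australia's capital in 1908 as a compromise."
training_pairs = [(q, "docid_00042") for q in synthetic_queries(passage)]
for q, docid in training_pairs:
    print(q, "->", docid)
```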
@vqctran
Vinh Q. Tran
9 months
@agihippo i opted to hang out at changi jewel for my last four hours in singapore instead of watching the best paper results :)
0
0
1
@vqctran
Vinh Q. Tran
11 months
@agihippo @hwchung27 I still use your style of bash script to this day LOL
1
0
1
@vqctran
Vinh Q. Tran
1 year
@YiTayML maybe, maybe not, but I definitely know distillation has a new name!
0
0
1
@vqctran
Vinh Q. Tran
6 months
@agihippo "Length-free NLP"
1
0
1
@vqctran
Vinh Q. Tran
1 year
@blizzstud do u need investors for your bar
0
0
1
@vqctran
Vinh Q. Tran
4 months
@YiTayML Congratulations Yi!! The vibes are excellent on this one
1
0
1