trieu Profile
trieu

@thtrieu_

Followers: 2,529
Following: 135
Media: 11
Statuses: 1,200

inventor of #alphageometry. lead of alphageometry 2. thinking about thinking @ deepmind.

Mountain View
Joined April 2014
Pinned Tweet
@thtrieu_
trieu
2 months
happy to contribute!
@GoogleDeepMind
Google DeepMind
2 months
Powered by a novel search algorithm, AlphaGeometry 2 can now solve 83% of all historical problems from the past 25 years, compared to its predecessor's 53% rate. It solved this year's IMO Problem 4 within 19 seconds. 🚀 Here's an illustration showing its solution ↓
Tweet media one
@thtrieu_
trieu
8 months
Proud of this work. Here's my 22min video explanation of the paper:
@GoogleDeepMind
Google DeepMind
8 months
Introducing AlphaGeometry: an AI system that solves Olympiad geometry problems at a level approaching a human gold-medalist. 📐 It was trained solely on synthetic data and marks a breakthrough for AI in mathematical reasoning. 🧵
@thtrieu_
trieu
6 years
We show that large language models trained on massive text corpora (LM1b, CommonCrawl, Gutenberg) can be used for commonsense reasoning, obtaining SOTA on the Winograd Schema Challenge. Paper at , results reproducible at
Tweet media one
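A minimal sketch of the substitute-and-score idea behind this result: replace the ambiguous pronoun with each candidate referent and keep the candidate whose substitution the language model assigns a higher probability. Everything here is illustrative, not from the paper; in particular `lm_logprob` is a toy unigram model over a tiny hand-picked corpus, standing in for the actual large LM.

```python
import math
from collections import Counter

# Toy training corpus, chosen so the unigram model happens to
# prefer the correct referent in the example below.
CORPUS = "the trophy is too big the trophy is very big the suitcase is small".split()

counts = Counter(CORPUS)
total = sum(counts.values())

def lm_logprob(sentence: str) -> float:
    # Toy unigram log-probability with add-one smoothing
    # (a stand-in for a real language model's sentence score).
    vocab = len(counts) + 1
    return sum(
        math.log((counts.get(w, 0) + 1) / (total + vocab))
        for w in sentence.lower().split()
    )

def resolve(schema: str, pronoun: str, candidates: list[str]) -> str:
    # Substitute each candidate for the pronoun and score the
    # resulting full sentence; return the higher-scoring candidate.
    scored = {c: lm_logprob(schema.replace(pronoun, c)) for c in candidates}
    return max(scored, key=scored.get)

answer = resolve(
    "the trophy does not fit in the suitcase because PRONOUN is too big",
    "PRONOUN",
    ["the trophy", "the suitcase"],
)
print(answer)  # → the trophy
```

With a real LM in place of the unigram scorer, the same two-line resolve loop is the whole inference procedure; no task-specific training is needed.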
@thtrieu_
trieu
6 years
As also observed in OpenAI's GPT-2 work, training data quality is important. We release the STORIES corpus introduced in our work . The corpus is a high-quality subset of CommonCrawl totaling ~7B words (~32GB) and can be found here:
Tweet media one
Tweet media two
@thtrieu_
trieu
6 years
Wow! An old project of mine is now the 7th most popular machine learning project on all of GitHub in 2018, alongside TensorFlow and scikit-learn? I really need to spend some time polishing it now...
@github
GitHub
6 years
From the programming languages you used most to the most popular data science packages, we’re digging into the data on Machine Learning from 2018. Find out what we discovered
Tweet media one
@thtrieu_
trieu
6 years
Human reasoning is not manipulating symbolic expressions
@CIFAR_News
CIFAR
6 years
What is thought?: Big questions from CIFAR Distinguished Fellow Geoffrey Hinton #dlrl2018
Tweet media one
@thtrieu_
trieu
5 years
Had the chance to sit next to Daniel @xpearhead in the early days of the project and try out the interactive Meena. It has always been *this* surprising and funny :) BIG congrats to the team on this publication. The possibilities to build from here are endless.
@kcimc
Kyle McDonald
5 years
um.. google's latest chatbot is 😳
Tweet media one
@thtrieu_
trieu
6 years
Our work on learning longer-term dependencies is accepted at @icmlconf #icml2018
@lmthang
Thang Luong
7 years
Excited to share a new work by #GoogleAI resident @thtrieu_ (with @iamandrewdai , me, & Quoc Le) on training very long RNNs (up to 16K long). See paper for extreme cases of zero or little backprop on RNNs ;)
Tweet media one
@thtrieu_
trieu
6 years
I'll be presenting my work #ICLR2018 on Wednesday. Come and have a chat :)
Tweet media one
@thtrieu_
trieu
2 years
"We see contributions to traditional conferences and publications in journals as an important part of our work, but also support efforts that go “beyond the research paper"".
@sarahookr
Sara Hooker
2 years
I'm excited to finally share what I have been working on. Today we are officially launching Cohere For AI @forai_ml a non-profit research lab that aims to reimagine how, where, and by whom research is done.
Tweet media one
@thtrieu_
trieu
6 years
A commonsense reasoning task is "solved" even before its official introduction.
@seb_ruder
Sebastian Ruder
6 years
It's amazing how fast #NLProc is moving these days. We have now reached super-human performance on SWAG, a commonsense task that will only be introduced at @emnlp2018 in November! We need even more challenging tasks! BERT: SWAG:
Tweet media one
Tweet media two
@thtrieu_
trieu
8 months
10M tokens with almost perfect needle-in-a-haystack retrieval, amazing.
@JeffDean
Jeff Dean (@🏡)
8 months
Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long
Tweet media one
@thtrieu_
trieu
8 months
Yes, the fact that we can do this means the geometry we considered is quite narrow.
@thomasahle
Thomas Ahle
8 months
Making a synthetic dataset of mathematical proofs is hard! It's easy to make a whole lot of "1+1+1+...=491" style theorems. I'm surprised this method of random construction and transformation finds so many classical geometric theorems. Maybe because the domain is somewhat
@thtrieu_
trieu
6 years
Want to see space-time contract/dilate? This series on Special Relativity (SR) is beautiful. The author squashed space-time to 2D, explained the two postulates with geometric intuition, and ran a simulator on top of it. It is the 3blue1brown of SR!
@thtrieu_
trieu
5 years
Great paper exploring attention architectures for images! We encountered similar results in our latest work . Table 5 shows that keeping the first layers convolutional while using attention for the last layers improves ResNet performance.
@nikiparmar09
niki parmar
5 years
Further studies show that self-attention is the most useful in later layers, while convolutions better capture lower-level features. Combining them will be an interesting research direction.
@thtrieu_
trieu
6 years
OMG a robot doing moonwalk and shuffle at the same time
@gavinsblog
Gavin Sheridan
6 years
Spot the robot dog dancing to UpTown Funk is simultaneously both terrifying and hilarious.
@thtrieu_
trieu
6 years
exciting times
@lmthang
Thang Luong
6 years
A new era of NLP has just begun a few days ago: large pretraining models (Transformer 24 layers, 1024 dim, 16 heads) + massive compute is all you need. BERT from @GoogleAI : SOTA results on everything . Results on SQuAD are just mind-blowing. Fun time ahead!
Tweet media one
@thtrieu_
trieu
2 years
Formalization still needs humans, but now each human is 100X
@Yuhu_ai_
Yuhuai (Tony) Wu
2 years
Autoformalization with LLMs in Lean! @zhangir_azerbay and Edward Ayers built a chat interface to formalize natural language mathematics in Lean: Very impressive work!
Tweet media one
Tweet media two
@thtrieu_
trieu
1 year
by being a creator of the transformer
@YesThisIsLion
Llion Jones
1 year
I can't believe I'm starting a company.... with David Ha @hardmaru !! .... in Tokyo?! How did I get here.
@thtrieu_
trieu
8 months
The NYT shares our work with nuanced perspectives from experts in different fields.
@quocleix
Quoc Le
8 months
New York Times article on AlphaGeometry Geometry was my favorite subject in high school because solving its problems requires many steps of reasoning and planning. Geometry problems, however, have been difficult for AI to solve. Our Nature paper shows my team's progress in Geometry
@thtrieu_
trieu
2 years
IMO 2022: China maxed out, S. Korea surpassed the US, and Vietnam came so close in 4th position 🎉! Where are LLMs on this table 🤔
Tweet media one
@thtrieu_
trieu
6 years
Finally some credit assignment
@lmthang
Thang Luong
6 years
This article speaks, I believe, many hidden truths about Quoc Le on Google Brain & seq2seq. Personally, I have enjoyed working with Quoc, who cares less about credit assignment and more about teamwork and long-term vision :)
@thtrieu_
trieu
3 years
Big LM: Codeforces ✅ IMO ✅
@sama
Sam Altman
3 years
Super excited that OpenAI solved two IMO problems (🤯!), but now I feel extra bad that I never solved one.
Tweet media one
@thtrieu_
trieu
1 year
e = mc2 + lk99
@thtrieu_
trieu
6 years
a dream come true!
@chrisdonahuey
Chris Donahue
6 years
Excited to announce Piano Genie, an intelligent controller that allows anyone to improvise on the piano! This was my internship project at @GoogleMagenta with @iansimon and @sedielem . Blog post with more information:
@thtrieu_
trieu
6 years
C A L M
@thtrieu_
trieu
10 months
@jasoncrawford how about a Fields medalist in combinatorics?
@wtgowers
Timothy Gowers @wtgowers
10 months
This feels like a method that ought to be more generally applicable (as indeed the authors suggest). I have a few ideas of problems I'd be very interested for it to be tried out on and that seem to be of the right type.
@thtrieu_
trieu
3 years
@LisaDeBruine I tried my best, very proud!
Tweet media one
@thtrieu_
trieu
9 months
skillful worldview:
@andersonbcdefg
Ben (e/treats)
9 months
i dont feel bad for ML PhDs who think "my job is ruined, i can't do anything outside a big lab" etc. i can think of >10 low hanging fruit things "someone should do" that im too busy for bc i have to "make revenue". academia is 100% unfettered from use value, yall have it so easy.
@thtrieu_
trieu
10 months
great products: meet, youtube, photos, docs, maps. No competition. curious to see their (inevitable) future competitors though.
@giffmana
Lucas Beyer (bl16)
10 months
Bullish on Google Meet
@thtrieu_
trieu
8 months
amazing & big congrats to the team!
@OpenAI
OpenAI
8 months
Introducing Sora, our text-to-video model. Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. Prompt: “Beautiful, snowy
@thtrieu_
trieu
1 year
The next goalpost for AI has moved beyond my own reach...
@3blue1brown
Grant Sanderson
1 year
So close and yet so far
Tweet media one
Tweet media two
@thtrieu_
trieu
6 years
@chipro Assuming all of your online accounts are linked to your mailbox somehow, is a good option
@thtrieu_
trieu
6 years
Cool work applying RL on raw pixels!
@psermanet
Pierre Sermanet
6 years
Our latest work on continuous control from pixels by @debidatta , Jonathan Tompson, @coreylynch , and I will be presented at the MLPC workshop at #ICRA2018 this afternoon in room M4. Paper:
Tweet media one
@thtrieu_
trieu
1 year
tried this for translation a while back but didn't try hard enough -- i think the transformer needs way more registers & will improve on all tasks. also related to @ChrSzegedy "the transformer is so small" at 52:15 .
@TimDarcet
TimDarcet
1 year
What I mean when I say “registers”: additional learnable tokens (like the [CLS]), but these ones are not used at output. No additional info at input, not used at output: these tokens could seem useless!
Tweet media one
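A minimal numpy sketch of the register idea described above, under my own reading of it: a few extra learnable tokens are prepended to the input sequence, participate in attention like ordinary tokens, and are simply dropped from the output. All names and shapes here are illustrative, not taken from the linked work.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    # Plain single-head self-attention over a (seq, dim) array.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

dim, seq, n_reg = 8, 5, 2

# Hypothetical learnable register tokens: prepended at the input,
# attended to like any other token, discarded at the output.
registers = rng.normal(size=(n_reg, dim))
tokens = rng.normal(size=(seq, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))

x = np.concatenate([registers, tokens])   # (n_reg + seq, dim)
y = self_attention(x, w_q, w_k, w_v)      # registers attend and are attended to
out = y[n_reg:]                           # registers dropped: (seq, dim)
```

The key property is that the registers add capacity at every attention step (scratch slots the model can write to and read from) while the interface stays unchanged: the caller still gets exactly `seq` output vectors.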
@thtrieu_
trieu
10 months
anyone who is a rationalist must accept this
@AravSrinivas
Aravind Srinivas
10 months
AI has achieved transcendence.
Tweet media one
@thtrieu_
trieu
10 months
@demi_guo_ congrats :)
@thtrieu_
trieu
1 year
failed to find this song with bard/bing/chatgpt, racking my brain with details from the MV. then youtube found it in its first result :)
Tweet media one
@thtrieu_
trieu
6 years
@IrwanBello @benediktbuenz if all scores are 1s, does it matter :) ?
@thtrieu_
trieu
6 years
@psermanet @debidatta @coreylynch Cool work applying RL on raw pixels!
@thtrieu_
trieu
11 months
Tao grounded in formal environment
@leanprover
Lean
11 months
Nice, Terence Tao (Fields Medal 2006) found a bug in one of his papers using Lean 4.
Tweet media one
@thtrieu_
trieu
2 years
And on some tasks scaling has little effect. Or does it? Maybe the breakthrough point is just a few clicks ahead ... :)
@jaschasd
Jascha Sohl-Dickstein
2 years
Performance on some tasks improves smoothly with model scale, while on others there is sudden breakthrough performance at a critical scale.
Tweet media one
@thtrieu_
trieu
1 year
NYU Courant pays PhD 54K? That's at least 1.5X the NYU CS rate then :)
@WhiterMeerkat
Tony Zhang
1 year
Fellow PhDs! Let’s make our departments pay us a living wage!
@thtrieu_
trieu
10 months
one possible scenario: an openai slack channel LMAO browsing X, because their method has nothing to do with q-learning or rl. unlikely tho, because they are busy shipping q* into gpt5, not browsing X.
@thtrieu_
trieu
1 year
@finbarrtimbers @krishnanrohit maybe they trained on longer sequences where long-term dependencies are important (e.g. large codebases)