yi 🦛

@agihippo

Followers: 3,621 · Following: 81 · Media: 122 · Statuses: 2,620

peacetime hippo ☕, healing at fountain ⛲, slice of researcher life ☀️, trying to be healthy 🍗, AI research enjoyer 🧠, friend of @agikoala .

AGI zoo
Joined March 2023
Pinned Tweet
@agihippo
yi 🦛
2 years
This is an account for my everyday life and potentially some fringe content about AI/tech. Will tweet at a higher frequency compared to my main acct. So follow only if you want to know more about my personal life, rants, whatever.. Also thanks to @agikoala for the great idea.
1
1
20
@agihippo
yi 🦛
1 year
BatGPT from Wuhan university. For real?? 😲
[image]
54
91
862
@agihippo
yi 🦛
1 year
This is what happens when tech bro pretends to be researcher. Yes yes got to be the sliding window attention. 🤣🤣
@hrishioa
Hrishi
1 year
Been pretty excited waiting for @MistralAI 's new paper about how the model is able to beat (in all of our tests) models 3-10x the size. Sliding Window Attention seems to be the main reason - and it's genius. Let me explain why it's brilliant and what I understand.
[image]
18
78
657
27
30
572
@agihippo
yi 🦛
20 days
A good summary thread but this is not the "AI research community". This falls under startup tech bro grifter club. Legitimate people smell this stinky vibe from a mile away.
@shinboson
𝞍 Shin Megami Boson 𝞍
20 days
A story about fraud in the AI research community: On September 5th, Matt Shumer, CEO of OthersideAI, announces to the world that they've made a breakthrough, allowing them to train a mid-size model to top-tier levels of performance. This is huge. If it's real. It isn't.
[image]
115
733
7K
5
6
232
@agihippo
yi 🦛
4 months
Real AI people don't call themselves weird shit like "AI experts", "AI visionaries"..."AI thought leader"... They call themselves "member of technical staff", "research scientist" or something like that.
14
6
209
@agihippo
yi 🦛
7 months
strong 7bs: mistral, gemma, reka edge
mid 7bs: llama2
cautionary tales: mpt, falcon, dolly, olmo
15
7
193
@agihippo
yi 🦛
8 months
Google was always leading and dominating in LLMs and AI. The only difference in the recent past is that so many clueless people entered the field and spammed their 50 IQ takes all over social media.
16
6
189
@agihippo
yi 🦛
20 days
The community needs some self-reflection on how they fell for some grifter model 😂
13
2
183
@agihippo
yi 🦛
8 days
Occupational hazard of a ML researcher: I'm starting to track my calories (for better health) and I can't resist the urge to do a taste vs cost (calorie) plot and only eat the foods on the pareto frontier. 😶
12
2
186
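Illustrative only (the tweet is a joke, but the frontier it mentions is a real construct): a minimal Python sketch of the taste-vs-calorie Pareto filter, with made-up foods and scores. A food survives if no other food is at least as tasty at no more calories, and strictly better on one axis.

```python
def pareto_frontier(foods):
    """foods: list of (name, taste, calories).
    Higher taste is better, lower calories is better."""
    frontier = []
    for name, taste, cal in foods:
        # dominated if some other food is >= on taste, <= on calories,
        # and strictly better on at least one of the two
        dominated = any(
            t2 >= taste and c2 <= cal and (t2 > taste or c2 < cal)
            for _, t2, c2 in foods
        )
        if not dominated:
            frontier.append(name)
    return frontier

foods = [
    ("salad", 5, 150),
    ("chicken", 8, 400),
    ("pizza", 9, 800),
    ("donut", 6, 450),  # dominated by chicken: less tasty, more calories
]
print(pareto_frontier(foods))  # ['salad', 'chicken', 'pizza']
```

The donut drops out because the chicken beats it on both axes; everything on the frontier represents a genuine taste/calorie trade-off.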
@agihippo
yi 🦛
10 months
The new meta for PhD students is to take a well known benchmark, gpt3.5 turbo it, fine-tune a model to beat some well known incumbents and get an emnlp paper. Despicable.
5
10
169
@agihippo
yi 🦛
1 year
Stack ranking of the transformer infra that I have used across the past years
1. T5x + seqio (2021-2023)
2. Mesh tensorflow + seqio (2019-2021)
3. Tensor2tensor (2018-2019)
4. Pax (2022-2023)
5. Fairseq (2019)
6. Megatron-LM
7. Random tf1/tf2 codebases 🥲
8. HF transformers
11
7
154
@agihippo
yi 🦛
1 year
It sucks being an expert in this ai/LLM hype thing because u cringe so much at horridly misinformed bad takes. It's everywhere. Even VCs are writing their own surveys of AI now wtf?? Someone save me.
9
6
151
@agihippo
yi 🦛
1 year
lmao. from a professor. academia is like the worst place to be. no impact and boring AF. bigtech is the place for impact, startup for upside. but academia? maybe for retirement.
@MattNiessner
Matthias Niessner
1 year
PhD graduates in AI mostly take boring jobs at big tech companies due to short-term monetary incentives. While understandable to some degree, it's also quite sad to see so many great researchers 'disappear' and give up their talent - join or do your own startup instead!
47
46
567
11
3
146
@agihippo
yi 🦛
1 year
i have to say that from my time as a phd student to my time at Google and now at a startup, I have never felt like i worked a single day because research/ML is just so much fun.
6
2
146
@agihippo
yi 🦛
4 months
tokenizers are the most awkward part of large language models
11
8
145
@agihippo
yi 🦛
1 year
i have a friend who left one of the LLM war companies to join finance/trading as a ML guy and they told me it's like finding a nice little meadow to chill at while everyone else is fighting a major war.
2
2
141
@agihippo
yi 🦛
1 year
It's kind of also why twitter is terribly annoying for actual researchers and legitimate people.
4
2
140
@agihippo
yi 🦛
3 months
The biggest flex of a senior person in AI research is: "I did not contribute so don't include me in the paper." So many people in google behaved like this. Very respectable! Meanwhile in other parts of the world....🤨 Folks discuss how to build large teams for many papers 🙃
4
4
139
@agihippo
yi 🦛
7 months
I think most decent technical people can see through the noise and drama with Gemini image and recognise the technical feat of Gemini 1.5. Those who can't, not sure if people should really care about what they think.
@AndrewYNg
Andrew Ng
7 months
To all my Google friends: I know this week has been tough with a lot of criticism about Gemini's gaffes. Just wanted to say I love all of you and am rooting for you. I know everyone means well, and am grateful for your work & eager to see where you next take this amazing tech!
281
221
4K
5
3
134
@agihippo
yi 🦛
6 months
rip all my friends from google brain who joined inflection when it was hot now get to say they work at microsoft :(
13
3
130
@agihippo
yi 🦛
11 months
Haha! The silver lining of all the LLM noise is that you get papers like this that you typically wouldn't see in academia. 😂 LLMs are entering the era of "natty or not".
3
7
128
@agihippo
yi 🦛
1 year
It's really funny to see some senior professors get invited to give talks and they scramble to say something smart about LLMs and generative AI when they have absolutely no experience or clue about what's going on. 🤣
5
9
121
@agihippo
yi 🦛
16 days
After o1 model will we see j1 model and h1b model? 😄
7
4
122
@agihippo
yi 🦛
2 months
if you're an LLM engineer/ researcher and you go to bed with a substantial amount of idle nodes, you have sinned.
8
8
113
@agihippo
yi 🦛
9 months
many people don't know but some high profile yet "not so senior" (think L5/L6) RS-es can't really code or technically contribute to projects. they're literally only capable of editing papers and putting their name on many papers by getting involved in many projects.
6
2
112
@agihippo
yi 🦛
2 months
research/engineering output is all about sequentially consistent & productive actions to move towards a goal. what i learned is that people could be decent coders but fail to be productive because they have no macro sense of how to move forward at all.
3
3
110
@agihippo
yi 🦛
1 year
Is the next paper gonna be co-ViT?
4
2
105
@agihippo
yi 🦛
11 months
It's actually crazy to think that the top 2 definitive & most incorporated transformer mods (swiglu & mqa/gqa) are both proposed by Noam. Everything else hasn't reached a definitive consensus yet, e.g., rope, parallel layers etc. We're still waiting for the game changer relpos.
6
1
105
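For context on the first mod the tweet refers to, a minimal NumPy sketch of the SwiGLU feed-forward block from Shazeer's "GLU Variants Improve Transformer" (all dimensions here are illustrative, not from any real model):

```python
import numpy as np

def swiglu_ffn(x, W, V, W2):
    """SwiGLU feed-forward: (Swish(xW) * xV) W2.
    Two input projections gate each other elementwise before the output projection."""
    def swish(z):
        return z / (1.0 + np.exp(-z))  # Swish-1 / SiLU
    return (swish(x @ W) * (x @ V)) @ W2

rng = np.random.default_rng(0)
d_model, d_ff = 8, 32
x = rng.normal(size=(4, d_model))        # 4 tokens of width d_model
W = rng.normal(size=(d_model, d_ff))     # gate projection
V = rng.normal(size=(d_model, d_ff))     # value projection
W2 = rng.normal(size=(d_ff, d_model))    # output projection
print(swiglu_ffn(x, W, V, W2).shape)     # (4, 8)
```

The extra projection means SwiGLU has three weight matrices instead of the usual two, so d_ff is typically shrunk by 2/3 to keep parameter count constant.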
@agihippo
yi 🦛
1 year
Fwiw I wanted to add that I'm pretty proud of myself for today's milestone. You have no idea how painful it was to be tripping all over "git pip conda cuda gpus regular Linux stuff right etc" after I left G. It was so painful. Like learning to walk again after a car accident.
8
0
103
@agihippo
yi 🦛
8 months
Jason wei: you don't need a PhD to be a top-tier AI researcher
Jerry wei: you don't need an undergrad degree to be a top-tier AI researcher 🔥🔥
@JerryWeiAI
Jerry Wei
8 months
Today marks my first year at Google (DeepMind). One year ago today, I joined Google Brain as a student researcher and first started working on large language models. During my time as a student researcher, I investigated how larger language models can do in-context learning
9
13
271
4
5
102
@agihippo
yi 🦛
10 months
gemini is so cool and impressive but i want this to go up instead of mmlu.
[image]
8
4
100
@agihippo
yi 🦛
3 months
Sleep is good. It's when gradients are being applied.
8
9
99
@agihippo
yi 🦛
1 year
peer review & submitting to conferences is kinda like bad RLHF for researchers. it's like optimizing for strange artifacts/quirks in the review system which makes research inherently wonky. just put it on arxiv & submit to a conf if you fancy a vacation in that location.
5
4
92
@agihippo
yi 🦛
5 months
Gemini flash? Lol really? Good to know I'm still contributing to google names after being a xoogler for some time. 🤔
6
1
100
@agihippo
yi 🦛
6 months
The research and LLM landscape these days is getting quite boring tbh.
13
1
99
@agihippo
yi 🦛
7 months
there is no way the 9-5, strict 40 hour 5-day week is a workable schedule for anyone training LLMs or being involved in this. it's a sport. you have to be on call all the time.
7
4
95
@agihippo
yi 🦛
4 months
I miss fundamental research.
4
3
97
@agihippo
yi 🦛
10 months
Happy to see my friends @_jasonwei @JerryWeiAI in person again! 😃
[image]
5
1
93
@agihippo
yi 🦛
4 months
I actually think technical staff deserves to fly business more than business staff. Airlines should rename or make a new class of flights called technical IC class that is more luxurious than current business. We deserve it man.
10
2
93
@agihippo
yi 🦛
10 months
What did ilya see?
3
5
94
@agihippo
yi 🦛
11 months
There are only two reasons to use "GPT" in your model name: 1. you are OAI. 2. you are not OAI and have no self-respect.
4
5
93
@agihippo
yi 🦛
9 months
2023 was the year I broke free. when i realised there is no end game to publishing papers and increasing citations. i didn't check my eoy citation count. it's pointless. do research and savour it intrinsically. write code, build together & enjoy the lifestyle.
2
0
94
@agihippo
yi 🦛
20 days
I have to break my healing mode nice hippo sabbatical because I feel compelled to talk about the grifter drama. 🥲 there's just so many good lessons there the common folk could learn from!
7
3
93
@agihippo
yi 🦛
1 year
Q: what did you do for your phd?
A: i spent all my time analysing whenever a startup makes some tweaks to their API and wrote a paper about it. closed source orgs are so mysterious now that it seems like a research question altogether to figure out what they are doing.
@omarsar0
elvis
1 year
How is ChatGPT’s behavior changing over time? If you are developing with LLMs or in this case GPT-3.5 or GPT-4, it's definitely worth taking a look at this report. There is suspicion in the AI community that models like GPT-4 are changing/degrading in performance and behavior.
[image]
12
39
185
5
4
90
@agihippo
yi 🦛
6 months
frontier level models from scratch shipped by the gpu poor.
@RekaAILabs
Reka
6 months
Meet Reka Core, our best and most capable multimodal language model yet. 🔮 It’s been a busy few months training this model and we are glad to finally ship it! 💪 Core has a lot of capabilities, and one of them is understanding video --- let’s see what Core thinks of the 3 body
53
242
1K
8
5
87
@agihippo
yi 🦛
10 months
ideas are cheap, making them work is what counts. 👀
@SchmidhuberAI
Jürgen Schmidhuber
10 months
How 3 Turing awardees republished key methods and ideas whose creators they failed to credit. More than a dozen concrete AI priority disputes under
[image]
48
131
954
3
1
85
@agihippo
yi 🦛
1 month
Managed to go down from 80kg to 72kg in <2 mo! 😄😄
1. Eating high protein meals. 3 eggs to replace 1 meal a day and spamming protein shake whenever hungry.
2. Some exercise. Hiked with @swyx and tried random shit like rowing
3. Unf*king my sleep cycle.
Feeling better!
@agihippo
yi 🦛
3 months
realised ive gained so much weight ever since becoming a dad. late nights, increased workload, eating more and shifting priorities to care for the baby etc. need to start going back to a healthier life or else i'll "scale up" to no return 🫤😐
9
0
56
6
3
86
@agihippo
yi 🦛
8 months
Bunch of nonsense 😥🤦‍♀️
@IntuitMachine
Carlos E. Perez
8 months
1/n Have you ever wondered why decoder-only Transformer models like GPT-4 have dominated over other Transformer models like encoder-only (ex: BERT) or encoder-decoder models (ex: Flan T5)? What is the intuitive explanation for this? To understand their supremacy, consider how
11
58
311
8
0
86
@agihippo
yi 🦛
5 months
15 people? That's cute. 😬 At reka our pretraining team is 3-5 people at max, who were all also working >50% time in other projects. 🫠
@HeinrichKuttler
heiner
5 months
Our latest model Inflection-2.5 () is not bad. In fact, it was the ~4th best publicly "known" models when it was released in early March. And it was created by our pretraining team of < 15 people! 2/
1
1
22
8
3
83
@agihippo
yi 🦛
8 months
Wth is this? Clearly everyone sees what's going on right? 🤦‍♀️👀
[image]
8
3
83
@agihippo
yi 🦛
5 months
Evals push the field forward.
@YiTayML
Yi Tay
5 months
New paper from @RekaAILabs 🔥 (yes an actual paper). This time we're releasing part of our internal evals which we call Vibe-Eval 😃 This comprises of a hard set which imo is pretty challenging for frontier models today. The fun part here is that we constructed it by trying to
[image]
22
86
575
3
5
83
@agihippo
yi 🦛
11 months
Random thought I had today: if only given model weights (i.e., runnable model) and no other details, is it possible to determine how many tokens the model has seen during pretraining?
16
2
82
@agihippo
yi 🦛
4 months
People tout open source like some kind of balancing act to powerful frontier labs. This is romanticised and dramatized. Open source has not made any real impact except wagging their tails and drooling their tongues waiting for zuck to drop llama crumbs.
@absoluttig
John Luttig
4 months
despite recent progress and endless cheerleading, open-source AI is a worsening investment for model builders, an inferior option for developers and consumers, and a national security risk. I wrote about the closed-source future of foundation models here
130
37
325
10
2
83
@agihippo
yi 🦛
2 months
A family member passed away today. Reminder to self that life is too short to not try to spend more time with loved ones.
9
0
83
@agihippo
yi 🦛
1 year
Sorry startup bros, hacking together a quick and dirty oai wrapper sass and screaming random things like arr, mrr, product market fit is not something respectable at all. Even the rest and vest folks at big co deserve more respect.
5
6
77
@agihippo
yi 🦛
5 months
I keep seeing factorized Jeff Dean why??
[image]
9
1
80
@agihippo
yi 🦛
1 month
many of the research breakthroughs come from google, which codebases and infra are light years ahead of most other companies (and universities). given this, i would think that a huge number of research breakthroughs are close to production ready the moment they are born.
@jxmnop
jack morris
1 month
an underrated benefit of doing research is having freedom to create your own process, and only where necessary; no one forces you to optimize or organize some of the biggest research breakthroughs have come from codebases so chaotic they would make a level 5 FAANG engineer faint
6
5
177
2
2
80
@agihippo
yi 🦛
1 year
Of all LLM infra I've used both inside and outside Google..have to say t5x + seqio is still miles ahead of everything out there. I have a soft spot for the deprecated mesh tensorflow too.
6
6
79
@agihippo
yi 🦛
3 months
sometimes i wonder to myself if i should buy like a console and start playing games or something then i remember: being an ai researcher / engineer is just like playing games for work all day long.
@KevinNaughtonJr
Kevin Naughton Jr.
3 months
being a software engineer is basically just getting paid to solve puzzles all day
123
770
8K
5
2
80
@agihippo
yi 🦛
11 months
I got literally 100 pings today to tell me a llm was named after me, or that I have become a llm. 🥲🥲🥲 Now my first name and last name have all been taken by chatbots. Happy now?
6
0
78
@agihippo
yi 🦛
23 days
I wanted to post some shade about some mega grifter new llm model I saw on X recently but I'm too busy lifemaxxing to bother these days 🙃
7
1
78
@agihippo
yi 🦛
6 months
people have moved from wasting money at smaller scale to wasting money at larger scale. 🧱🧱🧱
7
0
75
@agihippo
yi 🦛
10 months
There's no such thing as seniority in this age of AI. It's either you contribute or you don't. No empire building, medal collecting, title amassing whatever. It's just all about what code did you write? And which LLM did you contribute to?
4
7
76
@agihippo
yi 🦛
4 months
PSA: Seriously, the meta for PhD students at less prestigious/visible universities is to gain the respect/mentorship of just 1 semi-visible industry RS at a top lab & coauthor 1 good work together. Not publishing more papers with your unknown advisor that no one will read.
3
1
75
@agihippo
yi 🦛
6 months
Happy birthday to me!
12
0
75
@agihippo
yi 🦛
3 months
maybe character ai is the closest to agi
10
2
74
@agihippo
yi 🦛
11 months
Whenever asked I still tell people my job is a research scientist (technically true) I somehow cannot seem to identify as an entrepreneur. Research scientist vibes well with me. Making money is cool only when you do it in a cool way.
2
0
73
@agihippo
yi 🦛
1 month
this is the perfect response to "how do i get into ai"
@yacineMTB
kache
9 months
you realize that you can just do things right? like you can just do them. just do things
51
124
1K
3
3
74
@agihippo
yi 🦛
10 months
thoughts:
1. no one cares about papers accepted at confs anymore. mainly just for fun/vacation.
2. citations are more important but not that important too. the new meta is just being in the place to work on a frontier model. (e.g., gpt4, gemini etc). trumps everything else.
@ZenMoore1
Zekun Wang (Seeking 25Fall PhD/Job) 🔥
10 months
Observed that some LLM researchers prioritize gaining more citations over having more papers accepted in conferences/journals. What are your thoughts?
0
1
2
5
5
72
@agihippo
yi 🦛
1 year
the best open source model from Google is my work. What have you done? 😃
@andriy_mulyar
AndriyMulyar
1 year
@agihippo just like models are supposed to be open sourced but you ain't doing it
1
0
3
9
1
73
@agihippo
yi 🦛
1 month
My friend told me they are expecting twins and I'm like congrats on batch size 2.
5
2
72
@agihippo
yi 🦛
2 months
Gonna try healthmaxxing for the next few months. Any suggestions what I should do other than sleep a ton and exercise a lot and eat healthy?
27
0
69
@agihippo
yi 🦛
6 months
human evaluation results just came in and today we hit a nice model performance milestone at reka. its one of the model goals we've had since starting the company.
7
1
72
@agihippo
yi 🦛
6 months
Sounds like a low bar to meet but you'll be surprised how many people don't even meet this bar.
@kaiokendev1
kaio ken
6 months
your moat is that you care
15
79
493
3
7
67
@agihippo
yi 🦛
1 month
You can be doing objectively well on certain dimensions in life but if your internal reward model doesn't align you'll end up just hating yourself or feeling something is off.
4
2
68
@agihippo
yi 🦛
6 months
Aside from incumbents (gdm, oai, anthropic) I think only a few teams managed to train strong models. Xai, inflection, character, mistral and us (reka). Meta will depend on how llama3 lands. Everything else is NPC tier.
9
2
67
@agihippo
yi 🦛
6 months
Being new to the startup world ive always treated raising money as a measure of success. Today I learned that you can raise 1.3B and still fail spectacularly. 🙃
2
3
66
@agihippo
yi 🦛
3 months
LLM not following instructions? Wait till you start interacting with humans! 🙃
2
3
66
@agihippo
yi 🦛
3 months
Big conundrum of benchmark creators.
- make full private = tough adoption
- private and public = everyone reports public and ignores private
- fully public = dataset gets totally wrecked by researcher descent.
2nd option eventually becomes the 3rd. What's the way out?
@WenhuChen
Wenhu Chen
3 months
A sad truth about evaluation is that: If you make a private test set for your benchmark, people just won't adopt it. We have our official MMMU private test set hosted in EvalAI (), but everyone is still reporting validation score. I found it's similar for
9
12
203
13
1
66
@agihippo
yi 🦛
6 months
people ask me why edge is trained on 4.5T tokens and flash on 5T tokens. the simple reason could be that our 7b job died somehow at 4.5T tokens and I was lazy to restart it. could be as simple as that. don't over think stuff!
4
2
64
@agihippo
yi 🦛
6 months
Holy. This idea is so outrageously cool and insane.
@arankomatsuzaki
Aran Komatsuzaki
6 months
Google presents Training LLMs over Neurally Compressed Text - Outperforms byte-level baselines by a wide margin - Worse PPL than subword tokenizers but the benefit of shorter sequence lengths
[image]
4
72
504
2
4
64
@agihippo
yi 🦛
2 months
If you use up all your compute all the time and maximize information gain each time, it's an effective strategy to becoming a productive ML researcher.
@finbarrtimbers
finbarr
2 months
evaluating ml researchers by GPU utilization is honestly a good metric
9
3
119
3
1
64
@agihippo
yi 🦛
3 months
Reka is cracked because we have L.
[image]
3
1
63
@agihippo
yi 🦛
6 months
gonna celebrate my baby daughter's 100 day birthday tomorrow. feeling so blessed.
5
1
62
@agihippo
yi 🦛
7 months
Maybe GPUs and cuda being bad is a feature not a bug 😶🙃. Definitely tons of suffering. Maybe by design. I sure became more resilient after starting to use these. 🥹🥹
@unusual_whales
unusual_whales
7 months
Jensen Huang, $NVDA, CEO: “Resilience matters in success. I don’t know how to teach it to you, except for: I hope suffering happens to you .. because .. greatness comes from character. And character isn’t formed out of smart people. It’s formed out of people who suffered.
184
2K
11K
4
2
61
@agihippo
yi 🦛
1 year
I'm sick and tired of people pitching me alternative AI. The only AI you should care about are LLMs and generative models. Everything else is frankly noise and mostly crap.
9
1
61
@agihippo
yi 🦛
5 months
Got to say llama went from amateur tier to actually good tier with llama3.
2
1
60
@agihippo
yi 🦛
3 months
The real objective function in life is that you're happy and your loved ones are happy. Nothing else matters.
1
1
59
@agihippo
yi 🦛
4 months
Good taste is so important
0
3
59
@agihippo
yi 🦛
1 month
tech people: member of technical staff 🫡👌
non tech people: self-inflates titles into some random director/vp on linkedin or something. seen this self-appointment happening so much 😂🤦‍♂️
6
2
59
@agihippo
yi 🦛
7 months
What kind of person opportunistically names their paper "Sora". It's a survey paper damnit! Did you hope to be cited or get attention by some mix up? Do you have negative self respect?
@_akhaliq
AK
7 months
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The model is trained to generate videos of realistic or imaginative scenes from text instructions and
[image]
17
214
1K
3
1
58
@agihippo
yi 🦛
9 months
LLM predictions for 2024 - there will still be many noob/clueless LLM influencers/semi-influencers posting bad/wrong takes. 🤭
1
2
58
@agihippo
yi 🦛
4 months
Overheard from a friend "it's funny that Yann lecun said he published 80 papers since 2022. I checked his google scholar and none of them are actually good". 🥴🙃😛
9
0
56
@agihippo
yi 🦛
5 months
so tired of llms. wanna learn how to be a barista.
10
1
57
@agihippo
yi 🦛
8 months
was i happier in 2017, as an unknown grad student. going to confs like an invisible entity. no expectations of career, money whatever. i was making a grad student stipend then, poor AF but something felt so peaceful about the days back then.
4
0
56
@agihippo
yi 🦛
3 months
realised ive gained so much weight ever since becoming a dad. late nights, increased workload, eating more and shifting priorities to care for the baby etc. need to start going back to a healthier life or else i'll "scale up" to no return 🫤😐
9
0
56
@agihippo
yi 🦛
5 months
Xlstm paper is actually pretty cool. Hope it works!
0
4
56
@agihippo
yi 🦛
6 months
What a shit show. But have to say I chipped away maybe 30 bucks of his runway with the duck i ordered at the neurips stability dinner. 🙃
1
1
55