Yaroslav Bulatov Profile
Yaroslav Bulatov

@yaroslavvb

Followers
7K
Following
1K
Statuses
2K

https://t.co/rcTAIKHOLf (ex-Google Brain, OpenAI, Meta) New Blog: https://t.co/SLix8Hrt4w Old Blog: https://t.co/Ur3GWKpmp6

San Francisco, CA
Joined February 2011
@yaroslavvb
Yaroslav Bulatov
1 day
@_arohan_ @KhonaMikail I ran into Horace yesterday and he asked about faster QR ... coincidence? My starting point for factorizations is always Nick Higham's "how to" pages --
1
0
7
@yaroslavvb
Yaroslav Bulatov
1 day
Keeping up with headline news, which is often negative, makes it easy to lose track of the big picture
@SteveStuWill
Steve Stewart-Williams
2 days
How the world has changed over the last century. A compilation of some of our greatest accomplishments as a species. Credit: @toddrjones
1
0
15
@yaroslavvb
Yaroslav Bulatov
2 days
@ntenenz @YiTayML Or an altruistic take: volunteer for the least desirable paper slot, but get more people interested in collaborating with you as a result
1
0
1
@yaroslavvb
Yaroslav Bulatov
4 days
@agarwl_ Is the video of your talk available?
1
0
4
@yaroslavvb
Yaroslav Bulatov
5 days
Are there recorded talks I can watch relevant to DeepSeek?
2
0
6
@yaroslavvb
Yaroslav Bulatov
5 days
From a talk by Chris Manning
@atroyn
anton
13 days
'we're in this bizarre world where the best way to learn about llms... is to read papers by chinese companies. i do not think this is a good state of the world' - us labs keeping their architectures and algorithms secret is ultimately hurting ai development in the us.
0
3
14
@yaroslavvb
Yaroslav Bulatov
10 days
@leloykun Btw regarding zeroing gradient values, this paper does it to something like 99% of values (there's a necessary trick: accumulate the zeroed-out values locally until that entry makes it into the 1% and gets communicated)
1
1
18
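A rough sketch of the accumulation trick mentioned in the reply above: values that are zeroed out are not discarded but added back into the next step's gradient, so a small entry eventually grows large enough to be sent. This is a minimal illustration and not the paper's code. The function name, the `keep_fraction` parameter, and the choice of "largest magnitude" as the selection rule are all assumptions made for the example.

```python
def sparsify_with_residual(grad, residual, keep_fraction=0.01):
    """Keep only the largest-magnitude entries; carry the rest forward.

    grad, residual: equal-length lists of floats.
    Returns (sparse_update, new_residual), where sparse_update maps
    index -> value for the entries that would be communicated.
    Hypothetical helper for illustration only.
    """
    # Add back everything that was withheld in earlier steps.
    acc = [g + r for g, r in zip(grad, residual)]

    # Number of entries to communicate this step (at least one).
    k = max(1, int(len(acc) * keep_fraction))

    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(acc)), key=lambda i: abs(acc[i]), reverse=True)
    top = set(order[:k])

    # Communicated entries are sent in full; all others stay local.
    sparse_update = {i: acc[i] for i in top}
    new_residual = [0.0 if i in top else acc[i] for i in range(len(acc))]
    return sparse_update, new_residual


# Two steps with the same gradient, keeping 1 of 4 entries per step.
grad = [0.75, -1.0, 0.25, 0.125]
residual = [0.0] * len(grad)
sent_1, residual = sparsify_with_residual(grad, residual, keep_fraction=0.25)
sent_2, residual = sparsify_with_residual(grad, residual, keep_fraction=0.25)
```

In the first step only index 1 is sent. In the second step index 0 has accumulated enough to be selected instead, which is the point of keeping the residual rather than dropping it.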
@yaroslavvb
Yaroslav Bulatov
13 days
@PalmerLuckey Short-term market fluctuations can be weird... but long-term, US startups should benefit because they can incorporate DeepSeek improvements. Also, we are overdue for much cheaper training based on trends observed in
0
0
2
@yaroslavvb
Yaroslav Bulatov
18 days
@aerinykim @kchonyc Do you measure training data in "raw megabytes" or in terms of "acc improvement"? When I was on Google Books we estimated the total number of books to be 150M, but after scanning 20M, the benefit of scanning more books became marginal
1
0
1
@yaroslavvb
Yaroslav Bulatov
23 days
@dylan522p I'm also in Florida
1
0
6
@yaroslavvb
Yaroslav Bulatov
26 days
@YouJiacheng @kellerjordan0 @YouJiacheng thanks! What's the link to that page?
1
0
0
@yaroslavvb
Yaroslav Bulatov
26 days
@_arohan_ @AIatMeta Congrats! I'm happy that your training expertise can now go towards growing the world's AI knowledge rather than being locked behind walls. Also curious what you think about
0
0
5
@yaroslavvb
Yaroslav Bulatov
28 days
@ezyang Meta micro-kitchens still well-stocked with caffeine? I discovered Celsius during my time at Meta (by mistake)
1
0
1
@yaroslavvb
Yaroslav Bulatov
29 days
@StasBekman @drisspg No official fp8 method... I guess it's still not clear which API is the best? Curious how existing fp8 training libraries do it
1
0
2
@yaroslavvb
Yaroslav Bulatov
29 days
@real_shenghuiy Nice view, I was just there! (Pacific science museum this morning)
1
0
1
@yaroslavvb
Yaroslav Bulatov
1 month
@giffmana cc @JvNixon, anything happening in the AGI house?
0
0
1