![Yaroslav Bulatov Profile](https://pbs.twimg.com/profile_images/917082138322788357/EBmj86nx_x96.jpg)
Yaroslav Bulatov
@yaroslavvb
Followers
7K
Following
1K
Statuses
2K
https://t.co/rcTAIKHOLf (ex-Google Brain, OpenAI, Meta) New Blog: https://t.co/SLix8Hrt4w Old Blog: https://t.co/Ur3GWKpmp6
San Francisco, CA
Joined February 2011
@_arohan_ @KhonaMikail I ran into Horace yesterday and he asked about faster QR ... coincidence? My starting point for factorizations is always Nick Higham's "how to" pages --
1
0
7
Keeping up with headline news, which is often negative, makes it easy to lose track of the big picture
How the world has changed over the last century. A compilation of some of our greatest accomplishments as a species. Credit: @toddrjones
1
0
15
From a talk by Chris Manning
'we're in this bizarre world where the best way to learn about llms... is to read papers by chinese companies. i do not think this is a good state of the world' - us labs keeping their architectures and algorithms secret is ultimately hurting ai development in the us.
0
3
14
@leloykun Btw regarding zeroing gradient values, this paper does it to something like 99% of values (there's a necessary trick: accumulate the zeroed-out values until an entry makes it into the 1% that gets communicated)
1
1
18
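The gradient-zeroing trick mentioned in the tweet above can be sketched roughly as follows. The tweet does not name the paper, so this is only a minimal illustration of top-k sparsification with a local residual, assuming the "1%" is chosen by magnitude. The function name `sparsify_with_residual` and the parameter `keep_fraction` are hypothetical, not taken from any paper or library.

```python
import numpy as np

def sparsify_with_residual(grad, residual, keep_fraction=0.01):
    """Keep only the largest-magnitude entries; carry the rest forward.

    Hypothetical helper: a sketch of top-k gradient sparsification with
    residual accumulation, not the method of any specific paper.
    """
    # Add the values that were withheld in earlier steps.
    acc = residual + grad
    # Number of entries to communicate (at least one).
    k = max(1, int(round(keep_fraction * acc.size)))
    flat = np.abs(acc).ravel()
    # Indices of the k largest magnitudes (order among them is irrelevant).
    top_idx = np.argpartition(flat, flat.size - k)[flat.size - k:]
    mask = np.zeros(acc.size, dtype=bool)
    mask[top_idx] = True
    mask = mask.reshape(acc.shape)
    # Entries that get communicated; everything else is zeroed.
    sparse = np.where(mask, acc, 0.0)
    # Zeroed entries are kept locally so they are not lost.
    new_residual = np.where(mask, 0.0, acc)
    return sparse, new_residual
```

With this scheme a small but persistent gradient entry keeps growing in the residual until it is large enough to be selected, so nothing is dropped permanently: at every step `sparse + new_residual` equals `residual + grad`.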
@PalmerLuckey Short-term market fluctuations can be weird... but long-term, US startups should benefit because they can incorporate DeepSeek improvements. Also, we are overdue for much cheaper training based on trends observed in
0
0
2
@aerinykim @kchonyc Do you measure training data in "raw megabytes" or in terms of "acc improvement"? When I was on Google Books we estimated the total number of books to be 150M, but after scanning 20M, the benefit of scanning more books became marginal
1
0
1
@ezyang Meta micro-kitchens still well-stocked with caffeine? I discovered Celsius during my time at Meta (by mistake)
1
0
1
@StasBekman @drisspg No official fp8 method....I guess it's still not clear which API is the best? Curious how existing fp8 training libraries do it
1
0
2