Yaroslav Bulatov @yaroslavvb profile

Yaroslav Bulatov

@yaroslavvb

Followers

7K

Following

1K

Statuses

2K

https://t.co/rcTAIKHOLf (ex-Google Brain, OpenAI, Meta) New Blog: https://t.co/SLix8Hrt4w Old Blog: https://t.co/Ur3GWKpmp6

San Francisco, CA

Joined February 2011

Yaroslav Bulatov

@yaroslavvb

1 day

@_arohan_ @KhonaMikail I ran into Horace yesterday and he asked about faster QR ... coincidence? My starting point for factorizations is always Nick Higham's "how to" pages --

1

0

7

Yaroslav Bulatov

@yaroslavvb

1 day

Keeping up with headline news, which is often negative, makes it easy to lose track of the big picture

Steve Stewart-Williams

@SteveStuWill

2 days

How the world has changed over the last century. A compilation of some of our greatest accomplishments as a species. Credit: @toddrjones

1

0

15

Yaroslav Bulatov

@yaroslavvb

2 days

@ntenenz @YiTayML Or altruistic take - volunteer for the least desirable paper slot, but get more people interested in collaborating with you as a result

1

0

1

Yaroslav Bulatov

@yaroslavvb

4 days

@agarwl_ Is the video of your talk available?

1

0

4

Yaroslav Bulatov

@yaroslavvb

5 days

Are there recorded talks I can watch relevant to DeepSeek?

2

0

6

Yaroslav Bulatov

@yaroslavvb

5 days

From a talk by Chris Manning

anton

@atroyn

13 days

'we're in this bizarre world where the best way to learn about llms... is to read papers by chinese companies. i do not think this is a good state of the world' - us labs keeping their architectures and algorithms secret is ultimately hurting ai development in the us.

0

3

14

Yaroslav Bulatov

@yaroslavvb

10 days

@leloykun Btw regarding zeroing gradient values, this paper does it to something like 99% of values (there's a necessary trick to accumulate the zero offsets until this entry makes it into 1% and gets communicated)

1

18
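The accumulation trick mentioned in the reply above can be sketched roughly as follows. This is a minimal illustration of top-k gradient sparsification with a local residual, not the method of the paper the tweet refers to (which is not named here). The function name `sparsify_with_residual` and the `keep_fraction` parameter are hypothetical, chosen only for this example.

```python
def sparsify_with_residual(grad, residual, keep_fraction=0.01):
    """Return (sparse_update, new_residual) for one worker and one step.

    Hypothetical helper: entries that are not communicated are not
    discarded; they stay in `residual` and keep accumulating until
    they are large enough to be selected.
    """
    # Add the new gradient onto whatever was withheld in earlier steps.
    accumulated = [r + g for r, g in zip(residual, grad)]

    # Number of entries to communicate this step (at least one).
    k = max(1, int(len(accumulated) * keep_fraction))

    # Indices of the k largest-magnitude accumulated entries.
    top = sorted(
        range(len(accumulated)),
        key=lambda i: abs(accumulated[i]),
        reverse=True,
    )[:k]

    # Only these (index, value) pairs would be sent over the network.
    sparse_update = {i: accumulated[i] for i in top}

    # Sent entries are cleared; everything else carries over.
    new_residual = [
        0.0 if i in sparse_update else v for i, v in enumerate(accumulated)
    ]
    return sparse_update, new_residual


# Two steps with 4 entries and keep_fraction=0.25 (one entry sent per step).
residual = [0.0, 0.0, 0.0, 0.0]
sent_1, residual = sparsify_with_residual([0.5, 1.0, 0.0, 0.0], residual, 0.25)
sent_2, residual = sparsify_with_residual([0.5, 0.25, 0.0, 0.0], residual, 0.25)
```

In this toy run, entry 0 is withheld in the first step because entry 1 is larger, then gets communicated in the second step once its accumulated value overtakes the others.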

Yaroslav Bulatov

@yaroslavvb

10 days

@cloneofsimo @giffmana

0

Yaroslav Bulatov

@yaroslavvb

13 days

@PalmerLuckey Short-term market fluctuations can be weird... but long-term, US startups should benefit because they can incorporate DeepSeek improvements. Also, we are overdue for much cheaper training based on trends observed in

0

2

Yaroslav Bulatov

@yaroslavvb

18 days

@aerinykim @kchonyc Do you measure training data in "raw megabytes" or in terms of "acc improvement"? When I was on Google Books we estimated the total number of books to be 150M, but after scanning 20M, the benefit of scanning more books became marginal

1

0

1

Yaroslav Bulatov

@yaroslavvb

23 days

@dylan522p I'm also in Florida

1

0

6

Yaroslav Bulatov

@yaroslavvb

26 days

@YouJiacheng @kellerjordan0 @YouJiacheng thanks! What's the link to that page?

1

0

Yaroslav Bulatov

@yaroslavvb

26 days

@_arohan_ @AIatMeta Congrats! I'm happy that your training expertise can now go towards growing the world's AI knowledge rather than being locked behind walls. Also curious what you think about

0

5

Yaroslav Bulatov

@yaroslavvb

28 days

@ezyang Meta micro-kitchens still well-stocked with caffeine? I discovered Celsius during my time at Meta (by mistake)

1

0

1

Yaroslav Bulatov

@yaroslavvb

29 days

@StasBekman @drisspg No official fp8 method... I guess it's still not clear which API is the best? Curious how existing fp8 training libraries do it

1

0

2

Yaroslav Bulatov

@yaroslavvb

29 days

@real_shenghuiy Nice view, I was just there! (Pacific science museum this morning)

1

0

1

Yaroslav Bulatov

@yaroslavvb

1 month

@giffmana cc @JvNixon, anything happening in the AGI house?

0

1