Pavel Surmenok Profile
Pavel Surmenok

@surmenok

Followers
2K
Following
75K
Media
91
Statuses
6K

Autopilot / AI at Tesla

Redwood City, CA
Joined July 2009
@surmenok
Pavel Surmenok
2 months
FSD V13: point-to-point self-driving without touching steering wheel or pedals. A large deep neural network trained with a large dataset end-to-end: photons in, controls out.
@AIDRIVR
ΛI DRIVR
2 months
FSD 13 leaves parking lot (+ awkward interaction with other driver). the smoothness is absolutely INSANE. it also saw the Model 3 backing up before I did, I was wondering why it wasn’t moving lol
5
13
275
@surmenok
Pavel Surmenok
2 months
@realGeorgeHotz Why would I buy KIA?
7
1
200
@surmenok
Pavel Surmenok
10 months
@Austen One thing to check: does it run Windows or Linux.
8
0
167
@surmenok
Pavel Surmenok
1 year
Once upon a time, I interviewed a seasoned ML engineer, asked him “what do you think about batch norm”. He looked at me with eyes full of painful memories and laughed. Then I knew that he is an expert.
7
2
152
@surmenok
Pavel Surmenok
10 months
@jeremyphoward Autoregressive vs. diffusion makes more sense.
3
0
150
@surmenok
Pavel Surmenok
8 months
@dividendology One is sum of all savings, another is growth rate. 2nd image as sum of savings would look more like this. Still noisy, but not as much.
Tweet media one
2
5
143
@surmenok
Pavel Surmenok
1 month
@_apoorvnandan That’s literally step 1 of the @karpathy playbook.
7
0
144
@surmenok
Pavel Surmenok
10 months
@SergioRocks The question is false. AI is a tool. Should judge impact and quality of work.
1
0
133
@surmenok
Pavel Surmenok
5 months
@tamaybes Can you please retrain the model to make sure there are no issues with upload.
0
0
116
@surmenok
Pavel Surmenok
4 months
@saurabh_shah2 The story is wild. @JeffDean just wanted to save bandwidth and chopped off the lowest 16 bits of fp32.
@JeffDean
Jeff Dean
1 year
@keveman @giffmana This is roughly right. Basically wanted to send fewer bytes over the network for our distributed neural network training system, and easiest way on a CPU was to lop off the low 16 bits of mantissa, and fill with 0s on other side. Turns out it was fine for training.
1
2
106
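The truncation described in the quoted tweet can be sketched with the standard library alone. This is an illustrative sketch, not the code of the original training system: the helper names are hypothetical, and it assumes plain truncation with no rounding. Keeping the upper 16 bits of a float32 leaves 1 sign bit, 8 exponent bits, and 7 mantissa bits, which matches the bfloat16 layout.

```python
import struct


def fp32_to_bf16_bits(x: float) -> int:
    """Return the upper 16 bits of the float32 encoding of x (truncation, no rounding)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 16


def bf16_bits_to_fp32(b: int) -> float:
    """Rebuild a float32 by filling the dropped low 16 bits with zeros."""
    (x,) = struct.unpack(">f", struct.pack(">I", (b & 0xFFFF) << 16))
    return x


def roundtrip(x: float) -> float:
    """Simulate sending x over the network as 16 bits and restoring it on the other side."""
    return bf16_bits_to_fp32(fp32_to_bf16_bits(x))
```

Values whose mantissa fits in 7 bits, such as 1.0 or -2.5, survive the round trip unchanged; other values lose some precision but keep the full float32 exponent range.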
@surmenok
Pavel Surmenok
6 months
@jxmnop How about converting it to a (N, 2) numpy array and storing as npz (compressed)?
1
0
95
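A minimal sketch of that suggestion, assuming the data is a list of integer pairs. The sample pairs and the `int32` dtype are made up for illustration, and an in-memory buffer stands in for a file on disk.

```python
import io

import numpy as np

# Hypothetical sample data: a list of integer pairs.
pairs = [(1, 2), (3, 4), (5, 6)]

# Convert to an (N, 2) array; int32 is an assumption about the value range.
arr = np.asarray(pairs, dtype=np.int32)

# Write a compressed .npz archive (to an in-memory buffer here; a file path also works).
buf = io.BytesIO()
np.savez_compressed(buf, pairs=arr)

# Read it back by the keyword name used when saving.
buf.seek(0)
restored = np.load(buf)["pairs"]
```

Choosing the smallest integer dtype that fits the values reduces the size further before compression is applied.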
@surmenok
Pavel Surmenok
6 months
@cremieuxrecueil What surprised me: even men from Denmark have quite high proportion of 18%.
10
0
89
@surmenok
Pavel Surmenok
7 months
@seldo Impact on me (in the valley): I wanted to order a drink to pick up in Starbucks on the way to work, the app showed a message that early order is not available. No other issues so far.
2
0
85
@surmenok
Pavel Surmenok
6 months
@theorizur Have you tried to work at startups, or at orgs that move fast (e.g. pretty much any of Elon’s companies)?
7
0
80
@surmenok
Pavel Surmenok
1 year
@ID_AA_Carmack Not unlike Windows which had all kinds of patches for bugs in 3rd party apps. “On beta versions of Windows 95, SimCity wasn’t working in testing. Microsoft tracked down the bug and added specific code to Windows 95 that looks for SimCity. If it finds SimCity running, it runs the.
3
7
82
@surmenok
Pavel Surmenok
8 months
@alfred_twu Isn’t it odd not counting San Diego as west?
6
0
73
@surmenok
Pavel Surmenok
5 months
Tweet media one
0
0
73
@surmenok
Pavel Surmenok
2 months
A large model trained on enough data learns sophisticated behaviors. The network just wants to learn, give it more compute and data.
@Tesla
Tesla
2 months
FSD Supervised 13.2 reverses to exit parking spot blocked by delivery truck, then waits for oncoming traffic to clear before proceeding. This all happens implicitly within the model, which is trained on extensive data of similar real-world scenarios.
1
4
70
@surmenok
Pavel Surmenok
1 year
@mualphaxi @Stanford It might be worth finding the authors of the posters and making a clear, permanent, searchable record of their actions.
4
0
66
@surmenok
Pavel Surmenok
2 months
@anammostarac “I believe the most qualified person should get the job”.
1
0
61
@surmenok
Pavel Surmenok
3 months
@nearcyan You can go to any DMV office in California, doesn’t have to be SF. On the website, they used to show waiting time for people without appointment. I was able to find an office without a large queue and went there.
2
0
61
@surmenok
Pavel Surmenok
5 months
Almost every startup at YC Demo Day is building with LLMs. A huge change even compared with the previous Demo Day.
6
6
57
@surmenok
Pavel Surmenok
1 month
@finbarrtimbers Kaplan et al found that (for pretraining) the learning rate schedule is irrelevant as long as LR summed up over all training steps is large enough, includes a warmup period and decay to near-vanishing value at the end.
Tweet media one
2
4
56
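One schedule that satisfies the conditions named in the tweet above (a warmup period, then decay to a near-vanishing value) is linear warmup followed by cosine decay. The sketch below is an illustration only: the function name and the default values are assumptions, not settings taken from the paper.

```python
import math


def lr_at(step, total_steps, peak_lr=3e-4, warmup_steps=100, final_frac=0.0):
    """Learning rate at a given step: linear warmup, then cosine decay.

    All default values are hypothetical and chosen only for illustration.
    """
    if step < warmup_steps:
        # Linear warmup from peak_lr / warmup_steps up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return peak_lr * (final_frac + (1.0 - final_frac) * cosine)


# Sum of the learning rate over all steps, the quantity the tweet refers to.
total_lr = sum(lr_at(s, 1000) for s in range(1000))
```

Setting `final_frac` slightly above zero keeps the final learning rate small but non-zero.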
@surmenok
Pavel Surmenok
2 years
@stylewarning I’d start from writing tests. Then see if it’s well modularized or it’s a ball of spaghetti, attempt to refactor in the latter case.
1
0
49
@surmenok
Pavel Surmenok
11 months
It’s Monday. Time to build.
Tweet media one
4
1
52
@surmenok
Pavel Surmenok
6 months
@jxmnop Must be small integers if it takes less than 10 bytes to encode a pair in text.
3
0
49
@surmenok
Pavel Surmenok
1 month
@_apoorvnandan @karpathy Ok, actually step 1 is “become one with the data”. But verifying that the loss of a randomly initialized network is correct is the first thing to do before starting training.
1
1
50
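A minimal sketch of that check, assuming a softmax classifier whose freshly initialized output layer gives near-uniform predictions. In that case the cross-entropy loss should be close to ln(num_classes). The helper name and the 10-class setup are hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of softmax(logits) against a single integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


num_classes = 10  # hypothetical number of classes

# A freshly initialized network is modeled here as emitting identical logits.
uniform_logits = [0.0] * num_classes
loss = cross_entropy(uniform_logits, target=3)

# Reference value to compare the first training-step loss against.
expected = math.log(num_classes)
```

If the measured loss at step 0 is far from the reference value, the initialization, the labels, or the loss computation deserves a look before any training compute is spent.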
@surmenok
Pavel Surmenok
3 months
@pmarca I came to work today, the parking lot is packed, engineers are working, no foosball table in sight. Occasional humanoid robots here and there.
2
2
48
@surmenok
Pavel Surmenok
1 year
In the Arena today. Trying stuff. Some will work, some won’t. Always learning.
Tweet media one
5
0
46
@surmenok
Pavel Surmenok
11 days
One step closer to large-scale unsupervised FSD.
@Tesla_AI
Tesla AI
11 days
Teslas now drive themselves from their birthplace at the factory to their designated loading dock lanes without human intervention. One step closer to large-scale unsupervised FSD
1
0
42
@surmenok
Pavel Surmenok
1 year
Now the general public will learn about the mighty Q-learning algorithm.
@ericjang11
Eric Jang
1 year
reading in between the lines, is Q* the fabled breakthrough in AlphaStar-style search + LLM that so many big labs are trying to get working? Many research projects in GPT-4 self-verification + search have not yielded really strong performance improvements, so I'd be quite.
0
0
40
@surmenok
Pavel Surmenok
9 months
@Noahpinion Implication is that logical thinking is a right-wing thing.
3
0
37
@surmenok
Pavel Surmenok
13 days
@Austen Wait what? Their software engineers make less money than my nanny?
0
0
40
@surmenok
Pavel Surmenok
9 months
@juliepoptart Share officer name and badge number, public should know.
0
0
34
@surmenok
Pavel Surmenok
3 months
Saturday morning. Good time to check how my training jobs are doing. GPUs go brrrr.
4
0
39
@surmenok
Pavel Surmenok
1 year
@pronounced_kyle Log scale for y axis might help to see trends better.
2
0
37
@surmenok
Pavel Surmenok
3 months
@emollick Interesting. I don’t see it, other than less of Yann LeCun’s toxic posts lately.
3
0
37
@surmenok
Pavel Surmenok
19 days
@jxmnop Episode with Elon was far from the 3rd. Here is a full episode list starting from the first with Max Tegmark. Episode with Elon is #18. Still impressive though.
0
0
35
@surmenok
Pavel Surmenok
2 years
@Tendar Maybe that's how he is transmitting a coded message, in Morse code?
0
0
32
@surmenok
Pavel Surmenok
9 months
@jmrphy Yes. It didn’t happen 2 years ago, now happens all the time. Big regression, sadly.
2
0
33
@surmenok
Pavel Surmenok
5 months
@hyhieu226 @PyTorch @MalekiSaeed Links for AsyncTP, for those who want to dig deeper.
0
4
36
@surmenok
Pavel Surmenok
10 months
That’s a lot!
3
1
34
@surmenok
Pavel Surmenok
5 months
@GarrisonLovely Reading the article, it looks more like Honduras became a nightmare for Honduras and wants to ruin it by walking back the deal they previously agreed on. They deserve to be bankrupted if that’s the case.
2
0
32
@surmenok
Pavel Surmenok
3 years
@Carnage4Life @BeanstalkFarms So it’s not stolen then, the protocol worked as designed. Fascinating.
0
0
29
@surmenok
Pavel Surmenok
4 months
@EugeneVinitsky Torrent.
0
0
29
@surmenok
Pavel Surmenok
1 month
Coffee from @perplexity_ai. Smart juice for a curious mind.
Tweet media one
3
0
29
@surmenok
Pavel Surmenok
4 months
@hankgreen Hard to believe it was 100%.
7
0
27
@surmenok
Pavel Surmenok
3 months
@igorsushko What’s your beef with Joe Rogan?
17
0
29
@surmenok
Pavel Surmenok
2 years
@debarghya_das Maybe it was easier to immigrate to the US back then?
2
0
27
@surmenok
Pavel Surmenok
6 months
@nathanbenaich He refers to Noam Shazeer’s LinkedIn profile. Legend.
Tweet media one
1
1
29
@surmenok
Pavel Surmenok
8 months
@GergelyOrosz @t3dotgg @ThePrimeagen I don’t joke about bus factor. I’m very serious about bus factor.
0
0
28
@surmenok
Pavel Surmenok
11 months
@Tsla99T That’s the first thing I checked this morning! Keeping GPUs busy.
2
0
29
@surmenok
Pavel Surmenok
24 days
@VicVijayakumar 95% of the people on the call laughed.
1
0
28
@surmenok
Pavel Surmenok
2 months
@PaulSkallas Not sure why emphasizing “able bodied”. Delivery saves a ton of time.
0
0
28
@surmenok
Pavel Surmenok
3 months
@Oilfield_Rando When I was 7yo, a gypsy stole my bicycle. I still remember it.
0
0
27
@surmenok
Pavel Surmenok
12 days
Tweet media one
0
1
28
@surmenok
Pavel Surmenok
24 days
@VicVijayakumar That’s great, they still remember your name.
0
0
28
@surmenok
Pavel Surmenok
1 month
@Crypto_uWu @growing_daniel H1-B is a temporary worker visa, issued for 3 years, can be renewed for 3 more years. After 6 years they have to get out of the country (unless apply for a green card or some other visa type). They can’t bring family except a spouse and kids under 21yo.
2
1
27
@surmenok
Pavel Surmenok
1 year
@Austen Torrent is the ultimate weapon of a free man.
1
0
26
@surmenok
Pavel Surmenok
4 months
@OfficialLoganK @Wharton @emollick Congrats, Logan!
0
0
26
@surmenok
Pavel Surmenok
9 months
@emollick Link to paper
0
3
24
@surmenok
Pavel Surmenok
2 months
@garrytan It will get even better!
1
1
24
@surmenok
Pavel Surmenok
4 months
@yishan The best time to start was 8 years ago. The next best time to start is now.
0
0
24
@surmenok
Pavel Surmenok
2 months
@_jasonwei Will they publish the video?
0
0
23
@surmenok
Pavel Surmenok
12 days
I had a TODO to buy more NVDA. Seeing -16% this morning felt like a Christmas gift.
2
0
22
@surmenok
Pavel Surmenok
1 year
@patio11 I’ve heard exactly the same from a barber around Thanksgiving. He also said that if he goes on vacation, his regular customers will find another barber and his business will suffer long term.
0
1
21
@surmenok
Pavel Surmenok
2 months
@swyx @JeffDean @latentspacepod Could you please share a link to the video.
0
0
23
@surmenok
Pavel Surmenok
1 year
OpenAI board members cleaned up their social media profiles: Tasha McCauley closed off Twitter, Helen Toner and Adam D'Angelo don't mention OpenAI on LinkedIn.
1
3
23
@surmenok
Pavel Surmenok
7 months
@legen_eth Someone should start an ETF following her trades.
6
0
21
@surmenok
Pavel Surmenok
8 months
@naderi_yeganeh Did you come up with these equations manually?
3
0
21
@surmenok
Pavel Surmenok
9 months
@peterrhague Honestly I thought it’s your real photo, AI augmented. That’s odd that some people are mad about EVs. EVs are great.
8
0
22
@surmenok
Pavel Surmenok
1 month
@growing_daniel @Crypto_uWu They can’t. They cannot even immigrate in that visa, it’s a non-immigrant visa by definition.
@surmenok
Pavel Surmenok
1 month
@Crypto_uWu @growing_daniel H1-B is a temporary worker visa, issued for 3 years, can be renewed for 3 more years. After 6 years they have to get out of the country (unless apply for a green card or some other visa type). They can’t bring family except a spouse and kids under 21yo.
7
0
20
@surmenok
Pavel Surmenok
4 months
@LChoshen @DeqingFu @robinomial @jacobandreas Link to the paper on Arxiv:
1
3
21
@surmenok
Pavel Surmenok
2 months
@dkrajendra Rare elements are not rare, that’s a misnomer. They are everywhere in the Earth’s crust. Processing these metals is not environmentally friendly (much pollution), so we outsource it whenever possible.
0
0
21
@surmenok
Pavel Surmenok
1 year
Next gen Tesla Bot. The future is already here! Great job @_milankovac_ and the team!
@Tesla_Optimus
Tesla Optimus
1 year
There’s a new bot in town 🤖. Check this out (until the very end)!
0
0
21
@surmenok
Pavel Surmenok
7 years
1080Ti is still economically better than Titan V if you run CNNs.
1
7
18
@surmenok
Pavel Surmenok
1 year
@karpathy Problem with comments is that they get out of sync with code. Best code is self-documented. Comments should not explain what the code is doing, but may explain why, e.g. reasons for unconventional usage of something something as workaround for a bug somewhere.
3
0
18
@surmenok
Pavel Surmenok
1 month
@giffmana Surprised that Mark lost so much ground. Probably because no big releases in the last few months. Recency bias.
2
0
18
@surmenok
Pavel Surmenok
1 month
@twobitidiot @theallinpod @rabois Try @BG2Pod, people on the street say that it has vibes of early All-in pod. I enjoy it, information dense, no bullshit.
0
0
19
@surmenok
Pavel Surmenok
1 year
- What is Occam's razor?
- Well, the simplest explanation is that there is a guy named Occam and it is his razor.
3
0
19
@surmenok
Pavel Surmenok
6 months
One man’s prior is another man’s posterior.
1
1
18
@surmenok
Pavel Surmenok
2 years
@finbarrtimbers GPU utilization is a bad metric in practice. GPU utilization can be 100% while the GPU does nothing but wait for e.g. NCCL communication from other ranks. GPU power consumption is more informative.
3
0
17
@surmenok
Pavel Surmenok
2 months
@abacaj Locality of compute and openness of the model are orthogonal concepts. One can run Llama on a cluster.
0
0
19
@surmenok
Pavel Surmenok
1 year
I wish Google would publish a thorough postmortem to explain what went wrong with aligning their chatbot and how they are going to fix it. Curious how much of it is explicit instructions in the system prompt vs. RLHF.
3
0
18
@surmenok
Pavel Surmenok
6 months
@KareemRifai Like elections in Russia in 2011 when pro-Putin party won, and votes in one region (as displayed on TV) summed up to 146%.
1
0
18
@surmenok
Pavel Surmenok
3 months
@thegautamkamath Is this an ICLR issue or one rogue reviewer?
3
0
19
@surmenok
Pavel Surmenok
2 years
@RazRazcle Link to the paper:
1
2
19
@surmenok
Pavel Surmenok
1 year
Ever tried. Ever failed. No matter. Try Again. Fail again. Fail better.
2
0
18
@surmenok
Pavel Surmenok
2 months
Eval of LLM systems is conceptually similar to eval of other machine learning models. Look at predictions (ideally on distribution of inputs from your real users), identify patterns of errors, cluster/categorize errors, develop evals for each error cluster.
@HamelHusain
Hamel Husain
2 months
I started doing office hours on LLM evals and met with 8+ founders in the last 3 weeks. Common questions:
- Which components of our app do we start evaluating (RAG, tool calls, etc)?
- What metrics should I use?
- Where should I spend my time?
All have the same solution.
0
1
18
@surmenok
Pavel Surmenok
2 months
Thank you @chazman. V13 is 🔥.
@chazman
Chuck Cook
2 months
I don't do posts like this very often. just read it please. Since I have been home after my redeye flying all night from PHX. My @Cybertruck and Model Y had received Supervised FSD v13.2.1 while parked, over the air cellular (OTA) for free. I got in my Cybertruck dead tired.
1
1
18
@surmenok
Pavel Surmenok
6 months
@_xjdr Link to the paper:
3
1
17
@surmenok
Pavel Surmenok
10 months
@shanselman @markrussinovich Never look at desktop, always maximize windows.
1
0
16
@surmenok
Pavel Surmenok
8 months
A story about a black SFFD firefighter assaulting his Asian colleague. The department tried to cover it up, the victim was fired, the assaulter kept his job. So much dysfunction in SF public services.
@RealDianeYap
Diane Yap
8 months
Black privilege in SF: Black firefighter looks up Asian coworker’s address, shows up at his house and tries to beat him to death with a wrench. Asian firefighter gets fired for cooperating with police. Black firefighter keeps his job, never missing a paycheck.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
0
16
@surmenok
Pavel Surmenok
10 months
@nikitabier I’ve owned a house for less than two years, and it’s relatively new and recently renovated, but I already have phone numbers for good repairmen for all kinds of things.
0
0
15
@surmenok
Pavel Surmenok
12 days
More goodies from DeepSeek. Janus-Pro is a novel autoregressive framework that unifies multimodal understanding and generation.
@_akhaliq
AK
12 days
deepseek just dropped some new models. people are still getting used to R1. Janus-Pro is a novel autoregressive framework that unifies multimodal understanding and generation. It addresses the limitations of previous approaches by decoupling visual encoding into separate
Tweet media one
3
4
17
@surmenok
Pavel Surmenok
2 months
@YunTaTsai1 Trump’s willingness to go to the long form podcasts is respectable. You can feel what kind of person he is, what points are important for him. Listen for a couple hours and you can make a better informed decision whether to hire him.
1
0
17
@surmenok
Pavel Surmenok
3 months
@sirbayes Interesting. I never heard of the other meaning of inference. Prediction seems a bit off. Prediction is about the future. For example, you can predict where a pedestrian will be 1 second from now. But detecting where they are now is not prediction. I hesitate to use the word.
2
0
16
@surmenok
Pavel Surmenok
7 months
@srush_nlp We should normalize pseudonyms and links to arbitrary webpages. Democratizing science.
0
0
16
@surmenok
Pavel Surmenok
11 months
Sometimes the model answer is wrong. Sometimes the model answer is correct but we just don’t like the result.
2
0
16
@surmenok
Pavel Surmenok
1 year
@pronounced_kyle Show me training loss going to 0.
2
0
15