![dylan🌲 (a/gpu) Profile](https://pbs.twimg.com/profile_images/1554504504283869184/BQm52Bu0_x96.jpg)
dylan🌲 (a/gpu)
@dylanmrose
Followers
3K
Following
17K
Statuses
8K
Building the next big thing in computers @evergreenminer @evgcompute
United States
Joined September 2016
RT @strawpari: life update: i work here now ¨̮ i'm super stoked to launch our new brand today with the sickest domain name i've ever dev…
0
6
0
RT @SlowestTimelord: In case anyone was confused. The rebrand was from Berkeley Compute to Silicon.
0
11
0
RT @mouthy: Excited to share glimpse of our new branding as Silicon and upcoming website refresh ahead of our first GPU NFT sale at https:/…
0
11
0
I got dylan@silicon.net now and it's so baller. We've been building so much HAHAHA
We're excited to announce our rebrand to Silicon. This marks a major milestone in our mission to transform GPU compute into a new financial asset class.
0
0
23
RT @silicondotnet: We're excited to announce our rebrand to Silicon. This marks a major milestone in our mission to transform GPU compute i…
0
21
0
This is the super bowl of taking meth and cleaning your entire house
It actually makes perfect sense to have a team of cracked zoomers on DOGE. You want people who look at a system and say “this sucks, let’s start from scratch” - not “how can I work in this bureaucracy to further my career?” Can you imagine if it were Google employees instead?
1
0
5
RT @kylejohnmorris: I've joined Compute Exchange as Head of Growth to make GPUs easier to buy and sell! I want to make AI actually-Open,…
0
2
0
There's a ton of hysteria rn, largely overblown. Great progress but should understand the full context before getting spun out
There is a ton of misinformation about the DeepSeek R1 model. Tl;dr: it really is as impressive and powerful as the hype suggests, but:

a.) You basically NEED to use chain of thought to use it properly. It's generally the norm that you can use a suboptimal technique with a model and still get pretty good results; with this one you need to play by its rules.

b.) There's a lot of chatter about how "you can run R1 with print(hilariously_low_amount_of_VRAM)," but what people are referring to is running one of the distills. This isn't helped by packages like ollama calling it "deepseek r1 7b" - then you drill down into the docs and see it's Qwen 7B with R1's reasoning embedded. You aren't running R1 itself without 300+ GB of VRAM at the lowest available quants. Distills are a downstream benefit, but running a distill isn't running R1.

c.) I don't think anyone is going to beat DeepSeek's API price - not us, not our competitors. It's just not possible. It's orders of magnitude below what you'd pay on OpenRouter on a cost-per-token-per-model-parameter basis. You can get 1M tokens on OpenRouter for a 7B model for $0.10, or you can get 1M tokens on DeepSeek's API for this 671B model for $0.14. I think this API is just being run as a huge loss leader to pump hype about the model, but regardless, it exists and it's very hard to compete with. It's more feasible for us to show that you can run the distills on RunPod (which are perfect to run in serverless) or perhaps a lower-bit quant of R1 in a pod.

Long story short: incredible model, deserving of the hype, but it also has the most complicated meta-environment of any open-source LLM we've seen to date. Brendan from our team wrote a great piece about this - posting link below
0
1
3
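The pricing and VRAM claims in the thread above can be checked with some back-of-the-envelope arithmetic. This is a rough sketch, not an official benchmark: the prices and parameter counts come from the tweet itself, the "per-million-tokens-per-billion-parameters" normalization is just one convenient metric, and the 2.7 bits-per-weight figure is an assumed low-end quantization, ignoring KV cache and activation overhead.

```python
# Back-of-the-envelope math behind the tweet's pricing and VRAM claims.
# All inputs are from the tweet; bits_per_weight is an assumption.

def cost_per_mtok_per_bparam(price_per_mtok: float, params_b: float) -> float:
    """Normalize an API price by model size: USD per 1M tokens per 1B params."""
    return price_per_mtok / params_b

openrouter_7b = cost_per_mtok_per_bparam(0.10, 7)     # ~0.0143
deepseek_671b = cost_per_mtok_per_bparam(0.14, 671)   # ~0.000209

ratio = openrouter_7b / deepseek_671b                 # ~68x cheaper per-param

# Rough VRAM floor for the full 671B-param R1 at an assumed low quant
# (~2.7 bits/weight), counting weights only:
bits_per_weight = 2.7
vram_gb = 671e9 * bits_per_weight / 8 / 1e9           # ~226 GB of weights alone

print(f"{ratio:.0f}x cheaper per-param, {vram_gb:.0f} GB weights minimum")
```

Even before runtime overhead, the weights alone land above 200 GB at a low quant, which is consistent with the thread's "300+ GB of VRAM" claim once KV cache and overhead are added, and the per-parameter price gap works out to nearly two orders of magnitude.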
RT @runpod_io: There is a ton of misinformation about DeepSeek r1 model. Tl;dr it really is as impressive and powerful as the hype suggests…
0
12
0