Daniel Duan

@daniel_duan

Followers
3K
Following
6K
Statuses
28K

Tyranny is the deliberate removal of nuance. SwiftUI @ 

Joined April 2008
@daniel_duan
Daniel Duan
7 days
RT @tekknolagi: Write parsers. Not too many. Mostly recursive descent. -- Michael Scott-Pollan (As seen somewhere else but I can't find i…
0
1
0
@daniel_duan
Daniel Duan
7 days
Dunning-Kruger for RL
@ID_AA_Carmack
John Carmack
7 days
Offline reinforcement learning, where an agent tries to improve a behavior policy by observing another agent without actually playing, is a harder problem than it appears. The challenge isn’t to mimic the provided play, but to learn something better than what you have seen. The difference between online (traditional) RL and offline RL is that online RL is constantly "testing" its model by taking new actions as a result of changes to the model, while the offline training can bootstrap itself off into a coherent fantasy of great returns untested by reality. It may be just an artifact of value based RL in particular, but I am inclined to believe that it is a more fundamental truth about theoretical and observational science versus experimental science, and life in general.
0
0
0
@daniel_duan
Daniel Duan
9 days
RT @javilopen: "AI is just a copy of a copy of a copy..." Sure, but:
0
305
0
@daniel_duan
Daniel Duan
10 days
This is one of the most based breakdowns wrt DeepSeek R1 I’ve seen so far
0
0
0
@daniel_duan
Daniel Duan
10 days
Python stdlib including TOML support before YAML is just 🤌
0
0
0
@daniel_duan
Daniel Duan
10 days
RT @liuliu: See why Cerebras and Groq only support distilled version not the MoE version? If they still cannot put the MoE version out with…
0
1
0
@daniel_duan
Daniel Duan
12 days
RT @iScienceLuvr: Anyone who thinks DeepSeek just came out of nowhere should see this graph. For each model on this graph, weights, code,…
0
690
0
@daniel_duan
Daniel Duan
12 days
Once again, I'm involuntarily learning WAY TOO MUCH about a particular language feature.
0
1
1
@daniel_duan
Daniel Duan
13 days
This is the most entertaining version of the reality. That makes it most likely, right?
0
0
0
@daniel_duan
Daniel Duan
13 days
RT @PalmerLuckey: DeepSeek is legitimately impressive, but the level of hysteria is an indictment of so many. The $5M number is bogus. It…
0
4K
0
@daniel_duan
Daniel Duan
13 days
It’s always funny when ppl equate ppl doing a thing with “China did a thing”.
@yishan
Yishan
13 days
I think the Deepseek moment is not really the Sputnik moment, but more like the Google moment. If anyone was around in ~2004, you'll know what I mean, but more on that later. I think everyone is over-rotated on this because Deepseek came out of China. Let me try to un-rotate you.
Deepseek could have come out of some lab in the US Midwest. Like say some CS lab couldn't afford the latest nVidia chips and had to use older hardware, but they had a great algo and systems department, and they found a bunch of optimizations and trained a model for a few million dollars and lo, the model is roughly on par with o1. Look everyone, we found a new training method and we optimized a bunch of algorithms! Everyone is like OH WOW and starts trying the same thing. Great week for AI advancement! No need for US markets to lose a trillion in market cap.
The tech world (and apparently Wall Street) is massively over-rotated on this because it came out of CHINA. I get it. After everyone has been sensitized over the H1BLM uproar, we are conditioned to think of OMG Immigrants China as some kind of Alien Other. As though the Alien-Other Chinese Researchers are doing something special that's out of reach and now China The Empire is somehow uniquely in possession of Super Efficient AI Power and the US companies can't compete. The subtext of "A New Fearsome Power Now Under The Command of the CCP" is what's driving the current sentiment, and it's not really valid.
Like, no. These are guys basically working on the same problems we are in the US, and not only that, they wrote a paper about it and open-sourced their model! It is not actually some sort of tectonic geopolitical shift, it is just Some Nerds Over There saying "Hey we figured out some cool shit, here's how we did it, maybe you would like to check it out?"
Sputnik showed that the Soviets could do something the US couldn't ("a new fearsome power"). They didn't subsequently publish all the technical details and half the blueprints. They only showed that it could be done. With Deepseek, if I recall correctly, a lab in Berkeley read their paper and duplicated the claimed results on a small scale within a day.
That's why I say it's like the Google moment in 2004. Google filed its S-1 in 2004, and revealed to the world that they had built the largest supercomputer cluster by using distributed algorithms to network together commodity computers at the best performance-per-dollar point on the cost curve. This was in contrast to every other tech company, who at that time just bought what were essentially larger and larger mainframes, always at the most expensive leading edge of the cost curve.
(To the young people reading this, this will sound incredible to you) I worked at PayPal at the time, and in order to keep pace with the rising transaction volume, the company was forced to buy bigger and bigger database servers from Oracle. We were totally Oracle's bitch. At one point when we ran into scalability issues, the Oracle reps told us we were their biggest installation so they had no other reference point on how to help us overcome our scalability issues. We literally resorted to flipping random config switches and rebooting it.
(This heavily influenced me when I was a young manager later at Facebook. I deliberately torpedoed an Oracle salesman's pitch to try and get us to switch from open source MySQL databases to an Oracle contract: of course we had scalability problems, but at least when we had them, we could open up the hood and figure out how to fix it ... assuming we had good enough engineers, and we did. When it's closed-source infra, you're at the mercy of the vendor's support engineers)
Back to Google - in their S-1, they described how they were able to leapfrog the scalability limits of mainframes and had been (for years!) running a far more massive networked supercomputer comprised of thousands of commodity machines at the optimal performance-per-dollar price point - i.e. not the more expensive leading edge - all knit together by fault-tolerant distributed algorithms written in-house. Some time later, Google published their MapReduce and BigTable papers, describing the algorithms they'd used to manage and control this massively more cost-effective and powerful supercomputer.
Deepseek is MUCH more like the Google moment, because Google essentially described what it did and told everyone else how they could do it too. In Google's case, a fair bit of time elapsed between when they revealed to the world what they were doing and when they published the papers showing everyone how to do it. Deepseek, in contrast, published their paper alongside the model release.
Now, I've also written about how I think this is also a demonstration of Deepseek's trajectory, but that's also no different from Google in ~2004 revealing what it was capable of. Competitors will still need to gear up and DO the thing, but they've moved the field forward.
But it's not like Sputnik where the Soviets have developed technology unreachable to the US, it's more like Google saying, "Hey, we did this cool thing, here's how we did it."
There is no reason to think nVidia and OAI and Meta and Microsoft and Google et al are dead. Sure, Deepseek is a new and formidable upstart, but doesn't that happen every week in the world of AI? I am sure that Sam and Zuck, backed by the power of Satya, can figure something out. Everyone is going to duplicate this feat in a few months and everything just got cheaper. The only real consequence is that AI utopia/doom is now closer than ever.
==== Bonus: This is also a little similar to the Ethereum PoS moment, when AI finally has a counterpoint to the environmentalists who say AI uses so much electricity. We just brought down the cost of inference by 97%!
0
0
0
@daniel_duan
Daniel Duan
14 days
RT @bradtgmurray: Today Google open sourced PebbleOS and it makes me incredibly happy. That codebase and that team I still have so much pri…
0
415
0
@daniel_duan
Daniel Duan
14 days
Is this how AI is going to take our jerbs?
@davidtolnay
David Tolnay
14 days
Not long ago, I used to have a more optimistic impression of Rust users. I would not have guessed that so many otherwise-judicious people would go for blatantly AI-"maintained" Rust libraries. The `serde_yml` crate is a fork of a high-quality but unmaintained library. In the fork, the AI has taken initiative to add a big heap of stuff that is variously complete nonsense or unsound. On top of this, the crate's documentation has been broken in docs·rs for the last 5 months because AI hallucinated a nonexistent rustdoc flag into the crate's configuration. And yet 134 other published packages have chosen to adopt this? Including high-profile competently maintained projects like Jiff (for tests only), axodotdev, Wasmer, MiniJinja, and Holochain. This does not bode well. The bar for someone to do better at a YAML library is so low.
0
0
2
@daniel_duan
Daniel Duan
14 days
RT @davidtolnay: Not long ago, I used to have a more optimistic impression of Rust users. I would not have guessed that so many otherwise-j…
0
131
0
@daniel_duan
Daniel Duan
14 days
A killer trust and safety team, one might call it.
@daniel_duan
Daniel Duan
14 days
The “innovation” in DeepSeek R1 no one seems to talk about: its team figured out how to censor it **in the weights**.
0
0
0
@daniel_duan
Daniel Duan
14 days
The “innovation” in DeepSeek R1 no one seems to talk about: its team figured out how to censor it **in the weights**.
0
0
1
@daniel_duan
Daniel Duan
14 days
RT @headinthebox: I am watching the DeepSeek R1 circus with much amusement. It is not even funny how obvious it is that smarter software…
0
55
0
@daniel_duan
Daniel Duan
14 days
This was a fun game.
0
0
1