![里昂叉 | Leon X 🐡 Profile](https://pbs.twimg.com/profile_images/1862237753108500480/w8MmRg3-_x96.jpg)
里昂叉 | Leon X 🐡 (@Wiggin_Han)
Followers: 940 · Following: 1K · Statuses: 2K
Investor @J17Crypto | Founding Member @DewhalesCapital. DM me if you are building web3.
Joined October 2013
全文翻译 by Gemini 2.0 Flash Thinking [full Chinese translation of the "Thoughts on the eve of AGI" post omitted here; the original English text follows below]
Thoughts on the eve of AGI

I talked to several friends about o3 this week. Their summarized response is basically "holy crap is this actually happening?"

Yes, this is actually happening. The next few years are going to be insane. This is historic stuff, galactic even.

What's ridiculous is that there's no sophisticated discussion about what's happening. AI labs can't talk about it. The news barely touches it. The government doesn't understand it.

The fact that a social media meme app newsfeed is how we discuss the future of humanity feels like some absurdist sitcom, but here we are. Below is a bunch of my thoughts about what's happening -- my contribution to the X idea abyss.

Note, these thoughts are HALF-BAKED and FUN SPECULATION. I haven't had enough time to think through / research all of them and I'll be wrong about many. But I do hope these are interesting to some people out there who are trying to process what's happening. Enjoy.

------------------------------------------------------

- o3 should not have been shocking. OpenAI showed us the test-time scaling graph 2 months ago, and the history of computers teaches us to believe the trendlines no matter how unbelievable. What should be shocking is that it happened in 2 months. That's how quickly we went from college-level AI to PhD-level AI. For humans, change is exciting, but rapid change is shocking.

- It's pretty obvious what's going to happen next. The o3 class models are reeeaally good at optimizing for anything you can define a reward function for. Math and coding are pretty easy to design a reward function for (a minimal sketch of what that looks like is a few bullets down). Fiction writing is harder. So that means in the short term (1 year), we're going to get spiky models. They're going to be basically AGI-level at math and coding and general reasoning but write generic fiction. While better reasoning will make the models feel smarter across the board, they're still gonna fail in stupid ways they weren't RL'd for -- i.e., not in their training data. Over the longer term (1-3 years), we'll keep adding new domains to RL them with (emotion data, sensory data, etc.) until the blind spots are patched up, and then these models will clearly be AGI to anyone who's not Gary Marcus.

- Agents really are coming in 2025. There's no way o3-like models won't be able to navigate the browser/apps and take actions. That stuff is easy to design reward models for. It's also a huge market -- automating computer work -- so there are big incentives for the labs that need to justify their big spend. I'd guess by December 2025 you'll be able to tell your computer to do any sort of workflow that involves navigating webpages/apps and moving data around.

- Of all the intellectuals who are most "cooked", it's gotta be the mathematicians. Mathematicians work in symbolic space. Their work has little contact with the physical world and therefore is not bottlenecked by it. LLMs are the kings of symbolic space. Math isn't actually hard, primates are just bad at it. Same with regex. A big question is how hard it will be to make research-caliber synthetic data. I'd guess not that hard. PhD-level math and researcher-level math look qualitatively different to us, but might look the same in kind to an AI, just requiring a couple more orders of magnitude of RL. I give mathematicians 700 days. (That sounds crazy, but o6 not beating mathematicians sounds equally crazy, so I'm more than 50/50 on this prediction, like all the other predictions in this post).
That's 700 days until humans are no longer the top dogs at math in the known universe.

- What about us software engineers? In the short term it's going to be heaven. Every SWE just got a promotion to tech lead, nicely done. For those who fully adopt LLMs, coding by end of 2025 will feel more like orchestrating a bunch of small jobs that little agents go and perform. Any PR that has a very clear specification should be doable by an o4 system with an error rate that's small enough to be acceptable. One problem here could be context windows too small to contain a codebase, but leaders like Sam are well aware of this.

Will AI automate all software engineers away soon? No. Software engineering is more than making PRs based on hyper-clear prompts. Unlike mathematicians, software engineers constantly interface with the physical world, namely other humans. Engineers have to work with customers to understand their needs and with teammates to understand their needs. When engineers are designing an architecture or writing the code, they're doing it with a ton of organizational context. o4 won't be able to do that. But o4 will help the engineers who do have the context move 10x faster.

If software engineers are 10x faster, then maybe we need fewer? Well, if you take a specific company then yes, they might need fewer software engineers bc they can achieve the same output with a leaner team. However, the whole world's need for software engineers might go up bc the world can def use 10x more quality software. So I think we'll see a golden age of applications from leaner companies. Personalized microapps for every person and business.

- In the longer term (>2 years is considered long term lol), software engineering will be completely different, hard to say how. How could it not, when o6 systems exist and are fully integrated into our applications? Roles like frontend engineer might not exist in 3 years.

Is that weird? Not really -- the frontend engineer role didn't exist 30 years ago either.

We should take a step back and recognize that software turns itself on its head every generation. Software always has been, and always will be, about converting needs into pure logic. That conversion process has risen in abstraction levels from binary to Python. The difference now is that it's rising to English.

Moving to English opens up coding to the non-technical. But the best builders will still always be the ones who can move up and down the abstraction levels.

In short, bc software engineering is really all about understanding and fixing organizations' needs through code, the day software engineering is fully automated is the day all organizations are.

- We've talked about some knowledge workers, but what about the physical workers? AI is coming for you too, but slower bc it has to deal with gravity and friction. But the o-class of models will not help robotics as much, bc a model that takes an hour doesn't help a robot on a factory line. The base model getting smarter does help, and o-class models will help train those, but I don't think that fixes the biggest bottleneck to robotics progress. I'd guess the biggest bottlenecks are hardware improvements and fast/reliable models for perception+action. Those will both take longer to improve (ie several more years). Crazy fast progress in robotics will only happen once robots start building robots and AI starts doing AI research. That could come from o-class models, but I think that's a couple years away.
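Circling back to the reward-function point above (math and coding are easy to design a reward for, fiction isn't): here is a minimal sketch of what a verifiable reward could look like. The function names, the task format, and the exact-match / unit-test checks are my own illustrative assumptions, not any lab's actual implementation.

```python
# Toy illustration of verifiable rewards: math and code have a programmatic check,
# fiction does not -- which is exactly why the latter is harder to RL on.
import subprocess
import tempfile


def math_reward(model_answer: str, ground_truth: str) -> float:
    """Reward 1.0 if the model's final answer matches the known solution, else 0.0."""
    return 1.0 if model_answer.strip() == ground_truth.strip() else 0.0


def code_reward(model_code: str, test_code: str) -> float:
    """Reward 1.0 if the generated code passes the provided unit tests, else 0.0."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(model_code + "\n\n" + test_code)
        path = f.name
    try:
        result = subprocess.run(["python", path], capture_output=True, timeout=30)
        return 1.0 if result.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        return 0.0


def fiction_reward(model_story: str) -> float:
    """There is no programmatic check for 'is this a good story'; you'd need a
    learned reward model or human judgment, which is the hard part."""
    raise NotImplementedError


# An RL loop would sample many attempts per problem and reinforce the ones that
# score 1.0, e.g. reward = math_reward(sample, "42").
```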
- I keep talking in units of years, but maybe we should really talk in units of compute. Time determines human output but compute determines AI output, and AI output will increasingly be the most important at research orgs. That's why the rat race is on to build superclusters -- Meta's 2GW cluster, xAI's additional 100k H100s, etc.

All the labs will quickly follow OpenAI with test-time compute models, and some can make up for worse algorithms initially with more compute. They'll play catch-up just like they did with GPT-4. To make these models there's a mix of common knowledge and each lab's secret sauce. Unclear how much secret sauce OpenAI has with the o-class models, but their rate of improvement suggests it's an algorithmic advance (which is easier to replicate) and not some unique mix of data (harder to replicate).

In the age of test-time compute, it's not clear to me whether having more compute or better models is more important. On the one hand, you can make up for a worse model by throwing more test-time compute at it. On the other hand, a slightly better model might save an exponential amount of compute.

It would be kind of funny if xAI catches up to OpenAI because they're simply better at spinning up massive clusters.

Regardless, there's not going to be a model moat that lasts longer than a year, bc labs swap researchers like baseball cards, and, perhaps more importantly, the researchers between labs party and sleep with each other. Plus I think researchers are too idealistic to not share information if things got out of hand.

Kind of a crazy situation we have here. The AI race is like the nuclear race, but where the Americans and Soviets party together in Los Alamos on weekends and bait each other on Twitter with "bet you're not gonna have the biggest nuke in 2025 lols :)" The AI race will continue to feel hippy and fun-loving until the government steps in and/or something really bad happens.

- o-class models change the dynamics of the compute scale-up in a few interesting ways.

o-class models incentivize massive buildout bc they have clear gains with every order of magnitude more compute. Compute providers couldn't have asked for a better scaling law. I'm guessing this law is what Sam saw when he wanted a multi-trillion dollar compute cluster.

This might not actually be great for Nvidia. o-class models make inference more important than training. I think super optimized inference chips are easier to build than training chips, so Nvidia doesn't have as much of a moat there.

Very speculative: what if o-class models unlock the aggregated compute from the whole world to train the best models? Like how cool would it be if open source beats closed source bc we band together our MacBook Pros into an inference gigacluster.

- Another new exponential in the mix now beyond compute is the code itself. If one lab has unique/privileged access to the smartest model and so their software engineers get 2x more productive than other labs, then they get closer to the next doubling of productivity faster. Unless code speed maxes out and there's a long queue of experiments to run, and the lab is once again bottlenecked by compute. (Idk, the dynamics are hard. Would be super cool to see how labs model how much they should spend on compute vs people.)

- As crazy as all this compute buildout and knowledge work automation sounds, it only starts to get really crazy when the scientists start feeling the AGI. I'm thinking of you physicists, chemists, biologists. It'll start with anything with "theoretical" in the name.
Theoretical physics is up first. If math is actually solved (sounds ridiculous even writing this, but that doesn't make it unlikely), then theoretical physics can't be that far behind. It too lives in the symbolic realm at which LLMs will be superhuman.

What happens when we have a million AI von Neumanns working day and night in the fields of Louisiana (Meta's upcoming datacenter)? How quickly will they read every physics paper written by thousands of physicists over the past century and immediately spit out more correct tokens?

Obviously this is the part of the story that is hard to predict. Theoretical physics, chemistry, biology -- what if these are a joke to an LLM trained with RL? What reasonable argument at this point do we have that they won't be?

Yes, we haven't seen true innovation from these models yet, but they've been mostly at high school / college level, and those age groups don't invent new physics. We're at PhD level now, so we might start seeing some inventiveness.

- Once AI starts churning out new scientific theories, the bottleneck to progress will be testing and experimentation in the physical world. The bottlenecks there are labor and materials. By that point it would be surprising if there aren't robots that can build more robots. So labor is solved. And then materials can be mined by the robots. The timelines here will be slow because building/shipping physical stuff takes a long time, but it's years, not decades.

- Everything I've said above assumes no new bottlenecks are introduced to AI + robotics research/development, and that the models are allowed to learn as they please. That is almost certainly not going to happen. The biggest bottleneck to AI progress will be humans. By that I mean regulation, terrorism, and societal collapse.

Governments are not going to sit back and let the Earth be mined by automated robots run by a couple of SF companies (regulation). And if the governments are too incompetent to stop them, then angry jobless people might resort to violence (terrorism). Unless people are so brain-rotted from AI-enhanced media that we can't function as a society (societal collapse).

If war happens, I think it won't be a bottleneck, rather an accelerant.

Things are gonna get serious. 2025 might be the last year where AI is this wild thing SF tech twitter memes about, before the normies in suits get involved, so let's enjoy roon and sama while we can.

- Is this gonna kill everybody? I'm more scared of humans using AI badly than the AI going rogue.

We have 5000 years of evidence of humans using the latest technology to kill each other. The post-WW2 peace is an anomaly that could fall apart the second the US missteps or when an adversary thinks a first strike is necessary to stop the AI acceleration. When the weapons get more lethal and more autonomous, the stakes get higher.

The other big risk is AI causing societal chaos. AI-generated media could cause mass confusion, mass hysteria, mass brain rot. An authoritarian country could win the AI race and use the new tech to deprive us all of freedom for thousands of years.

Another risk is that the AI goes rogue. Meaning it causes something extinction-level that we didn't predict. Especially with RL being back in the game, AI is now discovering its own optimizations instead of trying to match human data (matching humans is safer). But so far the underlying brain of these models is still an LLM, and LLMs have shown that they just understand people.
Like if you include in the prompt "make sure not to do anything that could kill us", the burden is on you at this point to claim that it's still likely to kill us. Of course I haven't considered all the arguments here, but when I have nightmares about an AI dystopia, I see Chinese and Russian flags, not OpenAI's logo.

- I'm definitely more excited than scared though.

The science-fiction world I've always wanted is coming. It's coming a bit faster than expected -- hence the fear -- but of all the possible paths to get there, I'm not sure how much better the best path would be. This is a pretty great timeline.

Top-of-mind things that I hope are coming within a decade:
- some crazy cool physics discoveries
- Mars and Moon bases initially built by robots
- a perfect tutor/advisor for everything (nearly here, needs good retrieval, memory, and more personality)
- biology-enhancing drugs with zero side effects
- getting flown around in super optimized drones
- super clean energy across the board with fusion, geothermal, and lots of solar
- the unexpected: AI astronomer discovers alien signals in telescope data? AI chemist easily designs a room-temperature superconductor? AI physicist unifies some theories? AI mathematician solves the Riemann Hypothesis?

These don't seem like science fiction anymore, they feel like nearby science reality.

- So where is this all going? Well, eventually we get superintelligence, and that means we get whatever the laws of physics allow for. I'd like immortality and to see other star systems. I'd also expect to upgrade our meat bodies to something way better. But by far I'm most excited to learn where the universe comes from. 10 years ago I started journaling about how much I want to know that answer and how AI will get us there, and now it could actually be happening, which is insane.

- We're now living in a world where this all sounds plausible. Every new AI development makes a larger percentage of people realize that, o3 being the latest.

The only way the future isn't spectacular now is if we the people mess it up. Like our public opinion, our downstream policies, our societal stability, our international cooperation -- these are the roadblocks that could prevent this spectacular future.

- People think the people at AI labs are controlling our future. I disagree. Their work is already determined. They're merely executing on model architectures that are going to happen in one lab or another.

But our public opinion, our downstream policies, our societal stability, our international cooperation -- this is completely uncertain. That means we collectively are the custodians of the future.

It falls upon each of us to help our world navigate these wild times ahead so that we get a great future and not a horrible one.

- There are lots of ways to help out. Help build products that somehow make society more stable or that make people smarter (ex: an app that helps people regulate their social media use). Help inform people of what's going on (more high-quality commentary on social media, a really good search engine, etc). Help clean up our streets so that the city that claims it will bring us all into utopia doesn't look like a dystopia (getting involved in local politics).

- Almost everyone I've talked to is scared of losing meaning in an AI world, and you might be too. To you I say, isn't it the total opposite? You're living at the most important time in history and you have the ability to influence it. Helping to save the world should be enough meaning, no?
You want to go back to a time where the only thing progressing was your career and not the world?

Perhaps the transition people need to make is from getting meaning through individual success to getting meaning through collective success. Many of our current jobs will be automated soon. We'll have to adapt. If you derive meaning from a specific skill, yes, that skill might no longer be necessary in 5 years and you're out of luck. But if you can derive meaning from helping the world however you can, well, that isn't ever going away.

- To all the new grads being given advice bc of o3, here's my advice: learn how to be 1) a high-agency problem solver and 2) a great team player. The specific skills you learn along the way won't matter bc the world will change so fast. But jumping in to solve problems and working well with a team will matter for a long time.

You also might need to accept an unstable life in an unstable world. It's gonna get weird. You're prob not gonna have two kids and a dog in the suburbs. You might have two cyborg kids and an AI dog on an interstellar ark.

We're living on the eve of AGI, and on this Christmas Eve I ask that you help make the AGI transition go well, so that I can say hi to you on Christmas Eve 3024 AD, on a planet four light years away orbiting Altman Centauri.
Huh? Don't tell me the passwords were stored in plaintext?
🚨 BREAKING: MASSIVE OPENAI DATA BREACH? 20 MILLION ACCOUNTS ALLEGEDLY HACKED! A hacker claims to have stolen login details—including emails and passwords—for 20 million OpenAI accounts and is selling them on the dark web. OpenAI says it’s investigating but insists there’s no evidence of a system breach—yet. Cybersecurity experts warn this could lead to identity theft, phishing scams, and even AI-powered cyberattacks. If you use ChatGPT, change your password NOW and enable multi-factor authentication. Source: The Independent
If you don't have three and a half hours to watch the video, you can read the summary I made with o1-pro:

This talk walks through, in an explanatory style, how large language models (LLMs such as ChatGPT) are trained, how they work internally, and how they evolved into the conversational AI we use every day. It covers the three core training stages (pretraining, supervised fine-tuning, reinforcement learning), how text is converted into "tokens", why models can shine on hard problems yet occasionally stumble on simple ones, and current trends and where things are headed. Key points below:

1. Overview and motivation
The speaker wants the audience to understand how LLMs "think", including: the tasks models are good at; the scenarios where they can fail; how they generate text ("token autocomplete"); and how raw internet text is turned, step by step, into an "intelligent" assistant.

2. Pretraining: building the base model
- Data collection and filtering: The major LLM providers (OpenAI, Google, Anthropic, etc.) collect enormous amounts of text, up to trillions of tokens, from public sources such as Common Crawl plus additional high-quality corpora. Heavy filtering removes duplicates, spam, malicious sites, and private information, and language detection is commonly used to focus on target languages such as English.
- Tokenization: The model splits raw text into discrete "tokens", typically with a vocabulary on the order of 100k entries. Tokens can be whole words, word fragments, or common character combinations, turning text into a sequence of integers.
- Neural network training (Transformer architecture): The base model is a huge Transformer network with hundreds of millions to over a trillion parameters. The training objective is to predict the next token over vast amounts of internet text; this self-supervised training gives the model a grasp of language and some factual knowledge. Training requires large GPU clusters (e.g., NVIDIA H100s) running for weeks or months. The resulting parameters act as a "lossy compression" of internet text, but the model is essentially just a "token simulator".
- Inference: After training, inference is the real-time generation that happens once a user types into ChatGPT: the model predicts the most likely next token given the input token sequence. By sampling from the probability distribution, it appears to be "composing" text.
- Limits of the base model: A base model only imitates internet text and has no particular inclination to answer a user's question. It may recite passages from its training data (e.g., Wikipedia) or produce made-up, "hallucinated" content.

3. Supervised fine-tuning (SFT): toward a conversational assistant
- Goal: turn the base model into an assistant that can hold a conversation and answer questions. Researchers collect user prompts and "ideal answers" to form a conversation dataset.
- Human labeling: Human labelers follow detailed guidelines ("helpful, truthful, harmless", etc.) to write "ideal assistant answers" for a wide range of prompts. Fine-tuning on these pairs teaches the model to give polite, coherent replies rather than unstructured internet text.
- Challenges: Hallucinations remain serious; the model will confidently output false information. Partial mitigations: teach the model to answer "I'm not sure" or "I can't answer" when uncertain, or give it access to external tools for real-time lookup and computation. Even so, the model is not foolproof.
- Result: The model becomes a conversational question-answering system, but still makes unpredictable mistakes.

4. Reinforcement learning (RL)
- Why RL is needed: With supervised fine-tuning alone, the model is still imperfect at multi-step and complex reasoning. RL lets the model try many solution attempts on specific tasks (e.g., math problems) and discover the best reasoning paths on its own.
- Verifiable domains (math, coding): When a problem has a clear "correct answer", the model can generate attempts at scale and reinforce the solution patterns that lead to correct results. This lets "thinking" structure (chains of thought) emerge naturally and significantly improves accuracy on hard problems. The speaker points to "thinking models" such as DeepSeek as examples of chain-of-thought growing out of reinforcement learning.
- Unverifiable domains and RLHF: For tasks with no single correct answer (poems, jokes), there is no automatic way to judge which answer is best. The RLHF idea is to have humans rank model outputs, train a "reward model" on those rankings to imitate human preferences, and then run RL against that reward model. If over-trained, though, the model can learn to game the reward model and produce absurd-looking outputs.
- Payoff of RL: The model learns not just plain question answering but more sophisticated thinking strategies, sometimes surpassing the human-provided examples. The process is still very new in open-ended domains; so far, clear gains have been seen mainly on a handful of verifiable tasks.

5. Model behavior, strengths, and weaknesses
- Hallucinations and "Swiss cheese" capability: The model can sometimes solve difficult academic problems yet slip up on simple character-counting or logic questions. Partly this is due to blind spots introduced by tokenization, and partly to the compute limits of a single forward pass.
- In-context learning vs. model parameters: After deployment the model's parameters are frozen; it can only rely on (1) long-term knowledge stored in its parameters and (2) the context temporarily visible in the conversation window. Unlike a human, it cannot update its own weights while reasoning.
- Tool use: External search or calling a Python interpreter can significantly reduce blind guessing and arithmetic errors, making answers more trustworthy.

6. What's next
- Multimodality: Next-generation models will process and generate text, images, and audio together for more natural, all-around conversation. Similar tokenization techniques extend to image patches, audio slices, and so on.
- Long-horizon "agents": Beyond single-turn Q&A, models will execute plans and instructions over longer contexts to complete complex tasks; because models remain error-prone, human oversight is still needed.
- Ongoing research: Efficiently handling very long contexts (millions of tokens) and updating weights at inference time remain frontier topics. Researchers are also exploring better RL methods to escape the "reward model is easy to exploit" trap.

7. Where to get the models
- Web interfaces (proprietary platforms): ChatGPT (OpenAI), Bard/Gemini (Google), Claude (Anthropic).
- Open weights: DeepSeek R1, Meta's Llama, and others.
- Hosted inference: platforms such as Hugging Face offer online inference APIs for many open models; tools like LM Studio let users run small or quantized models locally.

8. Takeaways
- LLM foundation: pretraining on massive internet text teaches the model language and general knowledge.
- Conversational assistant: fine-tuning on human-written example conversations makes answers better match human expectations, though mistakes remain.
- Deeper reasoning: RL on verifiable tasks lets the "thinking process" evolve on its own and improves accuracy.
- Remaining flaws: models frequently hallucinate and can do surprisingly badly on some small problems.
- The future: multimodality, extended memory, stronger RL, and deeper external tool integration will drive the next breakthroughs.

In short, behind the answers of ChatGPT and similar LLMs lies deep work across three main stages (pretraining, supervised fine-tuning, and reinforcement learning). They can appear highly "intelligent" on many tasks, but limited reasoning and random errors persist, and they need context, external tools, and human oversight to become truly reliable assistants.
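To make the summary's "token simulator" framing concrete, here is a minimal sketch of tokenization plus the next-token sampling loop, using the open GPT-2 weights via Hugging Face transformers. This is my own illustrative choice of library and model, not code from the talk.

```python
# Minimal sketch of "tokenize, predict next token, sample" -- the loop the summary
# describes as the core of inference. Uses open GPT-2 weights for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "Large language models are"
ids = tokenizer(text, return_tensors="pt").input_ids  # text -> integer token sequence

with torch.no_grad():
    for _ in range(20):                                 # generate 20 tokens, one at a time
        logits = model(ids).logits[:, -1, :]            # scores for the next token
        probs = torch.softmax(logits, dim=-1)           # probability distribution
        next_id = torch.multinomial(probs, num_samples=1)  # sample from it
        ids = torch.cat([ids, next_id], dim=1)          # append and repeat

print(tokenizer.decode(ids[0]))  # tokens back to text
```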
New 3h31m video on YouTube: "Deep Dive into LLMs like ChatGPT"

This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental models of how to think about their "psychology", and how to get the best use of them in practical applications.

We cover all the major stages:
1. pretraining: data, tokenization, Transformer neural network I/O and internals, inference, GPT-2 training example, Llama 3.1 base inference examples
2. supervised finetuning: conversations data, "LLM Psychology": hallucinations, tool use, knowledge/working memory, knowledge of self, models need tokens to think, spelling, jagged intelligence
3. reinforcement learning: practice makes perfect, DeepSeek-R1, AlphaGo, RLHF.

I designed this video for the "general audience" track of my videos, which I believe are accessible to most people, even without a technical background. It should give you an intuitive understanding of the full training pipeline of LLMs like ChatGPT, with many examples along the way, and maybe some ways of thinking about current capabilities, where we are, and what's coming.

(Also, I have one "Intro to LLMs" video already from ~a year ago, but that is just a re-recording of a random talk, so I wanted to loop around and do a much more comprehensive version of this topic. They can still be combined, as the talk goes a lot deeper into other topics, e.g. LLM OS and LLM Security.)

Hope it's fun & useful!
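Both the summary above and this video outline mention RLHF. Below is a minimal, self-contained sketch of the reward-model step it describes: humans rank two answers, and a small model is trained so its scalar score is higher for the preferred one. The tiny network, random stand-in data, and hyperparameters are all illustrative assumptions, not anything from the video.

```python
# Toy sketch of reward-model training for RLHF: learn a scalar score that ranks
# the human-preferred answer above the rejected one (pairwise Bradley-Terry loss).
import torch
import torch.nn as nn

VOCAB, DIM = 1000, 64

class TinyRewardModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.score = nn.Linear(DIM, 1)

    def forward(self, token_ids):                    # [batch, seq] -> [batch]
        pooled = self.embed(token_ids).mean(dim=1)   # crude pooled representation
        return self.score(pooled).squeeze(-1)        # one scalar "reward" per answer

rm = TinyRewardModel()
opt = torch.optim.Adam(rm.parameters(), lr=1e-3)

for step in range(100):
    # Stand-in for a batch of human comparisons: (preferred answer, rejected answer).
    chosen = torch.randint(0, VOCAB, (8, 32))
    rejected = torch.randint(0, VOCAB, (8, 32))
    # Pairwise loss: push score(chosen) above score(rejected).
    loss = -torch.log(torch.sigmoid(rm(chosen) - rm(rejected))).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# In full RLHF, rm(answer_tokens) then serves as the reward signal for RL (e.g., PPO)
# on the language model; as the summary notes, over-optimizing against it is how
# models learn to game the reward model.
```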