Among my first reactions upon hearing "artificial superintelligence" were "I can finally get answers to my favorite philosophical problems" followed by "How do I make sure the ASI actually answers them correctly?"
Anyone else react like this?
@janleike
You assume that you don't need to solve hard philosophical problems. But the superhuman researcher model probably will need to, right? Seems like a very difficult instance of weak-to-strong generalization, and I'm not sure how you would know whether you've successfully solved it.
I'm relatively new to participating on Twitter/X, so if anyone has tips/suggestions/critiques on how/what I'm doing, please let me know. (Just had a person block me for the first time, and I'm not sure if I did something wrong, or if it's par for the course.)
"Our goal is to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return. Since our research is free from financial obligations, we can better focus on a positive human impact."
Has Elon Musk not heard of the Median Voter Theorem? It implies that the US is unlikely to become a one-party state this way, because the parties will adjust their platforms to evenly divide up the political spectrum. It's why we have low-margin swing states in the first place.
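The convergence dynamic behind the Median Voter Theorem can be sketched with a toy Hotelling-style simulation. Everything below (the voter ideal points, the one-step hill-climbing rule) is a hypothetical illustration of the standard two-party model, not a formal proof:

```python
import statistics

# Voters have ideal points on a 0-100 left-right spectrum (assumed values).
voters = [5, 20, 35, 48, 50, 52, 65, 80, 95]

def vote_share(a, b):
    """Fraction of voters closer to platform a than to b (ties split evenly)."""
    share = 0.0
    for v in voters:
        if abs(v - a) < abs(v - b):
            share += 1.0
        elif abs(v - a) == abs(v - b):
            share += 0.5
    return share / len(voters)

# Parties start far apart; each round, a party tries a one-unit move
# left or right and keeps it only if it strictly gains vote share.
a, b = 10.0, 90.0
for _ in range(200):
    for step in (-1.0, 1.0):
        if vote_share(a + step, b) > vote_share(a, b):
            a += step
        if vote_share(b + step, a) > vote_share(b, a):
            b += step

median = statistics.median(voters)
print(a, b, median)  # prints: 50.0 50.0 50
```

Both platforms end up at the median voter's position, which is the theorem's point: neither party can profitably stake out a one-party-style extreme, so contested, near-50/50 races fall out of the model naturally.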
Very few Americans realize that, if Trump is NOT elected, this will be the last election. Far from being a threat to democracy, he is the only way to save it!
Let me explain: if even 1 in 20 illegals become citizens per year, something that the Democrats are expediting as fast
The people try to align the board. The board tries to align the CEO. The CEO tries to align the managers. The managers try to align the employees. The employees try to align the contractors. The contractors sneak the work off to the AI. The AI tries to align the AI.
OpenAI’s creators hired Sam Altman, an extremely intelligent autonomous agent, to execute their vision of x-risk conscious AGI development for the benefit of all humanity but it turned out to be impossible to control him or ensure he’d stay durably aligned to those goals. 🤔
Don't forget Daniel Kokotajlo
@DKokotajlo67142
, who started this whole chain of events by being the first OpenAI employee to refuse to sign the non-disparagement agreement, at enormous financial cost.
This is speaking my language, but I worry that AI may inherently disfavor defense (in at least one area), decentralization, and democracy, and may differentially accelerate wrong intellectual fields, and humans pushing against that may not be enough. Some explanations below.
New monster post: my own current perspective on the recent debates around techno-optimism, AI risks, and ways to avoid extreme centralization in the 21st century.
Many people seem to care more about social status than either leisure or having descendants. This surprising fact (caring so much about positional/zero-sum goods) may have important implications for the nature of human values (e.g., the debates over egoism vs altruism and moral
Sometimes I despair for humanity. This is someone who understands quantum computing and cryptocurrency, yet this is how he thinks about the AI transition. How on Earth do we get through any crisis or opportunity that requires deep strategic thinking?
For about a decade, I was one of maybe 5 people on Earth who had thought most about decentralized cryptocurrency (although the word wasn't invented yet). I don't think any of us were optimistic enough about it, despite selection effects. Really hope AI safety doesn't repeat this.
Most philosophers of religion believe in god, and most philosophers of aesthetics believe beauty is objective.
The lesson: strong selection effects sometimes lead to experts in a field having worse opinions than average people.
Increasingly true for “misinformation experts”.
I'm actually not sure that EA has done more good than harm so far. The costs of its mistakes with FTX and OpenAI are just really high, and not fully accounted for. I would say that its mistakes with
@OpenAI
were not only in the recent episode, but also earlier, when EA 1/2
I wonder if there's a world in which Eliezer Yudkowsky became convinced earlier in his career that trying to build AGI is a bad idea, and what that world looks like now. (I'm afraid that it wouldn't actually look very different.) I tried to convince him a few times, but failed.
People in the replies: "Wrong. Person X influenced millions of retards. Yudkowsky only influenced a handful of CEOs running the most important companies in the world."
I once checked out an econ textbook from the school library and couldn't stop reading it because the insights gave me such a high. Imagine what our politicians would be like if that was the median voter. (Which is doable with foreseeable tech, e.g., embryo selection!)
Sometimes you just have to ignore the signaling function of prices, supply and demand, trade-offs, comparative advantage, the failures of economic planning, the workings of self-interest, the importance of incentives, etc.
#TheAgeOfEconomicIgnorance
Me: Humanity should intensively study all approaches to the Singularity before pushing forward, to make sure we don't mess up this once-in-a-lightcone opportunity. Ideally we'd spend a significant % of GWP on this.
Others: Even $50 million per year is too much.
Half a billion dollars, donated to effective philanthropies like GiveWell, could have saved 100,000 lives. Instead it underwrote ingenious worries like, "If an AI was tasked with eliminating cancer, what if it deduced that exterminating humanity was the most efficient way, and
Watching OpenAI employees wave their little red hearts like people used to wave little red books, and their leader leveraging that to come back to power was a huge negative update for me. Didn't think that level of groupthink was likely at a place like OpenAI
I've been wondering why I seem to be the only person arguing that AI x-safety requires solving hard philosophical problems that we're not likely to solve in time. Where are the professional philosophers?! Well I just got some news on this from
@simondgoldstein
: "Many of the
Reading A Fire Upon the Deep was literally life-changing for me. How many Everett branches had someone like Vernor Vinge to draw people's attention to the possibility of a technological Singularity with such skillful writing, and to exhort us, at such an early date, to (1/3)
Re: persistent speculation that I'm Satoshi Nakamoto, this timeline of my publications (thanks to
@riceissa
for collecting them!) shows how much my interests had shifted away from digital money by the time Bitcoin was released.
16-3 Hal Finney is covering for Wei Dai
By WeishaZhu / March 11, 2023
Wu: Hal Finney is the Watson of Satoshi Nakamoto, one of the great founders of Bitcoin, who proposed the Hal Finney conjecture. He died at his home in Arizona in August 2014 at 58.
I spoke highly of Hal Finney in
Lastly, I worry that AI will slow down progress in philosophy/wisdom relative to science and technology, because we have easy access to ground truths in the latter fields, which we can use to train AI, but not the former, making it harder to deal with new social/ethical problems
I worry about the AI transition going wrong this way more than any other, partly because almost nobody else seems to be working on or thinking about it, even as it's coming true, with current AIs already accelerating math, coding, and biology while having little effect on
I was thinking about writing an AI Alignment Forum post titled "Top signs your alignment work is being exploited for safety-washing" but somehow that feels less urgent now.
I think
@HoldenKarnofsky
(and EA as a whole to some extent) deserve negative credit for their role in OpenAI, including defending OAI with "I don’t know whether OpenAI uses nondisparagement agreements; I haven’t signed one." in 2022 instead of investigating the allegation.
I tried to DM this person but didn't get a reply. Could someone help me reach out and let him know that I've been thinking about AI x-risk since the 90s, even before crypto, and have been active on LW since its beginning, to see if that changes his mind about "p(doom) cult"?
If you're not clued in to LW/EA, then you wouldn't know that there are THOUSANDS of people thinking like this - every moment of their day. The p(doom) cult dominates their every thought.
We need to understand what this is and build more bridges for them to escape & heal.
I can only find 2 mentions of the Median Voter Theorem on Twitter outside this thread in the last 24 hours, one also in connection with Musk's tweet, with only 172 views, and the other celebrating Harold Hotelling's birthday! How to explain this, given Musk's 49.5M views?
The current and previous moral systems of the Western world (Christianity and wokeism) are both based on a foundation of moral and empirical falsehoods (or strong overconfidence) propped up by taboos enforced by social and physical punishments.
Unfortunately, "Everyone else is incompetent/evil, I have to win the AGI race and save the world" is turning out to be such a strong attractor for a certain group of humans, that there may be nothing that Eliezer or anyone could have done or avoided doing to avert what's
I actually disagree with OpenAI's original strategy as well, and even MIRI's strategy when they were trying to build a Friendly AI. "The only winning move is not to play."
In other words, everyone should have been trying to build an international consensus against building AGI.
"Because of AI's surprising history, it's hard to predict when human-level AI might come within reach. When it does, it'll be important to have a leading research institution which can prioritize a good outcome for all over its own self-interest."
@jessi_cata
My view is that given the small size of the Agent Foundations team, the amount of time they were given, and my priors for philosophical difficulty, it would be quite surprising if they did succeed. I wrote something along these lines in a 2013 post
think about how to approach it strategically on a societal level or affect it positively on an individual level. Alas, the world has largely squandered the opportunity he gave us, and is rapidly approaching the Singularity with little forethought or preparation. (2/3)
A major source of my pessimism is that genuine altruism is rare compared to virtue signaling / status gaming and there's also a lack of alignment between people's status games and long-run outcome for human civilization. I.e. what's best for one's status usually isn't best for
It seems like the most common criticism of AI x-risk is something like “What you are saying is not immediately apparent to me, and since you are low status, I don’t need to devote more thought to it”. Hard to know how AI safety researchers can win this status game?
The current outcome (safety concerns and safety-focused structures not holding up under competitive/commercial pressures) seems pretty foreseeable. How did all those smart people miss it? Did the prospect of glory go to their heads?
@balajis
I can understand most of these but why "Freedom to compute — AI"? If you're not worried about AI risk, please check out this post of mine. There may be more reasons to be worried than you thought.
"another way for AGIs to greatly reduce coordination costs in an economy is by having each AGI or copies of each AGI profitably take over much larger chunks of the economy"
backed Sam Altman
@sama
with its money and credibility (perhaps due to insufficient DD), causing many in the EA and AI safety communities to refrain from scrutinizing or criticizing OpenAI earlier. See for example Holden Karnofsky's silence here 2/2
It seems a terrible idea (aside from the technical problems) to align AGI/ASI to human values while human values are such a mess - shouldn't we get our house in order first, or alternatively figure out how to make aligned AI safe despite our deep flaws? It's very tempting to view
We absolutely need whistleblowers, but whistleblowing has to be combined with other ways of changing AI company culture, or they'll start filtering out safety-conscious potential hires for fear of them becoming whistleblowers. Zach Stein-Perlman's AI Lab Watch seems like a promising start.
Altman: We think the best way AI can develop is if it’s about individual empowerment and making humans better, and made freely available to everyone, not a single entity that is a million times more powerful than any human. Because we are not a for-profit company, like a
For example, if AGI was developed (and assuming AI alignment succeeds, or unaligned AIs remain peaceful), how does that affect humans' status games / social interactions? In the longer run, will we endorse our desires for positional goods, and if so could that zero-sum
"There is an apparent asymmetry between attack and defense in this arena, because manipulating a human is a straightforward optimization problem [...] but teaching or programming an AI to help defend against such manipulation seems much harder [...]"
The main uncertainty I have is whether starting or laying the groundwork for a stop/pause AI movement ~decade earlier could have made some difference to the political landscape today.
Would Satoshi be spending his time in 2009 writing about decision theory and the minutiae of optimizing AES in assembly?
I also wrote a post some time ago about why I became less interested in cryptography and security in general:
@ohabryka
@NeelNanda5
It also seems incomprehensible that OpenPhil (or any of the people involved in making the grant to OpenAI, AFAIK) hasn't made any statement about OpenAI's plans to become fully for-profit.
Striking how the top three cited AI scientists share a strong opinion that this tech could kill every living thing on earth. Arguably all three made the decision to prioritise safety over money.
Hinton quit his job at Google so he could speak freely about the risks.
Bengio is
OpenAI relies a lot on external capital/support, and recent events hopefully mean that others will now trust it less so it's less likely to remain a leader in the AI space. What is x-safety culture like at other major AI labs, especially Google DeepMind?
@davidmanheim
@LinchZhang
@rruana
@benlandautaylor
@elonmusk
Nick Bostrom Aug 5, 1998: The big question is: what can be done to maximize the chances that the arrival of superintelligences will benefit humans rather than harm them? The range of expertise needed to address this question extend far beyond that of computer scientists and
@danfaggella
Why is it not part of good governance to view people like Altman as "sociopaths", to morally condemn them, lower their social status, etc? This is perhaps the most basic form of governance that humans have, and I see no reason to abandon it instead of adding to it.
@bensig
@adam3us
Crypto++ had the fastest SHA-256 code at the time. It's documented in one of
@hashbreaker
djb's papers. I think I found an optimization that shaved off one instruction from the critical path of the inner loop, or something like that.
If anyone is interested in reading more on this, here's a post I wrote on this topic, which also links to other related posts I've written over the years.
The only way out is through. Redpill yourself over and over, until you get bored of it, until the shine wears off. Oh, a yet another transformational model of the world that causes a set of previously mysterious things to snap into place and purports to Explain Things. Nice.
@theorizur
@simondgoldstein
It depends on the specific alignment approach, but I think metaethics and metaphilosophy are needed for most. Example problems: 1. Can the user be wrong about what they want or value, and if so what to do about that? 2. If the AI thinks of or receives a philosophical argument
We somehow managed to weaken the taboos around, e.g., disbelief in the Christian God, which over time made Christianity less popular, only to see the same pattern recur with wokeism. I think this and similar phenomena around the world show that human morality is deeply flawed.
@NathanpmYoung
@HoldenKarnofsky
And the failures continue: not reflecting on past mistakes, not writing down "lessons learned" to help others (e.g. other board members of OpenAI and similar orgs), forfeiting the PR battle to SamA who might well have won except for the heroics of Daniel Kokotajlo and
@janleike
@NathanpmYoung
@HoldenKarnofsky
I'm thinking of all of the mistakes that led to the present moment, including conflict of interest in the initial grant, DD failure in allowing SamA to become CEO, mishandling the firing and letting Sam further consolidate power, lending EA's credibility without enough oversight.
@ESYudkowsky
At the time I was like "Nobody has talked about the Median Voter Theorem yet. Free alpha for me." Little did I know that 12 hours later I would still be virtually the only person to have mentioned it. Where are the professional political scientists and economists???
High population may actually be a problem, because it allows the AI transition to occur at low average human intelligence, hampering its governance. Low fertility/population would force humans to increase average intelligence before creating our successor, perhaps a good thing!
Human attempts at altruism often turn out not just badly, but catastrophically. OpenAI seems on track to becoming another example of this. Contrast "Our mission is to ensure that artificial general intelligence benefits all of humanity." with what they do:
Leopold Aschenbrenner told his side of the story for why OpenAI fired him. If accurate, this is wild! Overall, it sounds like he was targeted for being a squeaky wheel (not signing the SamA letter, raising security issues w/ the board, talking about AGI being a govt project...
@BartenOtto
1. Not wanting to die is one value among many. Some really want to experience a post-Singularity future, for example. How to trade off between them is a philosophical question.
2. Lots of epistemological questions about how to think about x-risks. (Anthropic reasoning, etc.)
3.
Are there *any* Pareto improvements in the real world, after taking reallocation of power, and other forms of status, into account? Every proposal is a bid for power. Every argument or statement of fact is a bid for prestige. Every success makes everyone else less successful by
What I call the “consequentialist two-step” for political naifs:
1. Advocate for a policy which seems like a Pareto improvement, but is actually a reallocation of power when accounting for realpolitik.
2. Be surprised/outraged when others treat that like a move in a power game.
I'm worried about AI differentially accelerating STEM vs philosophy, but the same risk perhaps exists with human cultures/systems. Where is the corresponding crop of highly capable philosophers being produced in the People's Republic of China, for example?
See earlier thread for STEM talent analysis.
PRC quality-adjusted human capital pool will INCREASE over next ~20 years. Growth in highly able STEM talent in PRC ~ Rest of World combined!
The demographic story you read about in the media is misleading.
I fantasize about a movement like EA, but built on a foundation of explicit understanding and discussion of status, and seeing its mission as aligning its status games with altruism to better leverage people's status motivations for good. But is this actually feasible?
@stuhlmueller
Why aren't professional economists and philosophers debating these ideas? I used to think that AGI was just too far away, and these topics will naturally enter the public consciousness as it got closer, but that doesn't seem to be happening nearly fast enough. What gives?
@bensig
@adam3us
@hashbreaker
Note that the OpenSSL acknowledgement is under "Legal". Crypto++'s license did not require acknowledgement for including individual files like the SHA-256 code. (It was only copyrighted as a compilation and individual files were put in the public domain.)
@teortaxesTex
I think a large part of it has to be motivated reasoning, wanting to be the hero that saves the world, just like the founders of OpenAI. "Because we are not a for-profit company, like a Google, we can focus not on trying to enrich our shareholders, but what we believe is the
How many of these do you recognize?
metaphilosophy for AI safety
beyond astronomical waste
UDT
UDASSA
human-AI safety
logical risk
acausal threat/trade
modal immortality
internal moral trade
ontological crisis in human values
AGI merger/cooperation/economy of scale
29 September is the birthday of Harold Hotelling, economist and statistician, whose principle of minimum differentiation was influential in the discovery of the median voter theorem and had a big influence on political science.
@KelseyTuoc
"as we said"
Unbelievable that he straightforwardly lies like this and gets away with it. It's incredibly frustrating watching humanity take the AI transition so unseriously, e.g. letting obviously untrustworthy people lead the effort.
@MatthewJBar
If an AI commits a crime (say it kills someone), do you also punish its copies/derivatives, if so which ones? What if it trains or programs a new AI then deletes itself?
@gcolbourn
Perhaps worth noting that Dario Amodei and Ilya Sutskever also started for-profit companies after leaving OpenAI, so part of it is just that the non-profit structure was never going to last or be competitive.
@RokoMijic
I think there are many disjunctive ways for the AI transition to go wrong (in a way that destroys most value of the lightcone). Many of these have approximately 0 people trying to prevent them, and some seem too hard for current humans to solve. Related post:
@amcdonk
Seems to be false for South Korea, at least. For example, living in the capital is a universal status symbol there, causing massive brain-drain from the rest of the country and contributing to fertility collapse.
@stuhlmueller
Thanks for the signal boost! It seems wild that with AGI plausibly just years away, there are still important and fairly obvious considerations for the AI transition that are rarely discussed. Another example is the need to solve metaphilosophy or AI philosophical competence.
On the one hand, we don't seem to be on a path to a positive Singularity. On the other hand, at least we don't live in a much bigger universe. But what if what we see here gives info about what is happening in those much bigger universes?😢
Exclusive: OpenAI publicly committed to give 20% of its computing resources to a team dedicated to controlling the most dangerous kind of AI. It never delivered, and, in fact, repeatedly denied that team's requests for resources, sources say.
@benchanceyy
I'm pretty uncertain about it. One draft of my tweet expressed the possibility that he was just saying it for political effect, but I ended up deleting that part to focus more on the political science as opposed to the politics.
@BogdanIonutCir2
@Gabe_cc
I think there's a high risk that automated research will speed up capabilities more than alignment, because alignment research involves philosophical questions, and may become bottlenecked by lack of AI philosophical competence, which ~nobody is working on. Curious if you've seen
humanity. Worse, explicitly thinking about status isn't conducive to gaining status in most local games, leading to almost nobody thinking about and trying to solve these problems. Even EAs, who are among the most reflective people on Earth, don't seem to like talking about this.
@janleike
@AnthropicAI
Please push them to improve their governance, safety culture, and transparency, to avoid a repeat of OpenAI. I'm seeing some worrying signs, for example.
@ohabryka
@NeelNanda5
I haven't been aware of the Dustin angle. Why would he be hurt if someone speaks out against OpenAI, or why would he not like that?
@RichardMCNgo
There's public choice theory, which I'm a fan of, but the understandings I get from it (e.g., median voter theorem, rational ignorance, capture by special interests) make me feel averse about and want to stay away from politics. Do you have some other kind of thing in mind?
@ESRogs
I wonder if community notes give a false sense of security about misinformation, if something like this gets tens of millions of views and goes unchecked.
@RokoMijic
How can Americans know the (instrumental) value of free speech, unless some otherwise similar countries occasionally experiment with giving it up? Its value could sign-flip due to tech or other change, and unfortunately we don't have the tools to derive it from first principles.
@perrymetzger
I've connected with Bryan. Apparently Twitter hid my DM from him.
I'm more active on Twitter in part to find out why more people aren't concerned. Feel free to point me to anything that explains your thinking.
@RichardMCNgo
@ilex_ulmus
personally don't respond. I'm not doing it myself here, in part because I'm probably less sure than Holly about the object level ethics and I value intellectual discussions with you, so this is more of an intellectual commentary.
@RichardMCNgo
@ilex_ulmus
I think Holly is probably right on the object level (working for OpenAI is unethical) and the game theory (most people respond to social pressure so why believe that you are different). It also makes sense in terms of disincentivizing others from working at OpenAI, even if you
@webmasterdave
@AutismCapital
Quoting myself: A person who consciously or subconsciously cares a lot about social status will not optimize strictly for doing good, but also for appearing to do good. One way these two motivations diverge is in how to manage risks, especially risks of causing highly negative