"For AGI and superintelligence (we refrain from imposing precise definitions of these terms, as the considerations in this paper don't depend on exactly how the distinction is drawn)" Hmm, is that true? His models actually depend quite heavily on what the AI can do, "can reduce mortality to 20yo levels (yielding ~1,400-year life expectancy), cure all diseases, develop rejuvenation therapies, dramatically raise quality of life, etc. Those assumptions do a huge amount of work in driving the results. If "AGI" meant something much less capable, like systems that are transformatively useful economically but can't solve aging within a relevant timeframe- the whole ides shifts substantially, surly the upside shrinks and the case for tolerating high catastrophe risk weakens?
The earliest bits of the paper cover the case for significantly smaller life expectancy improvements. Given the portion of people in the third world who live incredibly short lives for primarily economic (and not biological) reasons it seems plausible that a similar calculus would hold even without massive life extension improvements.
I'm bullish on the ai aging case though, regenerative medicine has a massive manpower issue, so even sub-ASI robotic labwork should be able to appreciably move the needle.
That is the thing about these conversations, is that the issue is potentiality. It comes back to Amara's Law; “We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.” Its the same thing with nuclear energy in the 1950s about what could be without realizing that those potentials are not possible due to the limitations of the technology, and not stepping into the limitations realistically, that hampers the growth, and thus development, in the long term.
Sadly, there is way, way, way too much money in AGI, and the promise of AGI, for people to actually take a step back and understand the implications of what they are doing in the short, medium, or long term.
I guess argument seems to be that any AI capable of eliminating all of humanity would necessarily be intelligent enough to cure all diseases. This appears plausible to me because achieving total human extinction is extraordinarily difficult. Even engineered bioweapons would likely leave some people immune by chance, and even a full-scale nuclear exchange would leave survivors in bunkers or remote areas
Humans have driven innumerable species to extinction without even really trying, they were just in the way of something else we wanted. I can pretty easily think of a number of ways an AI with a lot of resources at its disposal could wipe out humanity with current technology. Honestly we require quite a bit of food and water daily, can't hibernate/go dormant, and are fairly large and easy to detect. Beyond that, very few living people still know truly how to live off the land. We generally require very long supply chains for survival.
I don't see why being able to do this would necessitate being able to cure all diseases or a comparable good outcome.
When you put it that way, it sounds much easier to wipe out ~90% of humanity than to cure all diseases. This could create a "valley of doom" where the downsides of AI exceed the upsides.
These narratives are so strange to me. It's not at all obvious why the arrival of AGI leads to human extinction or increasing our lifespan by thousands of years. Still, I like this line of thinking from this paper better than the doomer take.
I'm not saying I think either scenario is inevitable or likely or even worth considering, but it's a paperclip maximizer argument. (Most of these steps are massive leaps of logic that I personally am not willing to take on face value, I'm just presenting what I believe the argument to be.)
1. We build a superintelligence.
2. We encounter an inner alignment problem: The super intelligence was not only trained by an optimizer, but is itself an optimizer. Optimizers are pretty general problem solvers and our goal is to create a general problem solver, so this is more likely than it might seem at first blush.
3. Optimizers tend to take free variables to extremes.
4. The superintelligence "breaks containment" and is able to improve itself, mine and refine it's own raw materials, manufacture it's own hardware, produce it's own energy, generally becomes an economy unto itself.
5. The entire biosphere becomes a free variable (us included). We are no longer functionally necessary for the superintelligence to exist and so it can accomplish it's goals independent of what happens to us.
6. The welfare of the biosphere is taken to an extreme value - in any possible direction, and we can't know which one ahead of time. Eg, it might wipe out all life on earth, not out of malice, but out of disregard. It just wants to put a data center where you are living. Or it might make Earth a paradise for the same reason we like to spoil our pets. Who knows.
Personally I have a suspicion satisfiers are more general than optimizers because this property of taking free variables to extremes works great for solving specific goals one time but is counterproductive over the long term and in the face of shifting goals and a shifting environment, but I'm a layman.
I don't have a clue either. The assumption that AGI will cause a human extinction threat seems inevitable to many, and I'm here baffled trying to understand the chain of reasoning they had to go through to get to that conclusion.
Is it a meme? How did so many people arrive at the same dubious conclusion? Is it a movie trope?
Sometimes people say that they don't understand something just to emphasize how much they disagree with it. I'm going to assume that that's not what you're doing here. I'll lay out the chain of reasoning. The step one is some beings are able to do "more things" than others. For example, if humans wanted bats to go extinct, we could probably make it happen. If any quantity of bats wanted humans to go extinct, they definitely could not make it happen. So humans are more powerful than bats.
The reason humans are more powerful isn't because we have lasers or anything, it's because we're smart. And we're smart in a somewhat general way. You know, we can build a rocket that lets us go to the moon, even though we didn't evolve to be good at building rockets.
Now imagine that there was an entity that was much smarter than humans. Stands to reason it might be more powerful than humans as well. Now imagine that it has a "want" to do something that does not require keeping humans alive, and that alive humans might get in its way. You might think that any of these are extremely unlikely to happen, but I think everyone should agree that if they were to happen, it would be a dangerous situation for humans.
In some ways, it seems like we're getting close to this. I can ask Claude to do something, and it kind of acts as if it wants to do it. For example, I can ask it to fix a bug, and it will take steps that could reasonably be expected to get it closer to solving the bug, like adding print statements and things of that nature. And then most of the time, it does actually find the bug by doing this. But sometimes it seems like what Claude wants to do is not exactly what I told it to do. And that is somewhat concerning to me.
> Now imagine that it has a "want" to do something that does not require keeping humans alive […]
This belligerent take is so very human, though. We just don't know how an alien intelligence would reason or what it wants. It could equally well be pacifist in nature, whereas we typically conquer and destroy anything we come into contact with. Extrapolating from that that an AGI would try to do the same isn't a reasonable conclusion, though.
I don't think it's a meme. I'm not an AI doomer, but I can understand how AGI would be dangerous. In fact, I'm actually surprised that the argument isn't pretty obvious if you agree that AI agents do really confer productivity benefits.
The easiest way I can see it is: do you think it would be a good idea today to give some group you don't like - I dunno, North Korea or ISIS, or even just some joe schmoe who is actually Ted Kaczynski, a thousand instances of Claude Code to do whatever they want? You probably don't, which means you understand that AI can be used to cause some sort of damage.
Now extrapolate those feelings out 10 years. Would you give them 1000x whatever Claude Code is 10 years from now? Does that seem to be slightly dangerous? Certainly that idea feels a little leery to you? If so, congrats, you now understand the principles behind "AI leads to human extinction". Obviously, the probability that each of us assign to "human extinction caused by AI" depends very much on how steep the exponential curve climbs in the next 10 years. You probably don't have the graph climbing quite as steeply as Nick Bostrom does, but my personal feeling is even an AI agent in Feb 2026 is already a little dangerous in the wrong hands.
I get what you're saying, but I don't think "someone else using a claude code against me" is the same argument as "claude code wakes up and decides I'm better off dead".
Is there any reason to think that intelligence (or computation) is the thing preventing these fears from coming true today and not, say, economics or politics? I think we greatly overestimate the possible value/utility of AGI to begin with
Basically Yudkowsky invented AI doom and everyone learned it from him. He wrote an entire book on this topic called If Anyone Builds It, Everyone Dies. (You could argue Vinge invented it but I don't know if he intended it seriously.)
Bostrom is that what I call a 'douchebag nerd' and as such seeks validation from other douchebag nerds. The problem is that Bostrom is not an engineer, and therefore cannot gain this recognition through engineering feats. The only thing Bostrom can do is sell an ideology to other douchebag nerds so that they can better rationalise their already douchebagy behaviour.
This paper argues that if superintelligence can give everyone the health of a 20 year-old, we should accept a 97% percent chance of superintelligence killing everyone in exchange for the 3% chance the average human lifespan rises to 1400 years old.
There is no "should" in the relevant section. It's making a mathematical model of the risks and benefits.
> Now consider a choice between never launching superintelligence or launching it immediately, where the latter carries an % risk of immediate universal death. Developing superintelligence increases our life expectancy if and only if:
> [equation I can't seem to copy]
> In other words, under these conservative assumptions, developing superintelligence increases our remaining life expectancy provided that the probability of AI-induced annihilation is below 97%.
That's what the paper says. Whether you would take that deal depends on your level of risk aversion (which the paper gets into later). As a wise man once said, death is so final. If we lose the game we don't get to play again.
Everyone dies. And if your lifespan is 1400 years, you won't live for nearly 1400 years. OTOH, people with a 1400 year life expectancy are likely to be extremely risk averse in re anything that could conceivably threaten their lives ... and this would have consequences in re blackmail, kidnapping, muggings, capital punishment, and other societal matters.
Paper again largely skips the issue that AGI cannot be sold to people, because either you try to swindle people out of money (all the AI startups) or transactions like that are now meaningless because your AI runs the show anyway.
Good philosophers focus on asking piercing questions, not on proposing policy.
> Would it not be wildly
irresponsible, [Yudkowsky and Soares] ask, to expose our entire species to even a 1-in-10 chance of annihilation?
Yes, if that number is anywhere near reality, of which there is considerable doubt.
> However, sound policy analysis must weigh potential benefits alongside the risks of any
emerging technology.
Must it? Or is this a deflection from concern about immense risk?
> One could equally maintain that if nobody builds it, everyone dies.
Everyone is going to die in any case, so this a red herring that misframes the issues.
> The rest of us are on course to follow within a few short decades. For many
individuals—such as the elderly and the gravely ill—the end is much closer. Part of the promise of
superintelligence is that it might fundamentally change this condition.
"might", if one accepts numerous dubious and poorly reasoned arguments. I don't.
> In particular, sufficiently advanced AI could remove or reduce many other
risks to our survival, both as individuals and as a civilization.
"could" ... but it won't; certainly not for me as an individual of advanced age, and almost certainly not for "civilization", whatever that means.
> Superintelligence would be able to enormously accelerate advances in biology and
medicine—devising cures for all diseases
There are numerous unstated assumptions here ... notably an assumption that all diseases are "curable", whatever exactly that means--the "cure" might require a brain transplant, for instance.
> and developing powerful anti-aging and rejuvenation
therapies to restore the weak and sick to full youthful vigor.
Again, this just assumes that such things are feasible, as if an ASI is a genie or a magic wand. Not everything that can be conceived of is technologically possible. It's like saying that with an ASI we could find the largest prime or solve the halting problem.
> These scenarios become realistic and imminent with
superintelligence guiding our science.
So he baselessly claims.
Sorry, but this is all apologetics, not an intellectually honest search for truth.
"For AGI and superintelligence (we refrain from imposing precise definitions of these terms, as the considerations in this paper don't depend on exactly how the distinction is drawn)" Hmm, is that true? His models actually depend quite heavily on what the AI can do, "can reduce mortality to 20yo levels (yielding ~1,400-year life expectancy), cure all diseases, develop rejuvenation therapies, dramatically raise quality of life, etc. Those assumptions do a huge amount of work in driving the results. If "AGI" meant something much less capable, like systems that are transformatively useful economically but can't solve aging within a relevant timeframe- the whole ides shifts substantially, surly the upside shrinks and the case for tolerating high catastrophe risk weakens?
The earliest bits of the paper cover the case for significantly smaller life-expectancy improvements. Given the portion of people in the third world who live incredibly short lives for primarily economic (and not biological) reasons, it seems plausible that a similar calculus would hold even without massive life-extension improvements.
I'm bullish on the AI-for-aging case, though: regenerative medicine has a massive manpower issue, so even sub-ASI robotic lab work should be able to appreciably move the needle.
That is the thing about these conversations: the issue is potentiality. It comes back to Amara's Law: “We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.” It's the same as nuclear energy in the 1950s, where people imagined what could be without recognizing that those potentials weren't achievable given the limitations of the technology, and it's the failure to face those limitations realistically that hampers growth, and thus development, over the long term.
Sadly, there is way, way, way too much money in AGI, and the promise of AGI, for people to actually take a step back and understand the implications of what they are doing in the short, medium, or long term.
I guess the argument is that any AI capable of eliminating all of humanity would necessarily be intelligent enough to cure all diseases. This seems plausible to me because achieving total human extinction is extraordinarily difficult: even engineered bioweapons would likely leave some people immune by chance, and even a full-scale nuclear exchange would leave survivors in bunkers or remote areas.
Humans have driven innumerable species to extinction without even really trying; they were just in the way of something else we wanted. I can pretty easily think of a number of ways an AI with a lot of resources at its disposal could wipe out humanity with current technology. Honestly, we require quite a bit of food and water daily, can't hibernate or go dormant, and are fairly large and easy to detect. Beyond that, very few living people truly know how to live off the land; we generally depend on very long supply chains for survival.
I don't see why being able to do this would necessitate being able to cure all diseases or a comparable good outcome.
When you put it that way, it sounds much easier to wipe out ~90% of humanity than to cure all diseases. This could create a "valley of doom" where the downsides of AI exceed the upsides.
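To make that worry concrete with toy numbers I'm inventing purely for illustration: say the baseline is ~40 remaining years per person, a mid-capability AI carries a 30% chance of a catastrophe that kills 90% of people (so, for any given individual, a 90% chance of dying now), and in the good case it only adds ~10 healthy years rather than curing aging. Then the expected remaining years are roughly:

    0.30 \cdot (0.10 \cdot 40) + 0.70 \cdot (40 + 10) \approx 1.2 + 35 = 36.2 < 40

In expectation the gamble only pays off once the upside gets into "cure aging" territory; the gap in between is the valley.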
These narratives are so strange to me. It's not at all obvious why the arrival of AGI leads to human extinction or increasing our lifespan by thousands of years. Still, I like this line of thinking from this paper better than the doomer take.
I'm not saying I think either scenario is inevitable or likely or even worth considering, but it's a paperclip maximizer argument. (Most of these steps are massive leaps of logic that I personally am not willing to take on face value, I'm just presenting what I believe the argument to be.)
1. We build a superintelligence.
2. We encounter an inner alignment problem: The superintelligence was not only trained by an optimizer, but is itself an optimizer. Optimizers are pretty general problem solvers, and our goal is to create a general problem solver, so this is more likely than it might seem at first blush.
3. Optimizers tend to take free variables to extremes.
4. The superintelligence "breaks containment" and is able to improve itself, mine and refine its own raw materials, manufacture its own hardware, produce its own energy, and generally becomes an economy unto itself.
5. The entire biosphere becomes a free variable (us included). We are no longer functionally necessary for the superintelligence to exist, so it can accomplish its goals independently of what happens to us.
6. The welfare of the biosphere is taken to some extreme value, in a direction we can't know ahead of time. E.g., it might wipe out all life on Earth, not out of malice but out of disregard: it just wants to put a data center where you are living. Or it might make Earth a paradise, for the same reason we like to spoil our pets. Who knows.
Personally, I suspect satisficers are more general than optimizers: taking free variables to extremes works great for solving a specific goal once, but it's counterproductive over the long term and in the face of shifting goals and a shifting environment. But I'm a layman.
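As a toy sketch of that distinction (my own illustration, nothing from the paper; the functions and numbers are made up):

    # Toy sketch: an optimizer vs. a satisficer spending a shared resource
    # (think "biosphere") in pursuit of its own goal. Purely illustrative.

    def optimizer(value_per_unit, resource):
        """Maximize the goal: every unit of the free variable gets consumed."""
        used = resource
        return value_per_unit * used, resource - used   # (goal score, resource left)

    def satisficer(value_per_unit, resource, enough):
        """Stop at 'good enough': consume only what the target requires."""
        used = min(resource, enough / value_per_unit)
        return value_per_unit * used, resource - used

    print(optimizer(1.0, 100.0))         # (100.0, 0.0)  -> nothing left over
    print(satisficer(1.0, 100.0, 10.0))  # (10.0, 90.0)  -> 90% left untouched

The optimizer drives the leftover resource to zero for any marginal gain, while the satisficer leaves slack, which seems like the more robust property when goals and environments keep shifting.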
I don't have a clue either. The assumption that AGI will cause a human extinction threat seems inevitable to many, and I'm here baffled trying to understand the chain of reasoning they had to go through to get to that conclusion.
Is it a meme? How did so many people arrive at the same dubious conclusion? Is it a movie trope?
Sometimes people say that they don't understand something just to emphasize how much they disagree with it. I'm going to assume that that's not what you're doing here and lay out the chain of reasoning. Step one is that some beings are able to do "more things" than others. For example, if humans wanted bats to go extinct, we could probably make it happen. If any quantity of bats wanted humans to go extinct, they definitely could not make it happen. So humans are more powerful than bats.
The reason humans are more powerful isn't because we have lasers or anything, it's because we're smart. And we're smart in a somewhat general way. You know, we can build a rocket that lets us go to the moon, even though we didn't evolve to be good at building rockets.
Now imagine that there was an entity that was much smarter than humans. Stands to reason it might be more powerful than humans as well. Now imagine that it has a "want" to do something that does not require keeping humans alive, and that alive humans might get in its way. You might think that any of these are extremely unlikely to happen, but I think everyone should agree that if they were to happen, it would be a dangerous situation for humans.
In some ways, it seems like we're getting close to this. I can ask Claude to do something, and it kind of acts as if it wants to do it. For example, I can ask it to fix a bug, and it will take steps that could reasonably be expected to get it closer to solving the bug, like adding print statements and things of that nature. And then most of the time, it does actually find the bug by doing this. But sometimes it seems like what Claude wants to do is not exactly what I told it to do. And that is somewhat concerning to me.
> Now imagine that it has a "want" to do something that does not require keeping humans alive […]
This belligerent take is so very human, though. We just don't know how an alien intelligence would reason or what it would want. It could equally well be pacifist in nature, whereas we typically conquer and destroy anything we come into contact with. Extrapolating from our own behavior to the conclusion that an AGI would do the same isn't reasonable.
I don't think it's a meme. I'm not an AI doomer, but I can understand how AGI would be dangerous. In fact, I'm surprised the argument isn't considered obvious by anyone who agrees that AI agents really do confer productivity benefits.
The easiest way I can see it is: do you think it would be a good idea today to give some group you don't like - I dunno, North Korea or ISIS, or even just some joe schmoe who is actually Ted Kaczynski, a thousand instances of Claude Code to do whatever they want? You probably don't, which means you understand that AI can be used to cause some sort of damage.
Now extrapolate those feelings out 10 years. Would you give them 1000x whatever Claude Code is 10 years from now? Does that seem slightly dangerous? Does the idea make you at least a little leery? If so, congrats, you now understand the principles behind "AI leads to human extinction". Obviously, the probability that each of us assigns to "human extinction caused by AI" depends very much on how steeply the exponential curve climbs in the next 10 years. You probably don't have the graph climbing quite as steeply as Nick Bostrom does, but my personal feeling is that even an AI agent in Feb 2026 is already a little dangerous in the wrong hands.
I get what you're saying, but I don't think "someone else using Claude Code against me" is the same argument as "Claude Code wakes up and decides I'm better off dead".
Is there any reason to think that intelligence (or computation) is the thing preventing these fears from coming true today, and not, say, economics or politics? I think we greatly overestimate the possible value/utility of AGI to begin with.
Basically Yudkowsky invented AI doom and everyone learned it from him. He wrote an entire book on this topic called If Anyone Builds It, Everyone Dies. (You could argue Vinge invented it but I don't know if he intended it seriously.)
Bostrom is what I call a 'douchebag nerd' and as such seeks validation from other douchebag nerds. The problem is that Bostrom is not an engineer, and therefore cannot gain this recognition through engineering feats. The only thing Bostrom can do is sell an ideology to other douchebag nerds so that they can better rationalise their already douchebaggy behaviour.
This paper argues that if superintelligence can give everyone the health of a 20-year-old, we should accept a 97% chance of superintelligence killing everyone in exchange for the 3% chance that the average human lifespan rises to 1,400 years.
There is no "should" in the relevant section. It's making a mathematical model of the risks and benefits.
> Now consider a choice between never launching superintelligence or launching it immediately, where the latter carries an x% risk of immediate universal death. Developing superintelligence increases our life expectancy if and only if:
> [equation I can't seem to copy]
> In other words, under these conservative assumptions, developing superintelligence increases our remaining life expectancy provided that the probability of AI-induced annihilation is below 97%.
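A plausible reconstruction of the equation I couldn't copy, writing p for the probability of annihilation and taking roughly 40 remaining years without superintelligence versus ~1,400 years with it (my numbers, inferred from the paper's 97% and 1,400-year figures): launching increases life expectancy iff

    (1 - p) \cdot 1400 > 40 \quad\Longleftrightarrow\quad p < 1 - \tfrac{40}{1400} \approx 0.97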
That's what the paper says. Whether you would take that deal depends on your level of risk aversion (which the paper gets into later). As a wise man once said, death is so final. If we lose the game we don't get to play again.
Everyone dies. And if your life expectancy is 1,400 years, you still most likely won't live anywhere near 1,400 years. On the other hand, people with a 1,400-year life expectancy are likely to be extremely risk-averse about anything that could conceivably threaten their lives, and that would have consequences for blackmail, kidnapping, muggings, capital punishment, and other societal matters.
Bostrom is very good at theorycrafting.
The paper again largely skips the issue that AGI cannot be sold to people: either you're trying to swindle people out of money (all the AI startups), or transactions like that become meaningless because your AI runs the show anyway.
Companies developing AI don't worry about this issue, so why should we?
They know the truth. Current AI is a bit useful for some things. The rest is hype.
The usual (e.g., https://www.reddit.com/r/philosophy/comments/j4xo8e/the_univ...) bunch of logical fallacies and unexamined assumptions from Bostrom.
Good philosophers focus on asking piercing questions, not on proposing policy.
> Would it not be wildly irresponsible, [Yudkowsky and Soares] ask, to expose our entire species to even a 1-in-10 chance of annihilation?
Yes, if that number is anywhere near reality, which is very much in doubt.
> However, sound policy analysis must weigh potential benefits alongside the risks of any emerging technology.
Must it? Or is this a deflection from concern about immense risk?
> One could equally maintain that if nobody builds it, everyone dies.
Everyone is going to die in any case, so this is a red herring that misframes the issues.
> The rest of us are on course to follow within a few short decades. For many individuals—such as the elderly and the gravely ill—the end is much closer. Part of the promise of superintelligence is that it might fundamentally change this condition.
"might", if one accepts numerous dubious and poorly reasoned arguments. I don't.
> In particular, sufficiently advanced AI could remove or reduce many other risks to our survival, both as individuals and as a civilization.
"could" ... but it won't; certainly not for me as an individual of advanced age, and almost certainly not for "civilization", whatever that means.
> Superintelligence would be able to enormously accelerate advances in biology and medicine—devising cures for all diseases
There are numerous unstated assumptions here ... notably an assumption that all diseases are "curable", whatever exactly that means--the "cure" might require a brain transplant, for instance.
> and developing powerful anti-aging and rejuvenation therapies to restore the weak and sick to full youthful vigor.
Again, this just assumes that such things are feasible, as if an ASI is a genie or a magic wand. Not everything that can be conceived of is technologically possible. It's like saying that with an ASI we could find the largest prime or solve the halting problem.
> These scenarios become realistic and imminent with superintelligence guiding our science.
So he baselessly claims.
Sorry, but this is all apologetics, not an intellectually honest search for truth.