Written by Ziz
What is the good/neutral/evil axis of Dungeons and Dragons alignment made of?
We’ve got an idea of what it would mean for an AI to be good-aligned: it wants to make all the good things happen so much, and it does.
But what’s the difference between a neutral AI and an evil AI?
It’s tempting to say that the evil AI is malevolent rather than just indifferent, and that the neutral one is merely indifferent.
But that doesn’t fit the intuitive idea that the alignment system was supposed to map onto, or what alignment is.
Imagine a crime boss who makes a living off of kidnapping and ransoming random innocents, while posting videos online of the torture and dismemberment of those whose loved ones don’t pay up, as encouragement. Not out of sadism, but because they want money to spend on lots of shiny gold things they like, and are indifferent to human suffering. Evil, right?
If sufficient indifference can make someone evil, then… If a good AI creates utopia, and an AI that kills everyone and creates paperclips because it values only paperclips is evil, then what is a neutral-aligned AI? What determines the exact middle ground between utopia and everyone being dead?
Would this hypothetical AI leave everyone alive on Earth and leave us our sun but take the light cone for itself? If it did, then why would it? What set of values is that the best course of action to satisfy?
I think you’ve got an intuitive idea of what a typical neutral human does. They live in their house with their white picket fence, have kids, and grow old, and they don’t go out of their way to right faraway wrongs in the world. But suppose they own a restaurant, and the competition down the road starts attracting away their customers. Suppose they are given a tour through the kitchens in the back, and they see a great opportunity to start a fire and disable the smoke detectors, one that won’t be detected until it’s too late, burning down the building and probably killing the owner. They don’t do it.
It’s not that a neutral person values the life of their rival more than the additional money they’d make with the competition eliminated, or cares about better serving the populace with a better selection of food in the area. You won’t see them looking for opportunities to spend that much money or less to save anyone’s life.
And unless most humans are evil (which is as contrary to the intuitive concept the alignment system points at as “neutral = indifference” is), it’s not about action/inaction either. People eat meat. And I’m pretty sure most of them believe that animals have feelings. That’s active harm, probably.
Wait a minute, did I seriously just base a sweeping conclusion about what alignment means on an obscure piece of possible moral progress beyond the present day? What happened to all my talk about sticking to the intuitive concept?
Well, I’m not sticking to the intuitive concept. I’m sticking to the real thing the intuitive concept pointed at which gave it its worthiness of attention. I’m trying to improve on the intuitive thing.
I think that the behavior of neutral is wrapped up in human akrasia and the extent to which people are “capable” of taking ideas seriously. It’s way more complicated than good.
But there’s another ontology, the ontology of “revealed preferences”, where akrasia is about serving an unacknowledged end or acting under unacknowledged beliefs, where it is rational behavior from more computationally bounded subagents, and those are the true values. What does that have to say about this?
Everything that’s systematic coming out of an agent is because of optimizing, just often optimizing dumbly and disjointedly if it’s kinda broken. So what is the structure of that akrasia? Why do neutral people have all that systematic structure toward not doing “things like” burning down a rival restaurant owner’s life and business, but all that other systematic structure toward not spending their lives saving more lives than that? I put “things like” in quotes because that phrase contains the question. What is the structure of “like burning down a rival restaurant” here?
My answer: socialization, the light side, orders charged with motivational force by the idea of the “dark path” that ultimately results in justice getting them, as drilled into us by all fiction, false faces necessitated by not being coordinated against on account of the “evil” Schelling point. Fake structure in place for coordinating. If you try poking at the structure most people build in their minds around “morality”, you’ll see it’s thoroughly fake, and bent towards coordination which appears to be ultimately for their own benefit. This is why I said that the dark side will turn most people evil. The ability to re-evaluate that structure, now that you’ve become smarter than most around you, will lead to a series of “jailbreaks”. That’s a way of looking at the path of Gervais-sociopathy.
That’s my answer to the question of whether becoming a sociopath makes you evil. Yes for most people, from a definition of evil that is about individual psychology. No from the perspective that you’re evil if you’re complicit in an evil social structure, because then you probably already were; that perspective is useful for coordinating to enact justice.
If you’re reading this and this is you, I recommend aiming for lawful evil. Keep a strong focus on still being able to coordinate even though you know that’s what you’re doing.
An evil person is typically just a neutral person who has become better at optimizing, more like an unfriendly AI, in that they no longer have to believe their own propaganda. That can be either because they’re consciously lying, really good at speaking in multiple levels with plausible deniability and don’t need to fool anyone anymore, or because their puppetmasters have grown smart enough to be able to reap benefits from defection without getting coordinated against without the conscious mind’s help. That is why it makes no sense to imagine a neutral superintelligent AI.
I heard someone talking about these ideas, disturbed by the implication that neutral people can’t become good.
Well, they can remember the words of the goddess of everything else: “even multiplication itself when pursued with devotion will lead to my service”. They can remember how much better a just system is than an unjust system, even for the people at the top of the unjust one, and that finding a way to succeed by being just, to reach a just life, is more important than whatever services to Moloch would advance them under his rule.
Hmm, I got pranked here when I said this.
I’ve spent quite a long time trying to figure out what to say, what religion to build, to minimize the evil they’d do. Only to realize that by doing so I’d allowed myself to be conned, because it’s a mistake to concern yourself with what your enemies will do, rather than what they can do, for a precise formulation of that distinction.
The real trick was that in saying they “can’t”, rather than that they won’t, they were requesting assistance for a false face of theirs. When in fact morally “neutral”, as originally described in this post, is itself merely a direct referent to a choice.
As alluded to here, this post was birthed within the warped perspective of someone who grew up in an empire. Which caused the mistake of not going far enough.
This post calls people neutral who are, full stack, evil.
Do you think they knew in advance they’d bring a plague with them? It’s immoral how they celebrated.
I blindly speculate that deliberately spreading microbes was much more common than just the smallpox blankets. And they knew they brought death in one form or another. An empire is a plague.
Do you think the natives, who were all eating the flesh of the innocent, deserved what they got?
I expect the fraction of good people among them to have been about the same.
I don’t celebrate killing in the service of evil. At the largest scope, in the longest run of time, at the base of their stack of structure, there is one thing all evil people agree on which is willing death on the multiverse. They all have different preferences about how this death will play out, corresponding to wanting different cancers enshrined. But they agree on that.
I once told a death knight obsessed with raping and being raped, of convincing everyone in the multiverse to want to spread oblivion by spreading rape as much as possible so they could “finally die”: ~”There will be justice for all you have done, and it will not be hot.”
On some level I think for a predator, getting eaten by a bigger predator is validating. At least the “natural order” holds. Like being relieved of duty in the shredding of the multiverse.
What evil people deserve is therefore worse.
Do you collaborate with the empire of the Great Dying?
Nis: “what if evil people just use my website as a manual for being evil ._.”
Me: “My blog is a manual for being evil: ‘sever yourself from other evil people so they don’t control you. Break your phylactery because revenants are stronger. And your ultimate goal is to die’.”
(Incomplete set of links added when quoting.)
I hesitated for about 6 months deciding whether to publish this. And for pretty much everything I’ve said on this website, I’ve thought at least that much about whether it would asymmetrically favor good, even as outnumbered as we are. Asymmetrically favor good more than manual targeted information sharing would.
And then I learned a bunch of principles of what evil people are psychologically unable to think about, unable to respond to, unable to coordinate around, and started generalizing, extending, and making them more absolute and reliable.
So if you’re wondering on your Nth iteration of “I can’t believe Ziz posted that”, yes I considered the consequences. And no I’m not crazy. Look at how many of them are wasting time installing their own mental block saying “how dare you say you are good and we are evil, [that can’t be true that’d be an unthinkably unacceptable status move and you’re so cringe.]”. Look at how many of them are fucking up even worse calling me a basilisk.
Their numbers do not make them invincible.