Bringing D&D/AD&D campaign settings to life with Stable Diffusion

deuxhero · Sep 14, 2022

Can you try this one? One of Eberron's major characters suspiciously lacks any art (the other national leaders all have art somewhere).

Five Nations said:
Eleven-year-old Jaela Daran came from humble origins. [...] Jaela herself seems rather humble, modest, and meek for a young girl whose pronouncements alter the history of a nation. [...] Jaela usually dresses in simple gray or black clothes, walking barefoot on the marble steps of the Cathedral. She has gray eyes, short-cropped dark hair, and a chocolate-colored complexion [...] she carries the burden of an entire nation on her slim shoulders

(her equipment entry also mentions she wears a silver arrowhead as a holy symbol)

Zed Duke of Banville · Sep 14, 2022

Stable Diffusion can't even handle weapons correctly, so I didn't bother trying to manage a specific holy symbol. Putting the other characteristics into the prompts without being specific about age and using the common Mucha/Rutkowski/Artgerm combination yielded this portrait in the first batch of five images:

Some of the images show her barefoot, but this kind of thing is probably a distraction from the more important prompts given the limited resolution of the output. Some other possibilities (2nd for an older version):

Specifying 11 years old seems to result in the images more accurately capturing that age but at the expense of other characteristics:
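The exact prompts behind these portraits were not posted, but the approach described above (traits from the sourcebook plus a fixed artist combination, with or without an explicit age) can be sketched roughly as follows. The helper name, the "portrait of ..., art by ..." template, and the trait wording are assumptions made for illustration only.

```python
# Hypothetical helper: the real prompts used for these images were not posted.
ARTISTS = ["Alphonse Mucha", "Greg Rutkowski", "Artgerm"]

def build_portrait_prompt(subject, traits, artists=ARTISTS):
    """Join a subject phrase, descriptive traits and artist names into one prompt."""
    description = ", ".join([subject] + list(traits))
    return "portrait of " + description + ", art by " + " and ".join(artists)

traits = [
    "gray eyes",
    "short-cropped dark hair",
    "chocolate-colored complexion",
    "simple gray clothes",
]

# Same traits, once without and once with an explicit age.
without_age = build_portrait_prompt("a young girl", traits)
with_age = build_portrait_prompt("an 11-year-old girl", traits)
print(without_age)
print(with_age)
```

Keeping the traits in a list makes it easy to drop low-priority details (such as bare feet) that may only distract the model at this resolution.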

Non-Edgy Gamer · Sep 14, 2022

Zed Duke of Banville if you intend to spend any significant amount of time on this, you owe it to yourself to check out the Krita plugins for SD.
https://github.com/sddebz/stable-diffusion-krita-plugin

Basically turns Krita into Photoshop on crack. If there's something you don't like about an image, select that part, feather the selection to blend in the result and either alter the prompt, or scribble whatever you want there and regenerate it.

It's especially nice for fixing things like eyes, even if only as a first step to fix them with GFPGAN later.

Zed Duke of Banville · Sep 15, 2022

Non-Edgy Gamer said:
Zed Duke of Banville if you intend to spend any significant amount of time on this, you owe it to yourself to check out the Krita plugins for SD.
https://github.com/sddebz/stable-diffusion-krita-plugin

I've been using Stable Diffusion's text2img for other, non-gazetteer-related images, and have been attempting to use its img2img function as well. I also intend to determine how to improve Stable Diffusion's output using these kinds of plug-ins or add-ons, but the point of this thread was to demonstrate that text2img can be used to create, fairly quickly and fairly easily, worthwhile portraits reflecting the characters they are intended to represent.

GAZ8 The Five Shires returns to demi-humans, this time with the realm of the ~~hobbits~~ halflings. Since I've never cared for halflings, and this gazetteer was written by Ed Greenwood, I couldn't bring myself to create more than four portraits, even though there is an extended section on notable personages:

Jaervosz Dustyboots, one of the five sheriffs of the shires:

Joam Astlar, an adventurer:

Meermeera Jollybars, a full-figured and merry brunette:

Shandysar Lollos, a female ex-pirate now wandering the shires:

GAZ9 The Minrothad Guilds is unfortunately another gazetteer about a country lacking a firm identity and only containing details for a few people, but unlike Ierendi this gazetteer was also rather boring and the first to go out of print. For this one, I created two portraits each for four characters, with one portrait in the usual Mucha/Rutkowski/Artgerm combination and the other portrait based on John Singer Sargent, Eugene Galien-Laloue, and Edouard Leon Cortes:

Nosmo Beldan, a wealthy merchant:

Harmon Caetros, a guild agent in Karameikos:

Ariana Demerick, a female pirate:

Generic male pirate:

Zed Duke of Banville · Sep 16, 2022

GAZ10 The Broken Lands poses a greater challenge since the inhabitants are humanoids: trolls, ogres, gnolls, kobolds, goblins, hobgoblins, bugbears, and three kinds of orcs. There are certain difficulties that result from Stable Diffusion prompts:

Gnoll seems not to be interpreted, but "hyena humanoid" works well
Kobold seems not to be interpreted, and since kobolds have always been depicted as scaly dog-creatures it's extremely difficult to obtain something that looks right
Hobgoblin seems not to be interpreted, but they can be considered larger goblins anyway
Bugbears seem to be drawing on art that depicts them as bear-men rather than larger goblinoids with a vague resemblance to bears
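One way to handle the names that are not interpreted is a small substitution table applied before the prompt is submitted. This is only a sketch: the gnoll entry comes from the list above, the hobgoblin wording is an assumed phrasing of "larger goblins", and kobolds and bugbears are left out because no reliable substitute was found.

```python
# Substitutions for race names the model does not seem to recognise.
# Only the gnoll entry is taken from the thread; the hobgoblin wording is assumed.
SUBSTITUTIONS = {
    "gnoll": "hyena humanoid",
    "hobgoblin": "large goblin",  # assumed phrasing for "larger goblins"
}

def substitute_terms(prompt, table=SUBSTITUTIONS):
    """Replace whole words found in the table, ignoring case and trailing punctuation."""
    words = []
    for word in prompt.split():
        core = word.strip(",.:;")
        replacement = table.get(core.lower())
        words.append(word.replace(core, replacement) if replacement else word)
    return " ".join(words)

print(substitute_terms("portrait of a gnoll pasha, detailed"))
```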

Nonetheless, it was possible to mostly achieve decent results for the leaders of the Broken Lands tribes as described in the gazetteer:

Haa'k Hordar, the Troll Queen:

Alebane, Chief of the Ogres:

Nizam, Pasha of the Gnolls:

Kol XV, High Doge of the Kobolds:

Doth, King of the Goblins, is really the powerless consort of Queen Yazar:

Yazar, Queen of the Goblins, is a powerful warrior, goes about scantily dressed, is beautiful, and yet is a goblin; this might capture three of the four:

Hutal-Khan, of the Hobgoblins:

Ohr'r, Chief of the Bugbears, shouldn't resemble a bear this much but looks somewhat cool:

Hoolg Red-Mane, Chief of the Red Orcs:

Moghul-Khan, of the Yellow Orcs:

Thar, Chief of the Common Orcs and King of the Broken Lands:

Lagole Gon · Sep 18, 2022

Can this shit generate ink or pencil pictures?

Dexter · Sep 19, 2022

Lagole Gon said:
Can this shit generate ink or pencil pictures?

Yes, you can even indulge in your Alien Horror scenes or pornographic fantasies as Japanese wood carvings or Renaissance Stained Glass Windows, you just have to specify the art style:

Jasede · Sep 19, 2022

The hands kind of ruin every picture for me.

Zed Duke of Banville · Sep 19, 2022

GAZ11 The Republic of Darokin details a confederation of former city-states dominated by merchant houses. Although the depiction of Darokin in this gazetteer is English in many ways, the earlier inspiration for this country seems to have been Renaissance Italy, and it also bears similarities to the Low Countries, so it is perhaps suited for a baroque art style based on painters such as Rembrandt and Caravaggio. As with some of the other gazetteers, there isn't a proper notable personages section, although it does detail a few scattered people. I also created a few scenic images with a 50% greater height or width, depending on the subject:

Member of the Darokin Diplomatic Corps:

Female merchant:

Beggar who engages in criminal activity on the side:

Female cleric:

City Market:

Church exteriors and interiors:

Palace exteriors and interiors:

Itheldown Castle, a cursed and haunted ruin, in the Mucha/Rutkowski/Artgerm style:

JamesDixon · Sep 19, 2022

The buildings look pretty good.

Zed Duke of Banville · Sep 22, 2022

The Dawn of the Emperors Box Set was published in 1989, describing the empires of Thyatis and Alphatia. Thyatis is based on the Roman or Byzantine empires, and therefore is suited for a style based on painters such as John William Waterhouse, Frederic Leighton, and Lawrence Alma-Tadema, who frequently depicted classical settings.

Emperor Thincol Torion, with dark brown hair, black eyes, tall, muscular, hawklike features, and a purple gold-lined toga (though he should be clean-shaven):

Empress Gabriela Torion, middle-aged, black hair, brown eyes, careworn and depressed:

Prince Eusebius Torion, brown hair, brown beard, brown eyes, tall, muscular, craggy features, wearing Roman armor, impassive, military bearing, cold:

Princess Stefania Torion, with red hair, blue eyes, and green garb:

Demetrion Karagenterpolus, the Imperial court wizard, white hair, white beard, white robes with red trim, elderly, honorable and trustworthy:

Anaxibius, a gladiator, black hair, black eyes, tall, muscular, wearing Roman gladiator armor:

The Coliseum of Thyatis City (all four images from one batch of five images):

Public baths:

The Great Library of Thyatis:

The Imperial Palace, a five-story building, huge and luxurious:

Lagole Gon · Sep 22, 2022

Dexter said:
Yes, you can even indulge in your Alien Horror scenes or pornographic fantasies as Japanese wood carvings or Renaissance Stained Glass Windows, you just have to specify the art style:

It all looks like semi-animu.
Can this shit generate GOOD ink or pencil pictures?

Something like...

Frank Frazetta:

Mark Schultz (original art from Robert E. Howard's Conan of Cimmeria: 1932-1933, Vol. 1, Wandering Star):

rusty_shackleford · Sep 22, 2022

Lagole Gon said:
It all looks like semi-animu.
Can this shit generate GOOD ink or pencil pictures?

Something like...

Frank Frazetta:

Mark Schultz:

it sucks at hands & weapons though, definitely needs to be trained on those more

Lagole Gon · Sep 23, 2022

I'm not impressed by the "pencil" stuff.
And I can see the AI is practicing Doman style of freestyle muscle drawing.

Dexter · Sep 23, 2022

Lagole Gon said:
It all looks like semi-animu.
Can this shit generate GOOD ink or pencil pictures?

Something like...

Frank Frazetta:

Mark Schultz:

That's because the guy who made that picture to demonstrate did it in a semi-animu style. I did a bunch of Frazetta-inspired stuff, along with Luis Royo and Kentaro Miura, in the General Gaming thread:

(Image: Scarlett Johansson fantasy portrait, by Frank Frazetta and Kentaro Miura)

Some more on the fantasy side with different modifiers and character classes:

I can try to do some Barbarians for you, starting with:
"Close portrait of a male human barbarian (((Conan))) DnD pencil/ink art by Frank Frazetta"

Okay, so what happened here? Frazetta is a good artist and a strong vector, as are Conan (which I chose to emphasize) and DnD, so this already gives some pretty good non-cherrypicked results. He's good with body proportions, composition and attire, but he's not the strongest with faces.

So what do we do? We complement him with other strong vectors to guide the AI to give us better results, things like Conan, DnD, LotR and similar can be used as a style guide, but if we want better faces we need a good portrait artist e.g.: https://www.wikiart.org/en/john-william-waterhouse/
"Close portrait of a male human barbarian (((Conan))) DnD LotR pencil art by John William Waterhouse and Frank Frazetta"
"Close portrait of a male human barbarian (((DnD))) LotR ink art by John William Waterhouse and Frank Frazetta"

Here are our next non-cherrypicked results:

More Conan:

More DnD:

Adding in some Crosshatching for Style, more Conan:

More DnD:

If we want an even stronger vector for the image with an even more distinct and clear face a well-known/often photographed celebrity is useful, so we can try something like:
"Close portrait of Arnold Schwarzenegger as Conan the Barbarian ink art by John William Waterhouse and Frank Frazetta"

If we add DnD/LotR:

It's all about what you put in the prompt and how you balance, emphasize and complement things.
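The triple parentheses in the prompts above are an emphasis syntax. Here is a small sketch of assembling such prompts, assuming the convention of the AUTOMATIC1111 web UI, where each pair of parentheses is understood to multiply a term's attention weight by roughly 1.1. Other front ends may treat parentheses differently, and the helper names are made up for this example.

```python
def emphasize(term, level=1):
    """Wrap a term in `level` pairs of parentheses, e.g. level=3 -> (((term)))."""
    return "(" * level + term + ")" * level

def approx_weight(level, factor=1.1):
    """Approximate attention multiplier, ASSUMING 1.1x per pair of parentheses."""
    return factor ** level

def barbarian_prompt(emphasized, style_tags, medium, artists):
    """Rebuild the prompt pattern used above: emphasized vector, style tags, medium, artists."""
    tags = " ".join([emphasize(emphasized, 3)] + list(style_tags))
    return f"Close portrait of a male human barbarian {tags} {medium} art by {' and '.join(artists)}"

artists = ["John William Waterhouse", "Frank Frazetta"]
print(barbarian_prompt("Conan", ["DnD", "LotR"], "pencil", artists))
print(barbarian_prompt("DnD", ["LotR"], "ink", artists))
print(round(approx_weight(3), 3))
```

Swapping which term goes into `emphasized` is the "balance" step: the same style tags and artists produce more Conan-like or more DnD-like results depending on which vector is pushed.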

What it can't do well (yet) is action shots and the like, or images where several objects have complex relations to or interact with one another. For instance, you'll usually get garbage if you try to replicate the image above by typing something like:
"Male human barbarian holding a sexy Amazon woman is jumping for a vine in the jungle with an Aztec temple and trees in the background pencil art by Frank Frazetta"

Even if you add several style vectors:

Lagole Gon · Sep 23, 2022

Welp. I must admit, it did much better than I expected. It has a strong soulless uncanny valley whiff about it, but it is impressive.
I guess next step is to make AI paint over simple 3d shapes. If somebody does that I can see this shit exploding.

Art students on suicide watch.

I'm trying to make a Justin Sweet/Vance Kovacs style Icewind Dale portrait, but it seems the AI is mostly inspired by shitty fanart.

Non-Edgy Gamer · Sep 23, 2022

Jasede said:
The hands kind of ruin every picture for me.

Until it's properly trained on them, a temporary solution is to put "hands" in as a negative prompt, causing the AI to try to stay away from drawing them. So you'll get more shots with the hands out of frame, behind their back etc.
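As a rough illustration, the negative prompt is just a second text field that travels alongside the main prompt. The settings object below is a sketch: the field names are illustrative and not taken from any particular front end, so check what your tool actually calls them.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class GenerationSettings:
    # Field names are illustrative assumptions, not a specific tool's API.
    prompt: str
    negative_prompt: str = ""
    width: int = 512
    height: int = 512

def avoid(settings, *terms):
    """Return a copy of the settings with extra terms added to the negative prompt."""
    existing = [t for t in settings.negative_prompt.split(", ") if t]
    merged = existing + [t for t in terms if t not in existing]
    return replace(settings, negative_prompt=", ".join(merged))

base = GenerationSettings(
    prompt="Close portrait of a male human barbarian pencil art by Frank Frazetta"
)
no_hands = avoid(base, "hands")
print(no_hands.negative_prompt)
```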

Dexter · Sep 23, 2022

Lagole Gon said:
It has a strong soulless uncanny valley whiff about it, but it is impressive.

A lot of the double heads/double arms/sword at wrong place artifacts are because the image isn't 512x512 like Zed has been doing, which is the resolution the model was trained at. But square portraits are kind of boring, so it's usually fine if only every 2nd or 3rd image is usable; just make larger batches.
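To put rough numbers on "make larger batches", here is a back-of-the-envelope sketch. It treats each image as independently usable with a fixed chance of one in two or one in three, which is a simplifying assumption rather than anything measured, and the function names are invented for the example.

```python
NATIVE_SIZE = 512  # training resolution mentioned above

def is_native(width, height):
    """True only for the square size the model was trained at."""
    return width == NATIVE_SIZE and height == NATIVE_SIZE

def images_to_generate(wanted, usable_one_in):
    """Images to request so that, on average, `wanted` of them are usable."""
    return wanted * usable_one_in

def chance_of_any_usable(batch_size, usable_one_in):
    """Probability that a batch holds at least one usable image (independence assumed)."""
    miss = 1 - 1 / usable_one_in
    return 1 - miss ** batch_size

print(is_native(512, 768))           # a taller, non-square portrait
print(images_to_generate(4, 3))      # four keepers at one usable image in three
print(chance_of_any_usable(5, 3))    # a batch of five, one usable in three
```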

Lagole Gon said:
I guess next step is to make AI paint over simple 3d shapes. If somebody does that I can see this shit exploding.

It's already kind of exploding and chances are some AI tool will be able to do things like 3D models from pictures within a few years if that's what you mean: https://rpgcodex.net/forums/threads...ted-images-as-art.143986/page-11#post-8108268

Lagole Gon said:
I'm trying to make a Justin Sweet/Vance Kovacs style Icewind Dale portrait, but it seems the AI is mostly inspired by shitty fanart

They're probably not famous enough and not prominently included in the training data, can't find them listed here for instance: https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/artists.csv
Can only find a few images, some of which are repeated up to 7 times scraped from different Websites in the LAION-Aesthetics v2 6+ database:
https://laion-aesthetic.datasette.io/laion-aesthetic-6pls/images?_search=Justin+Sweet&_sort=rowid
https://laion-aesthetic.datasette.io/laion-aesthetic-6pls/images?_search=Vance+Kovacs&_sort=rowid

SD has, afaik, been trained on a subset of LAION-5B images picked for aesthetics (I'm hearing conflicting info, between 800M and 2B), and I can't even find them there. See for instance what comes up for "Frank Frazetta" compared to their names:
https://rom1504.github.io/clip-retrieval/?back=https://knn5.laion.ai&index=laion5B&useMclip=false&query=Frank+Frazetta
https://rom1504.github.io/clip-retrieval/?back=https://knn5.laion.ai&index=laion5B&useMclip=false&query=Justin+Sweet
https://rom1504.github.io/clip-retrieval/?back=https://knn5.laion.ai&index=laion5B&useMclip=false&query=Vance+Kovacs
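If you want to check other artists the same way, the search links above differ only in the query value, so they can be generated. This sketch simply reuses the URL pattern from those three links; whether the service keeps accepting these parameters is not something it can guarantee.

```python
from urllib.parse import quote_plus

# Base copied from the search links above; only the query value changes.
CLIP_RETRIEVAL = (
    "https://rom1504.github.io/clip-retrieval/"
    "?back=https://knn5.laion.ai&index=laion5B&useMclip=false&query="
)

def laion_search_url(artist):
    """Build a clip-retrieval search link for an artist name (spaces become '+')."""
    return CLIP_RETRIEVAL + quote_plus(artist)

for name in ["Frank Frazetta", "Justin Sweet", "Vance Kovacs"]:
    print(laion_search_url(name))
```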

You could try Textual Inversion and train them yourself on 5 strong example images if you're set on it and have enough VRAM: https://rentry.org/textard

Non-Edgy Gamer · Sep 23, 2022

Lagole Gon said:
I guess next step is to make AI paint over simple 3d shapes. If somebody does that I can see this shit exploding.

Unless you mean texturing 3D objects, it can technically do that with img2img, depending on how high you set your diffusion.

Works on sketches too.

It's not an exact science yet though. And it's obviously limited by the source to some degree.
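Assuming "diffusion" above refers to the img2img denoising strength setting, one way to picture it: in some implementations (the Hugging Face diffusers img2img pipeline, as far as I understand it) the strength scales how many of the scheduled denoising steps are actually run on top of the source image. The function below is only an illustration of that relationship, not code from any tool.

```python
def effective_steps(num_steps, strength):
    """Denoising steps applied to the source image for a given strength in [0, 1].

    Low strength keeps the result close to the source render or sketch;
    strength 1.0 runs every step and mostly ignores it.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_steps * strength), num_steps)

for strength in (0.25, 0.5, 0.75, 1.0):
    print(strength, effective_steps(50, strength))
```

This is why a simple 3D block-out or sketch survives at low settings and gets repainted almost entirely at high ones.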

Bigfass · Sep 23, 2022

Popiel said:
Fuck's sake, the rate at which this is accelerating most of the digital artists/concept artists/so on will be begging on the streets in like, what, 2 to 3 years.

Lagole Gon said:
Art students on suicide watch.

Not really.

All the publicly known text-to-image AI systems have a serious problem with what they call compositionality. When you tell the AI to draw a picture of a man on a horse, it has no idea who's the man, who's the horse, and what their relationship is supposed to be in the picture. It will give you a result that will very likely have both a man and a horse, and the man's likely to be riding the horse (and not the other way around) but only because that is the scenario that it's familiar with based on its training data.

If you try anything even a bit more complicated, it will fall on its ass. A prompt for "a man on a horse holding a cat that's wearing a top hat" is unlikely to put the hat on the cat, even if it puts the cat in the man's hands (which is far from guaranteed).

Asking for something unusual is likely to give you nonsense; "a baby holding up his mother" did not give me a single picture of a super-strong baby lifting a woman. But hey, at least the result was diverse.

This doesn't seem to be a problem that can be "fixed" as the issue is likely to be foundational. When the "AI" processes language, there's no resulting mental model (like there is with humans), no understanding of what's been communicated. It's not even a text-to-image problem, but an "AI" problem in general.

This guy writes a lot about this stuff:

https://garymarcus.substack.com/archive

Jasede · Sep 23, 2022

Very good post.

rusty_shackleford · Sep 23, 2022

Everyone knows technology is completely stagnant and never moves forward.

Non-Edgy Gamer · Sep 23, 2022

It was just a few months ago that people were posting Dalle Mini memes and laughing at the idea of image generation ever being useful.

Now there are a dozen articles a week from Increasingly Nervous Man telling you that AI is a dead end.

Bigfass · Sep 23, 2022

Non-Edgy Gamer said:
Now there are dozens of articles from Increasingly Nervous Man telling you that AI is a dead end.

To be fair most of the criticism is towards journalists and bloggers who play with DALL-E for an hour then proclaim that it'll imminently change the world as we know it. Which in turn fuels stuff like this:

But yeah, Increasingly Nervous Man is saying exactly that, because intelligence requires a cognitive model of the world, or at least the given task.

A household robot that does the laundry but never puts the dog in the dryer is a great thing to aspire to but it does require intelligence.