How to come up with good prompts for AI image generation

September 10, 2022



In this post, I’ll teach you how to write good prompts for generating AI artwork. I will base my method on Stable Diffusion, an open-source text-to-image AI model.

(If you are new to AI art, check out this Quick Start Guide.)

What is Stable Diffusion?

Stable Diffusion is a text-to-image AI model. It is trained on 2.3 billion pairs of images and text descriptions. Because it has seen so much, the model encodes the relationship between image pixel values and their text descriptions.

As a result, if you put in a description like “A Photo of a cat sitting on top of a building”, it gives you images like these:

A cat on top of building generated with stable diffusion

You may be wondering what’s so special about these images. Couldn’t we get millions of them from a Google search? What’s intriguing about this technology is that you can prompt the model to generate high-quality images that did not exist before. For example, you can ask for a portrait painting of Emma Watson by the 19th-century American painter John Singer Sargent:

Emma Watson's portrait. John Singer Sargent painting

It is incredible that such images can be produced from keyword-pixel correlations! What’s mind-boggling is that it gets the artistic style, the faces (where our brains are very unforgiving of tiny mistakes) and the shadows right, and blends them all together in an aesthetically pleasing manner. I believe that the wonder of large numbers is beyond the comprehension of human minds.

What’s so special about Stable Diffusion?

This year we have seen a few image generation AIs such as DALL·E 2 and Midjourney. They too are capable of generating stunning images from text prompts. Two things set Stable Diffusion apart:

  1. Open-source.
  2. Low hardware requirements.

The implication of these two together is big. You can download the model and run it on your local computer. PC and Mac versions are now available. There’s an explosion of free-to-use image generation AI powered by Stable Diffusion. The low running cost allows entrepreneurs to explore freemium or ad-supported business models. The end result will be more accessible AI technology.

Where can I try my prompts?

The easiest way to use Stable Diffusion is via DreamStudio.AI. It is made by the creators of Stable Diffusion. You will get some free credits after signing up.

Anatomy of a good prompt

There are proven techniques for generating high-quality, specific images. Your prompt should cover most, if not all, of these areas:

  1. Subject (required)
  2. Medium
  3. Style
  4. Artist
  5. Website
  6. Resolution
  7. Additional details
  8. Color

First, describe the subject in as much detail as possible. For example:

Subject

A young woman with light blue dress sitting next to a wooden window reading a book.

We got the following image, which matches the prompt pretty well.

Stable diffusion, woman in blue dress reading a

We can be more specific. Let’s add a medium. Some examples are: digital painting, photograph, oil painting. Let’s use

Medium

Digital painting

The new prompt is

Digital painting of a young woman with light blue dress sitting next to a wooden window reading a book

The resulting image is

Stable diffusion, Digital painting of a young woman with light blue dress sitting next to a wooden window reading a book

You can see the image change from a photograph to digital art.

You get the idea. Let’s define the rest of them:

Artist

by Stanley Artgerm Lau

Website

artstation

Resolution

8k

Additional details

extremely detailed, ornate, cinematic lighting

Color

vivid

Putting them all together, the prompt is:

Digital painting of a young woman with light blue dress sitting next to a wooden window reading a book, by Stanley Artgerm Lau, artstation, 8k, extremely detailed, ornate, cinematic lighting, vivid.

which generates this image:

Digital painting of a young woman with light blue dress sitting next to a wooden window reading a book, by Stanley Artgerm Lau, artstation, 8k, extremely detailed, ornate, cinematic lighting, vivid

By adding keywords to the prompt, we can engineer the image to get the style we want.
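To make the structure concrete, here is a minimal Python sketch that joins the components in the same order as the prompt above. The function name `build_prompt` and its parameters are my own invention for illustration, not part of Stable Diffusion or DreamStudio, and the position of the style keyword (right after the subject) is an assumption, since the example above does not use one.

```python
def build_prompt(subject, medium=None, style=None, artist=None,
                 website=None, resolution=None, details=(), color=None):
    """Join prompt components into one comma-separated prompt string.

    Hypothetical helper: the order follows the worked example in this post
    (medium of subject, style, artist, website, resolution, details, color).
    """
    if not subject:
        raise ValueError("A subject is required.")
    # The medium goes in front of the subject: "Digital painting of ..."
    head = f"{medium} of {subject}" if medium else subject
    parts = [head]
    if style:
        parts.append(style)  # assumed position, not shown in the example
    if artist:
        parts.append(f"by {artist}")
    if website:
        parts.append(website)
    if resolution:
        parts.append(resolution)
    parts.extend(details)
    if color:
        parts.append(color)
    return ", ".join(parts)


prompt = build_prompt(
    subject="a young woman with light blue dress sitting next to a wooden window reading a book",
    medium="Digital painting",
    artist="Stanley Artgerm Lau",
    website="artstation",
    resolution="8k",
    details=["extremely detailed", "ornate", "cinematic lighting"],
    color="vivid",
)
print(prompt)
```

Only the subject is required, so you can start with a bare description and add one keyword category at a time to see what each one changes.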

Tips for good prompts

  • Be specific about the subject.
  • Use multiple exclamation marks !! to stress a word and brackets () to reduce its strength.
  • Use a medium that is consistent with the artist.
  • An artist name is a very strong style modifier. Use it wisely.
  • Experiment with blending styles.
  • Use websites like Lexica to study other people’s prompts. If you like a particular image, use its prompt as a starting point.
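If you want to experiment systematically, a small script can write out the prompt variations for you. The sketch below is only one way to do it: the helper names `prompt_variations` and `blended_prompts` are hypothetical, and joining two artist names with "and" is my assumption about how to ask for a blend, not a documented feature of the model.

```python
from itertools import combinations, product


def prompt_variations(subject, mediums, artists):
    """Return one prompt for every (medium, artist) combination."""
    return [
        f"{medium} of {subject}, by {artist}"
        for medium, artist in product(mediums, artists)
    ]


def blended_prompts(subject, medium, artists):
    """Return one prompt for every pair of artists, to try blending styles."""
    return [
        f"{medium} of {subject}, by {first} and {second}"
        for first, second in combinations(artists, 2)
    ]


subject = "a young woman reading a book"
variations = prompt_variations(
    subject,
    mediums=["Digital painting", "Oil painting"],
    artists=["Stanley Artgerm Lau", "John Singer Sargent"],
)
blends = blended_prompts(
    subject,
    medium="Digital painting",
    artists=["Stanley Artgerm Lau", "John Singer Sargent", "Alphonse Mucha"],
)
for line in variations + blends:
    print(line)
```

Generate an image for each line and compare them side by side. Changing one component at a time makes it easier to see what each keyword contributes.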

Some good keywords for you

Below are some of my favorite keywords and their effects. (Tested with Stable Diffusion v1.4)

Enjoy!

Medium

Medium defines a category of the artwork.

Keyword | Note
Portrait | Focuses the image on the face (headshot)
Digital painting | Digital art style
Concept art | Illustration style, 2D
Ultra realistic illustration | Very realistic drawings. Good to use with people
Underwater portrait | Use with people. Underwater. Hair floating
Underwater steampunk | Underwater with wash color

Style

These keywords further refine the art style.

Keyword | Note
hyperrealistic | Increases details and resolution
pop-art | Pop-art style
Modernist | Vibrant color, high contrast
art nouveau | Adds ornaments and details; building style

Artist

Mentioning an artist in the prompt has a strong effect. Study their work and choose wisely.

Keyword | Note
John Collier | 19th-century portrait painter. Adds elegance
Stanley Artgerm Lau | Strong, realistic modern drawing
Frida Kahlo | Quite strong effect following Kahlo’s portrait style. Sometimes results in a picture frame
John Singer Sargent | Good to use with portraits of women. Generates delicate 19th-century clothing, some impressionism
Alphonse Mucha | 2D portrait painting in the style of Alphonse Mucha

Website

Mentioning an art or photo site has a strong effect, probably because each site has its niche genre.

Keyword | Note
pixiv | Japanese anime style
pixabay | Commercial stock photo style
artstation | Modern illustration, fantasy

Resolution

Keyword | Note
unreal engine | Very realistic and detailed 3D
sharp focus | Increases resolution
8k | Increases resolution, though it can make the image look more fake. Makes the image more camera-like and realistic
vray | 3D rendering, best for objects, landscapes and buildings

Additional details

Add specific details to your image.

Keyword | Note
dramatic | Increases the emotional expressivity of the face. Substantially increases the variability of the results, which helps when hunting for the best image
silk | Adds silk to clothing
expansive | More open background, smaller subject
low angle shot | Shot from a low angle
god rays | Sunlight breaking through the clouds
psychedelic | Vivid color with distortion

Color

Add a color scheme to the image.

Keyword | Note
iridescent gold | Shiny gold
silver | Silver color
vintage | Vintage effect

Summary

In this post, we have gone through the basic structure of a good prompt. Use it as a guide rather than a set of rules. The Stable Diffusion model is very flexible. Let it surprise you with creative combinations of keywords!
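One way to get surprising combinations is to let a script pick the keywords for you. Here is a rough Python sketch that draws one keyword per category from a few of the keywords listed above. The `KEYWORDS` dictionary and the `random_prompt` function are names I made up for this illustration, and the ordering of the categories in the output is an assumption.

```python
import random

# A small sample of the keywords from this post, grouped by category.
KEYWORDS = {
    "medium": ["Portrait", "Digital painting", "Concept art"],
    "style": ["hyperrealistic", "pop-art", "art nouveau"],
    "artist": ["John Collier", "Alphonse Mucha", "Frida Kahlo"],
    "website": ["pixiv", "pixabay", "artstation"],
    "resolution": ["sharp focus", "8k", "unreal engine"],
    "details": ["dramatic", "silk", "god rays"],
    "color": ["iridescent gold", "silver", "vintage"],
}


def random_prompt(subject, rng):
    """Build a prompt with one randomly chosen keyword per category."""
    picks = {category: rng.choice(options) for category, options in KEYWORDS.items()}
    return (
        f"{picks['medium']} of {subject}, {picks['style']}, by {picks['artist']}, "
        f"{picks['website']}, {picks['resolution']}, {picks['details']}, {picks['color']}"
    )


# A seeded generator makes the picks repeatable between runs.
rng = random.Random(42)
for _ in range(3):
    print(random_prompt("a cat sitting on top of a building", rng))
```

Not every combination will make sense, so treat the output as a starting point and edit the prompts that look promising.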

If you have trouble generating stunning artwork, this Stable Diffusion prompt generator can help. In the next post, I will show you how to make this prompt generator using Notion.



By Andrew Wong, Software Engineer @ Sagio Dev

Follow Andrew on Twitter to see more content like this.