When writing articles I often need images, so I decided to learn and document this open-source AI painting tool, Stable Diffusion.
What is Stable Diffusion#
Stable Diffusion (SD) is an open-source AIGC painting model: it converts a text description (a prompt) into an image. Its main strengths are being open-source, fast, and rapidly updated.
How to Use#
Install GUI#
To facilitate use, you need to install a WebGUI for SD first.
Installation link: https://github.com/AUTOMATIC1111/stable-diffusion-webui
There are two ways to install it: deploy it on Google Colab (an online runtime environment), or run it locally on your own machine.
Local Installation Steps#
Since my computer is a Mac and uses Apple Silicon chips, the following steps are for this type of machine only.
Fresh Installation
If you haven't set up the environment before, you can install the dependencies via Homebrew. If you don't have Homebrew yet, install it by entering this command in the terminal:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Before installing WebUI, you need to prepare the running environment. Open the terminal.
1. Install Python version 3.10 or above
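The original doesn't show the command for this step; if Python is missing, it can be installed via Homebrew (the `python@3.10` formula name follows the Homebrew convention and is an assumption here):

```shell
# Install Python 3.10 with Homebrew and confirm the version
brew install python@3.10
python3.10 --version
```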
2. Pull the WebUI code from the GitHub repository
Pulling the code with git makes it easy to keep SD up to date and use the latest features.
In any directory, run the following command:
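Using the repository link from the installation section above, the clone command would be:

```shell
# Clone the WebUI source and enter the project directory
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
```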
After pulling, the directory looks like this:
3. Download the Stable Diffusion model
I downloaded the relatively new model, version 2.1. The common model formats are `ckpt` and `safetensors`. Download link: https://huggingface.co/stabilityai/stable-diffusion-2-1
Place the downloaded model in the `stable-diffusion-webui/models/Stable-diffusion` directory of the repository you just pulled.
Since version 2.1 also requires a configuration file, download it as follows:
Download link: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Installation-on-Apple-Silicon#downloading-stable-diffusion-models
Hold down the Option key on the keyboard and click the "here" link on that page to download it. The downloaded file is named `v2-inference-v.yaml`.
Then we need to rename this file to match the name of the downloaded model. My model is named `v2-1_768-ema-pruned.ckpt`, so the configuration file must be renamed to `v2-1_768-ema-pruned.yaml`.
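In the terminal, the move-and-rename might look like this, assuming both downloads are in `~/Downloads` and the repository was cloned into the current directory:

```shell
# Move the model into the models directory, and place the config
# next to it under the matching name
mv ~/Downloads/v2-1_768-ema-pruned.ckpt stable-diffusion-webui/models/Stable-diffusion/
mv ~/Downloads/v2-inference-v.yaml stable-diffusion-webui/models/Stable-diffusion/v2-1_768-ema-pruned.yaml
```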
4. Execute the script to run the web UI
During the execution of the script, it will automatically download the required dependency files. This may take a while, so please be patient; it may take anywhere from half an hour to 2 hours.
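The launch itself is just the script in the repository root:

```shell
# First run: downloads dependencies automatically, then starts the server
cd stable-diffusion-webui
./webui.sh
```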
Once the access address http://127.0.0.1:7860/ appears, the launch has succeeded. After that, do not close or stop the terminal; simply open that address in your browser.
In the future, every time you open it, just execute the `webui.sh` script. If you want to update, just run `git pull` in the root directory.
Special Case Handling#
When generating images, you may encounter the following error:
If you encounter the same error, you can resolve it as follows:
- Open the `webui-user.sh` file in the root directory.
- Modify the `COMMANDLINE_ARGS` parameter as follows:
- Re-execute `./webui.sh`.
- Finally, in the settings under Stable Diffusion, check the option "Upcast cross attention layer to float32" to run normally.
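The original doesn't show the exact `COMMANDLINE_ARGS` value; a commonly suggested setting for half-precision/NaN errors on Apple Silicon (an assumption here, so adjust it to your actual error message) is:

```shell
# In webui-user.sh — run the model in full precision (assumed fix)
export COMMANDLINE_ARGS="--no-half"
```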
Create Your First Painting#
Generate Images from Text#
Before drawing, first get to know what each part of this interface is.
Several important parameters in the interface:
- Sampling Steps: affects both generation time and quality; it mainly controls the degree of denoising. Usually set to around 30.
- Seed: determines the content of the image by fixing the random noise used during image iteration.
- CFG Scale: determines how much creative freedom the model has, i.e., how strictly it follows the prompt.
  - 2 ~ 6: nearly random generation, largely ignoring the prompt.
  - 7 ~ 10: the most common setting, a good balance.
  - 10 ~ 15: requires very good, specific prompts; above 10, saturation increases.
Generate Images from Images#
Generate images from a prompt + an image.
Similarly, let's take a look at some settings in this interface. Scroll down to the bottom of the page to find the settings area.
The most common use of image-to-image generation is: to change the style of an image.
Image Extension#
Generate images from a prompt + a mask + an image.
Common scenarios: removing watermarks, changing outfits, extending the edges of images.
How to Write Prompts#
Keywords#
Separate different characteristics with commas#
Separate similar characteristics with |#
Adjusting Weights#
If you want to adjust the proportion or weight of a certain characteristic in the image, you can write it like this:
(keyword:weight)
- Weight < 1: weakens the characteristic.
- Weight > 1: strengthens it.
Gradient Effect#
If you want the image to transition gradually from one keyword to another, you can write it like this:
[keyword1:keyword2:number]
Generation starts with keyword1 and switches to keyword2 at the given step.
Alternating Fusion#
If you want the image to fuse two styles, alternating between them on every sampling step, you can write it like this:
[keyword1|keyword2]
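Putting the syntax rules above together, a full prompt might look like this (all keywords are illustrative, not from the original):

```text
best quality, masterpiece, (red hair:1.2), [forest:city:10], [oil painting|watercolor]
```

This combines comma-separated features, an emphasized "red hair", a scene that switches from forest to city at step 10, and two alternating painting styles.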
Enhancing Effects#
Add high-quality keywords, such as: best quality, masterpiece.
Adding Reverse Words#
Common reverse words (negative prompts):
- `nsfw`: pornographic or violent content.
- `bad face`: poorly drawn face.
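A negative prompt assembled from such words might look like this (the terms beyond the two above are common community additions, not from the original):

```text
nsfw, bad face, lowres, bad hands, extra fingers
```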
Enhancing Form#
Control the emphasis on the overall form of the image, such as whether it is a full-body shot.
- Lighting
- cinematic lighting (movie lighting)
- dynamic lighting (dynamic lighting)
- Gaze
- looking at viewer
- looking at another
- looking away
- looking back
- looking up
- Style
- sketch
- Perspective
- dynamic angle
- from above
- from below
- wide shot
Combining with ChatGPT#
Advanced Play#
Different Models#
- Common model search and download websites:
Commonly used models in the market:
- Anime style model
- Traditional Chinese style model
- GuoFeng3: A model with a gorgeous ancient Chinese style, with a 2.5D texture.
- Midjourney style
- Dreamlike Diffusion 1.0: Colors are particularly vibrant, and the painting style is particularly flashy.
- Realistic style
You can download these models and place them in the `stable-diffusion-webui/models/Stable-diffusion` directory. Then click the refresh button next to the model selection on the page to use them.