ControlNet

What is ControlNet

If you have used Stable Diffusion for a while, you know how difficult it is to generate images with a precise composition or a specific pose. ControlNet is a neural network model that works alongside Stable Diffusion and lets users control the placement and appearance of what is generated.

In this guide, we will learn how to install and use ControlNet models in Automatic1111.

Install ControlNet in Automatic1111

Below are the steps to install ControlNet in Automatic1111 stable-diffusion-webui.

  • Navigate to the Extensions tab in Automatic1111

  • Click the Install from URL tab, then copy and paste the URL below into the "URL for extension's git repository" field.

https://github.com/Mikubill/sd-webui-controlnet.git

Press the Install button.

Controlnet image1

tip

  • After you click Install, you may not see a progress bar. Wait a few seconds or minutes until you see this message: Installed into /home/stable-diffusion-webui/extensions/sd-webui-controlnet. Use Installed tab to restart.

    Controlnet image2

Go to the "Installed" tab, click "Check for updates", and then click "Apply and restart UI".

Controlnet image3
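
As an alternative to the UI steps, the extension can also be cloned from a terminal, since installing from a URL places the repository in the webui's extensions folder. The snippet below is only a sketch: it assumes the webui lives at /home/stable-diffusion-webui (the path shown in the install message in this guide), and the WEBUI_DIR variable and install_controlnet helper are hypothetical names introduced for illustration.

```shell
# Sketch: install or update the ControlNet extension from the terminal.
# WEBUI_DIR is an assumption; point it at your own webui installation.
WEBUI_DIR="${WEBUI_DIR:-/home/stable-diffusion-webui}"
REPO_URL="https://github.com/Mikubill/sd-webui-controlnet.git"

# Hypothetical helper: clone the extension, or pull updates if it is already there.
install_controlnet() {
  local target="$WEBUI_DIR/extensions/sd-webui-controlnet"
  if [ -d "$target/.git" ]; then
    # Already cloned: fetch the latest version instead.
    git -C "$target" pull --ff-only
  else
    git clone "$REPO_URL" "$target"
  fi
}

# Only run when asked explicitly, e.g.: bash install_controlnet.sh run
if [ "${1:-}" = "run" ]; then
  install_controlnet
fi
```

After the clone finishes, restart the webui so that the extension is loaded.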

If the extension is installed successfully, you will see a collapsible section named ControlNet in the txt2img tab, right above the Script drop-down menu.

Controlnet image3

When you expand the section, it looks as shown below.

Controlnet image3

Let's download a few ControlNet model weights to create some awesome images in Automatic1111.

Download ControlNet model weights

The 13 ControlNet models available on Hugging Face are listed below with their links. Copy the link of the model you need and use wget to download the model weights.

Model      Link
Canny      https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_canny.pth
Depth      https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11f1p_sd15_depth.pth
Normal     https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_normalbae.pth
OpenPose   https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_openpose.pth
MLSD       https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_mlsd.pth
Lineart    https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_lineart.pth
SoftEdge   https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_softedge.pth
Scribble   https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_scribble.pth
Seg        https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_seg.pth
Shuffle    https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11e_sd15_shuffle.pth
Tile       https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11f1e_sd15_tile.pth
Inpaint    https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_inpaint.pth
IP2P       https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11e_sd15_ip2p.pth

You can also check here for any updated models.

  • To download the weights, use the following command.
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_canny.pth
  • After downloading the weights, use the following command to move them to the required Automatic1111 folder.
mv *.pth /home/stable-diffusion-webui/extensions/sd-webui-controlnet/models/
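
To fetch several models in one go, the download and move steps can be combined. The following is a minimal sketch, assuming the same models folder used in this guide; BASE_URL, MODELS_DIR, model_url and download_models are hypothetical names introduced here, and the file names are taken from the links listed in the table.

```shell
# Sketch: download several ControlNet v1.1 weights straight into the models folder.
BASE_URL="https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main"
# Folder used in this guide; adjust it if your webui is installed elsewhere.
MODELS_DIR="${MODELS_DIR:-/home/stable-diffusion-webui/extensions/sd-webui-controlnet/models}"

# Hypothetical helper: print the download URL for one weight file name.
model_url() {
  printf '%s/%s.pth\n' "$BASE_URL" "$1"
}

# Hypothetical helper: download every named model into MODELS_DIR.
download_models() {
  mkdir -p "$MODELS_DIR"
  for name in "$@"; do
    # -c resumes a partial download; -P sets the directory files are saved to.
    wget -c -P "$MODELS_DIR" "$(model_url "$name")"
  done
}

# Example (uncomment to run): fetch the Canny, Depth and OpenPose weights.
# download_models control_v11p_sd15_canny control_v11f1p_sd15_depth control_v11p_sd15_openpose
```

Saving directly into the models folder avoids the separate mv step and keeps unrelated .pth files in the current directory from being moved by accident.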

You are all set to use the ControlNet extension in the Automatic1111 webui.

Using ControlNet – a simple example

Now that ControlNet is installed and the required weights are downloaded and placed in the right path, let's go through a simple example of generating an image using Canny edge detection.

You should have the ControlNet extension installed to follow this section. You can verify this by checking for the ControlNet section shown below.

Controlnet image3

Text-to-image settings:

ControlNet needs to be used with a Stable Diffusion model. In the Stable Diffusion checkpoint drop-down menu, select the model you want to use with ControlNet.

Select deliberate_v2.safetensors to use the Deliberate model.

Controlnet image1

In the txt2img tab, enter the prompt and, optionally, a negative prompt to be used with ControlNet. Below are the prompts I will be using.

Prompt:

a cute cat in the garden,a masterpiece

Negative Prompt:

disfigured,ugly

Set the image settings such as height and width, along with any other generation settings.

Controlnet image1

ControlNet settings:

Now, let's move on to the ControlNet settings. Upload the image to the image canvas.

Check ✅ the Enable, Pixel Perfect, and Allow Preview checkboxes.

Select a Control Type. The corresponding model and its related preprocessors are automatically filtered and loaded into their respective drop-downs, along with other default settings.

Click Run Preprocessor 💥 to see a preview of the input image after it has been processed.

Controlnet image1

Now click the Generate button to start generating images using ControlNet.

Finally, the GUI looks as below.

Controlnet image1

note

When you are done, uncheck the Enable checkbox to disable the ControlNet extension.
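
The same Canny example can also be driven without the GUI. The sketch below assumes the webui was started with the --api flag and listens on http://127.0.0.1:7860, and that the ControlNet extension accepts its unit settings under alwayson_scripts in the txt2img request. The field names used here (enabled, module, model, input_image, pixel_perfect) are assumptions that may differ between extension versions, so check the API documentation of your installed version. API_URL, build_payload and generate are hypothetical names introduced for illustration.

```shell
# Sketch: run the txt2img + ControlNet (Canny) example through the webui API.
# Assumes the webui was launched with --api and is reachable at API_URL.
API_URL="${API_URL:-http://127.0.0.1:7860}"

# Hypothetical helper: build the JSON request body.
#   $1 = base64-encoded control image
#   $2 = ControlNet model name (assumed to match GET $API_URL/controlnet/model_list)
build_payload() {
  cat <<EOF
{
  "prompt": "a cute cat in the garden,a masterpiece",
  "negative_prompt": "disfigured,ugly",
  "alwayson_scripts": {
    "controlnet": {
      "args": [
        {
          "enabled": true,
          "module": "canny",
          "model": "$2",
          "input_image": "$1",
          "pixel_perfect": true
        }
      ]
    }
  }
}
EOF
}

# Hypothetical helper: send the request and print the JSON response.
#   $1 = path to the control image, $2 = ControlNet model name
generate() {
  # tr removes the line wrapping that some base64 implementations add.
  image_b64="$(base64 < "$1" | tr -d '\n')"
  build_payload "$image_b64" "$2" |
    curl -s -X POST "$API_URL/sdapi/v1/txt2img" \
      -H "Content-Type: application/json" --data-binary @-
}

# Example (uncomment to run):
# generate cat.png "control_v11p_sd15_canny" > response.json
```

The response is JSON in which the generated images are returned as base64 strings, so they need to be decoded before viewing.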

Canny

Canny is a classic edge detector that uses a multi-stage algorithm to extract outlines from the input image. The preprocessor produces an outline image that preserves the original composition, and ControlNet then generates an output image that follows those outlines.

Controlnet image1

Depth

The Depth preprocessor converts an input image into a grayscale depth map, with black representing deep (far) areas and white representing shallow (near) areas.

Controlnet image1

Multiple preprocessors are available for the depth model:

  • depth_midas
  • depth_leres
  • depth_leres++
  • depth_zoe

Below are images generated with each of the depth preprocessors: a woman cop image based on the given prompt, with 'milkman' as the input image.

Controlnet image1

OpenPose

OpenPose detects human key points such as the head, shoulders, and legs. It is useful for copying human poses. Put simply, it is a skeleton view of the image.

The OpenPose preprocessors are:

  • OpenPose
  • OpenPose_face
  • OpenPose_faceonly
  • OpenPose_hand
  • OpenPose_full

Controlnet image1

OpenPose

OpenPose serves as the base preprocessor that detects various human body parts such as the head, hands, legs, nose, ears, knees, ankles, shoulders, etc., from the provided input image.

It then generates an output image with the same pose.

Controlnet image1

OpenPose_face

OpenPose_face does everything the base preprocessor does and additionally detects facial key points, which capture facial expressions.

Controlnet image1

OpenPose_faceonly

OpenPose_faceonly detects only facial key points and excludes the other body key points. It is particularly useful for capturing and replicating facial expressions.

Controlnet image1

OpenPose_hand

In addition to everything the base preprocessor detects, OpenPose_hand detects the key points of hands and fingers.

Controlnet image1

OpenPose_full

OpenPose_full detects everything OpenPose_face and OpenPose_hand do.

Controlnet image1

MLSD

MLSD is a straight-line detector. This preprocessor is particularly useful for architectural elements such as room interiors, streets, and frames. Curves are ignored.

Below is an image generated with the MLSD preprocessor: a classic room produced from an input image of a study room.

Controlnet image1

Scribble

Scribble preprocessors turn the image into a scribble, similar to one drawn by hand.

The available scribble preprocessors are:

  • Scribble HED: This is the base preprocessor and is suitable for recoloring and restyling an image.
  • Scribble Pidinet: Compared with the base preprocessor, this one also detects curves and straight edges and produces clearer lines with details.
  • Scribble XDoG: This is an edge detection technique. You need to tune the XDoG threshold and observe the output.

Below are images generated using the different scribble preprocessors.

Controlnet image1

Seg

Segmentation labels the types of objects in the input image. These labels are used to replicate the shapes of objects in the generated images.

The available segmentation preprocessors are:

  • seg_ofade20k
  • seg_ofcoco
  • seg_ufade20k

Below are the images that have been generated using the Seg Preprocessor.

Controlnet image1

Normal

A normal map provides information about the orientation of surfaces in the image. Instead of color values, each pixel represents the direction in which the surface is facing. This is used to replicate the 3D composition of the given image.

The Normal map preprocessors are:

  • Normal Midas: The Midas preprocessor is good for isolating the subject from the background.
  • Normal Bae: The Bae preprocessor is used to render details of both subject and background.

Below are the images that have been generated using the Normal map Preprocessor.

Controlnet image1

Lineart

Lineart analyzes the image and generates a black-and-white sketch that looks like a scanned drawing.

The Lineart preprocessors are:

  • lineart_anime: Analyzes anime images and allows coloring on top of them.
  • lineart_anime_denoise: Analyzes anime images with more image details.
  • lineart_realistic: Analyzes images with realistic-style lines.
  • lineart_coarse: Analyzes images with realistic-style lines and more image details.

Below are the images that have been generated using the Lineart preprocessors.

Controlnet image1

Controlnet image1
