This page offers tips on using AI image generation and manipulation tools appropriately when making content for HICU-RPG purposes. It is not a binding policy, but rather advice on making more effective use of these tools to supplement or enhance human creativity and avoid common problems with AI content.
Here is a list of methods and techniques that can help capture your creative intent when using an AI tool:
- Express your intention in advance of beginning image generation, using methods such as a written description (not in "prompt language"), a preparatory sketch, or a photobash. This helps to preserve the creative expression of the human, and in some cases these artifacts can also serve as inputs for the AI.
- Use your directly-created sketch or other preliminary human-made image as an input to the AI. For a sketch, the sketch ControlNet is the best option for Stable Diffusion. If you do not have access to ControlNet, you can run image-to-image on the sketch, but you should probably color it in first: if you set the denoising strength high enough to directly colorize a sketch with image-to-image, few details will be preserved (see the img2img sketch after this list).
- ControlNet scribble can turn a low-quality sketch into an image, though only the approximate composition is conveyed; the sketch / line art ControlNets are more precise and flexible. You can also use image-to-image to refine a crude sketch into a better one iteratively, since SD will capture intended symmetry, perspective, etc., and then use the refined sketch with the more precise sketch ControlNet (a scripted scribble example appears after this list).
- The OpenPose ControlNet, combined with a 3D pose editor, allows you to pose a humanoid figure and then render it according to a prompt. This is very useful because poses are hard to prompt. You can also use the openpose preprocessor to extract a pose from, e.g., a photo of a model or of yourself, which you may prefer (see the pose-extraction sketch after this list). You can likewise build a simple model of your scene in a 3D modeling program and render a depth map to use with the depth ControlNet.
- Regional prompting allows you to write more sophisticated prompts and ensures that details specified in the prompt get associated with the correct subject. This is almost essential when making an image with two or more characters in it.
- Edit the image with a conventional image editor, either as a final retouching stage or between image-to-image passes. Consider, for example, starting with a sketch ControlNet, then editing the image between several image-to-image passes with decreasing denoising strength and increasing resolution as you refine the work iteratively. You do not have to be particularly meticulous with these edits, because even low-noise image-to-image passes will render them as well as the rest of the image; colors and shapes are more important than precise linework.
- Use collage techniques to bring imagery from multiple generations into one composite work. If you prompt for a "plain background" or "solid color background", most SD checkpoints will put the subject on a simple background that can be easily removed with a magic wand or color select tool (a crude programmatic version is sketched after this list). Don't forget to check that the lighting is at least vaguely consistent, though.
- Use inpainting to add details to selected areas of the image, especially important background elements. You can run the inpaint generation at full resolution by choosing "masked area only" instead of "whole image" inpainting. A subsequent low-noise image-to-image pass or manual edit may or may not be needed to blend the patch with the rest of the work (see the inpainting sketch after this list).
- Use AI to make textures, skyboxes or "matte paintings" in a 3D modeling / rendering workflow.
- Choose your prompt words carefully. If you go over the token limit (75 for Stable Diffusion), your prompt will be significantly "diluted". Increasing the CFG scale can help to compensate, but too much will make the image conform to the prompt locally while losing whole-image cohesion. Minimize "filler" prompt words like "best quality" or "masterpiece", though a single one may be helpful, especially in models such as Pony that were trained extensively with quality labels (a token-counting sketch follows this list).
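
To make the img2img-over-a-colored-sketch workflow above concrete, here is a minimal sketch using the Hugging Face diffusers library. The checkpoint name, file names, prompt, and parameter values are illustrative assumptions; Automatic1111's img2img tab exposes the same denoising strength control through its UI.

```python
# Hypothetical example: refining a hand-colored sketch with image-to-image.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # illustrative SD 1.5 checkpoint
    torch_dtype=torch.float16,
).to("cuda")

init = Image.open("colored_sketch.png").convert("RGB").resize((512, 512))

# Lower strength preserves more of the sketch; raising it high enough to
# colorize raw line art would discard most details, hence coloring first.
result = pipe(
    prompt="a knight in ornate plate armor, detailed illustration",
    image=init,
    strength=0.5,
    guidance_scale=7.0,
).images[0]
result.save("refined.png")
```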
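The scribble workflow can be scripted similarly; this sketch assumes the publicly available lllyasviel/sd-controlnet-scribble model and a scribble drawn as white lines on a black background (file name and prompt are placeholders).

```python
# Hypothetical example: generating an image from a rough scribble.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# The scribble conveys only the approximate composition.
scribble = Image.open("scribble.png").convert("RGB")  # white-on-black
image = pipe(
    prompt="a stone watchtower on a cliff at sunset",
    image=scribble,
    num_inference_steps=30,
).images[0]
image.save("from_scribble.png")
```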
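Extracting a pose from a photo, as in the OpenPose item above, might look like this with the controlnet_aux helper package (the photo filename is a placeholder). The resulting skeleton image is then fed to an openpose ControlNet exactly like the scribble above.

```python
# Hypothetical example: pulling an OpenPose skeleton out of a reference photo.
from controlnet_aux import OpenposeDetector
from PIL import Image

detector = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
photo = Image.open("reference_photo.jpg")
pose_map = detector(photo)  # a PIL image of the detected skeleton
pose_map.save("pose.png")   # use as input to an openpose ControlNet
```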
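For the collage item, the magic-wand step can be approximated in a few lines of NumPy and Pillow. This assumes the background color can be sampled from a corner pixel, and the tolerance is a guess you would tune per image.

```python
# Hypothetical example: cutting a subject off a near-solid background.
import numpy as np
from PIL import Image

img = np.asarray(Image.open("subject_on_plain_bg.png").convert("RGB"))

# Sample the background color from the top-left corner, then make every
# pixel within a tolerance of that color transparent (a crude color select).
bg = img[0, 0].astype(int)
dist = np.abs(img.astype(int) - bg).sum(axis=-1)
alpha = np.where(dist < 40, 0, 255).astype(np.uint8)  # 40 = tolerance guess

rgba = np.dstack([img, alpha])
Image.fromarray(rgba, "RGBA").save("subject_cutout.png")
```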
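An inpainting pass might look as follows in diffusers. Note that "masked area only" versus "whole image" is an Automatic1111 option (Inpaint area → Only masked); the plain pipeline below regenerates at the canvas resolution, so treat it as a sketch of the general mechanism rather than of that exact feature.

```python
# Hypothetical example: regenerating a masked region of a finished image.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("scene.png").convert("RGB").resize((512, 512))
mask = Image.open("mask.png").convert("RGB").resize((512, 512))  # white = redo

result = pipe(
    prompt="an ornate wooden bookshelf full of old tomes",
    image=image,
    mask_image=mask,
).images[0]
result.save("inpainted.png")
```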
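To check the 75-token budget mentioned above, you can run your prompt through the same CLIP tokenizer that SD 1.x uses (the prompt is a placeholder):

```python
# Count how many CLIP tokens a prompt consumes.
from transformers import CLIPTokenizer

tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
prompt = "masterpiece, a red-haired elven ranger drawing a longbow, forest"
n = len(tok(prompt).input_ids) - 2  # subtract the begin/end tokens
print(f"{n} of 75 tokens used")
```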
Additionally, here are some suggestions for improving the quality of your work when using AI tools:
- Carefully go over every part of the image; don't just look at the main subject. What is going on in the background? Are there "goblins", "friends" or unidentifiable objects? ("Goblins" are deformed humanoid figures that commonly appear unbidden in the background of generations, and sometimes get prompt elements unloaded onto them instead of the main subject. "Friends" are usually smaller, less prominent versions of the main subject. Occasionally you will see a full-fledged "twin", where the character is drawn more than once, perhaps in the style of a model sheet or comic.)
- Fix common "AI vices" such as misshapen pupils or hands. These can be inpainted ("rolling the dice again" on that area of the image) or fixed with an image editor directly.
- Check for desaturation, color burning, and other artifacts. Desaturation can be fixed with "Levels" in an image editor (a rough programmatic equivalent appears after this list). Burning can be fixed by using the correct variational autoencoder (VAE) if self-hosting, or by brushing over it in an image editor. You can also mirror the image left-to-right and run an image-to-image pass, or switch to a different checkpoint.
- Make sure inanimate objects and clothing "make sense" and could work properly in-universe. How do you put on / take off the clothing? How does it not fall off or slip?
- Review for consistency with previous descriptions and depictions. Try to be accurate rather than just making a generic image that "looks good" but has only a rough similarity to the character or thing portrayed.
- If you want to render the same subject multiple times, consider training an embedding or a LoRA. An embedding is good for characters that the base model can mostly draw well already, but that you want to render consistently with a single prompt word. A LoRA can capture novel features (scars, unusual clothing, uncommon body shapes). Both require sample images for training. You can train on carefully curated AI generations, though it is especially important to make sure they don't contain "AI vices", as these will be amplified (see the loading sketch after this list).
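
As a rough programmatic stand-in for the Levels fix and the mirroring trick above, here is a Pillow sketch; the cutoff value is a guess to tune by eye.

```python
# Hypothetical example: fixing desaturation, then mirroring for a re-run.
from PIL import Image, ImageOps

img = Image.open("generation.png")

# Stretch each channel's value range, a rough stand-in for a Levels fix.
fixed = ImageOps.autocontrast(img, cutoff=1)

# Mirror left-to-right before a follow-up image-to-image pass, which often
# renders burned or desaturated areas afresh.
ImageOps.mirror(fixed).save("mirrored_for_img2img.png")
```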
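Loading a trained embedding or LoRA into a diffusers pipeline might look like this. The file names, trigger token, and LoRA scale are illustrative assumptions; in Automatic1111 the equivalents are dropping the files into the embeddings / models folders and using the <lora:name:weight> prompt syntax.

```python
# Hypothetical example: using a trained embedding and LoRA at inference time.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Textual-inversion embedding: <mychar> becomes a single usable prompt word.
pipe.load_textual_inversion("my_character_embedding.pt", token="<mychar>")

# LoRA weights, e.g. trained with kohya_ss.
pipe.load_lora_weights("my_character_lora.safetensors")

image = pipe(
    "<mychar> standing in a forest clearing",
    cross_attention_kwargs={"scale": 0.8},  # LoRA influence
).images[0]
image.save("character.png")
```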
Useful free programs for art, with or without AI: GIMP (similar to Photoshop), Krita (has natural-media tools and good Stable Diffusion integration), Blender (3D; has Stable Diffusion integration)
AI programs: Automatic1111 and ComfyUI (Stable Diffusion interfaces), Kohya_ss (for training LoRAs). Automatic1111 can also train embeddings and some kinds of hypernetworks.
Share prompts for Avatar characters here.
