A Guide to Creating Hyper-Realistic AI Images with Stable Diffusion

Introduction to Hyper-Realistic AI Images

Specialised models have transformed Stable Diffusion into a powerful tool for generating photorealistic art. If you’re fascinated by the merging of reality and AI-generated art, you’re likely already familiar with Stable Diffusion. This open-source AI platform has revolutionised creative expression, letting artists push their creative boundaries on their own computers for free. With simple prompts, you can generate picturesque landscapes, fantasy illustrations, 3D creatures, or cartoons. However, the real magic lies in its ability to produce hyper-realistic images.

Prompting for Realism

Creating hyper-realistic images requires precise prompting and text embeddings. We conducted our tests using a consistent positive prompt to generate high-quality images and a negative prompt to instruct the AI on what to avoid. Our positive prompt was: “professional photo, closeup portrait photo of caucasian man, wearing a black sweater, serious face, dramatic lighting, nature, gloomy, cloudy weather, bokeh”. The negative prompt included various instructions to avoid unrealistic and low-quality elements.
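
As a rough illustration of how a positive and negative prompt pair is passed to the model, the sketch below builds a request body for the txt2img endpoint of a locally running AUTOMATIC1111 web UI (assumed to be started with the --api flag). The positive prompt is the one quoted above. The negative prompt is only a placeholder, since the exact wording used in the tests is not reproduced here, and the address, seed, step count, CFG scale, and image size are all assumptions.

```python
import json
import urllib.request

POSITIVE_PROMPT = (
    "professional photo, closeup portrait photo of caucasian man, "
    "wearing a black sweater, serious face, dramatic lighting, nature, "
    "gloomy, cloudy weather, bokeh"
)

# Placeholder only: an illustrative negative prompt, not the one used in the tests.
NEGATIVE_PROMPT = "cartoon, illustration, 3d render, low quality, blurry, deformed"


def build_payload(prompt, negative_prompt, seed=1234):
    """Return a txt2img request body; a fixed seed keeps model comparisons repeatable."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "seed": seed,      # assumed value, any fixed integer works
        "steps": 30,       # assumed starting point, tune per model
        "cfg_scale": 7,    # assumed starting point, tune per model
        "width": 512,      # assumed portrait size for an SD 1.5 model
        "height": 768,
    }


def txt2img(payload, base_url="http://127.0.0.1:7860"):
    """POST the payload to a local web UI (assumed address); returns base64 image strings."""
    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["images"]


payload = build_payload(POSITIVE_PROMPT, NEGATIVE_PROMPT)
print(json.dumps(payload, indent=2))
```

Keeping the prompt pair and seed fixed while swapping only the checkpoint is what makes a side-by-side comparison of models meaningful. Calling txt2img(payload) only works while the web UI is running.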

Stable Diffusion 1.5: The Ageless Veteran

Stable Diffusion 1.5 has stood the test of time, much like a classic muscle car that outperforms its modern counterparts. Despite the newer SDXL versions, many users still prefer SD 1.5 for generating images that are nearly indistinguishable from real-life photos. Here are some specialised models that excel in this realm:

Juggernaut Rborn: Known for its realistic colour composition and its ability to separate subjects from backgrounds, Juggernaut Rborn excels at generating detailed skin textures, hair, and bokeh effects. For optimal results, use the DPM++ 2M Karras sampler at around 35 steps with a CFG scale of about 7.
Realistic Vision v5.1: A leader in photorealistic image generation, Realistic Vision v5.1 captures facial expressions and imperfections with remarkable accuracy. Preferred for its performance and versatility, it focuses on the subject rather than the background. Despite the availability of v6.0, many still favour v5.1 for its nuanced details in skin, hair, and nails.
I Can’t Believe It’s Not Photography: This versatile model excels in various lighting conditions and at different resolutions, from 640×960 to 768×1152. For best results, use the DPM++ 3M SDE Karras or DPM++ 2M Karras sampler with 20-30 steps and a CFG scale of 2.5-5.
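
The sampler names above are the ones shown in web UIs such as A1111. For readers scripting with the Hugging Face diffusers library instead, the following is a minimal sketch of how those settings might carry over. It assumes that DPM++ 2M Karras corresponds to DPMSolverMultistepScheduler with use_karras_sigmas=True. The checkpoint path is hypothetical, the Juggernaut Rborn image size is an assumption, and the ICBINP values are points picked from inside the ranges above, not official recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerSettings:
    steps: int
    cfg_scale: float
    width: int
    height: int


# Steps and CFG follow the recommendations above; where a range was given,
# a value inside it was picked as an assumption.
SETTINGS = {
    # Width and height are assumed; the text gives no size for this model.
    "Juggernaut Rborn": SamplerSettings(steps=35, cfg_scale=7.0, width=512, height=768),
    # "ICBINP" is shorthand for I Can't Believe It's Not Photography.
    "ICBINP": SamplerSettings(steps=25, cfg_scale=3.5, width=640, height=960),
}


def generate(checkpoint_path, model_name, prompt, negative_prompt=""):
    """Render one image with DPM++ 2M Karras. Needs torch, diffusers, and a CUDA GPU."""
    # Imported lazily so the settings table is usable without the heavy dependencies.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

    settings = SETTINGS[model_name]
    pipe = StableDiffusionPipeline.from_single_file(
        checkpoint_path, torch_dtype=torch.float16
    ).to("cuda")
    # Swap the default scheduler for a multistep DPM-Solver++ with Karras sigmas.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config, use_karras_sigmas=True
    )
    result = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=settings.steps,
        guidance_scale=settings.cfg_scale,
        width=settings.width,
        height=settings.height,
    )
    return result.images[0]


# Example call (hypothetical file path):
# image = generate("models/juggernaut_reborn.safetensors", "Juggernaut Rborn", "professional photo")
# image.save("portrait.png")
```

The DPM++ 3M SDE Karras sampler mentioned for ICBINP is left out of the sketch because no direct diffusers equivalent is assumed here.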

Stable Diffusion XL: Expanding Horizons

While Stable Diffusion 1.5 is a top choice for photorealism, Stable Diffusion XL offers greater versatility without needing extra tools like upscaling. It does require more computing power than SD 1.5, however. These are the models leading the charge in this category:

Juggernaut XL (Version x): Building on its predecessor’s success, Juggernaut XL offers cinematic aesthetics and impressive subject focus. Use a resolution of 832×1216 for portraits, the DPM++ 2M Karras sampler with 30-40 steps, and a low CFG scale of 3-7 for best results.
RealVisXL: Designed with realism in mind, RealVisXL is adept at capturing skin lines, moles, and tone variations, making it ideal for generating realistic human images. For optimal results, use 15-30+ sampling steps and the DPM++ 2M Karras sampling method.
HelloWorld XL v6.0: This general-purpose model employs GPT-4V tagging and excels in producing images with an analogue aesthetic. While it may require some adjustment in prompting, it delivers excellent body proportions and lighting effects.
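
The 832×1216 portrait resolution suggested for Juggernaut XL is easier to reason about as a pixel budget. The sketch below is illustrative and assumes SDXL's commonly cited native resolution of 1024×1024. It compares the portrait size against that budget and checks that both sides are divisible by 8, which latent diffusion pipelines generally expect. The helper name describe is made up for this example.

```python
SDXL_NATIVE_PIXELS = 1024 * 1024  # assumed native resolution for SDXL models


def describe(width, height):
    """Summarise a resolution: pixel count, share of the SDXL budget, aspect ratio."""
    pixels = width * height
    return {
        "pixels": pixels,
        "budget_share": round(pixels / SDXL_NATIVE_PIXELS, 2),
        "aspect_ratio": round(width / height, 3),
        "divisible_by_8": width % 8 == 0 and height % 8 == 0,
    }


for size in [(1024, 1024), (832, 1216)]:
    print(size, describe(*size))
```

At 832×1216 the image has 1,011,712 pixels, roughly 96% of the 1024×1024 count, so the portrait shape stays close to the assumed native pixel budget.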

Expert Tips for Hyper-Realistic Images

To achieve the most convincing results, consider the following expert tips:

Experiment with embeddings: Use recommended embeddings or popular ones like BadDream, UnrealisticDream, FastNegativeV2, and JuggernautNegative-neg to enhance image aesthetics.
Utilise LoRAs: These lightweight add-on models can help add details, adjust lighting, and enhance skin textures. Don’t hesitate to experiment with various LoRAs.
Use face detailing tools: Extensions like ADetailer for A1111 or the Face Detailer Pipe node in ComfyUI can improve face and hand details.
Get creative with ControlNets: ControlNets are ideal for achieving flawless hands and other detailed features, and they are worth experimenting with on a variety of subjects.
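
In the A1111 web UI, textual-inversion embeddings are activated by writing their file name in the prompt, and LoRAs by adding a <lora:name:weight> tag. As a rough sketch of the first two tips, the helper functions below assemble such prompt strings. The embedding names are taken from the list above. The LoRA name skin_detail and its weight are hypothetical placeholders, and the files are assumed to be installed in the web UI's embeddings and LoRA folders already.

```python
NEGATIVE_EMBEDDINGS = ["BadDream", "UnrealisticDream", "FastNegativeV2"]


def with_loras(prompt, loras):
    """Append A1111-style <lora:name:weight> tags to a positive prompt."""
    tags = [f"<lora:{name}:{weight}>" for name, weight in loras.items()]
    return ", ".join([prompt, *tags])


def negative_prompt(embeddings, extra_terms=()):
    """Join embedding trigger names and plain-text terms into one negative prompt."""
    return ", ".join([*embeddings, *extra_terms])


# "skin_detail" is a hypothetical LoRA file name used only for illustration.
positive = with_loras("professional photo, closeup portrait photo", {"skin_detail": 0.6})
negative = negative_prompt(NEGATIVE_EMBEDDINGS, extra_terms=["cartoon"])

print(positive)  # professional photo, closeup portrait photo, <lora:skin_detail:0.6>
print(negative)  # BadDream, UnrealisticDream, FastNegativeV2, cartoon
```

Lower LoRA weights give a subtler effect, so it is reasonable to start below 1.0 and adjust from there.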

Conclusion

We hope this guide has provided valuable insights into creating hyper-realistic images using Stable Diffusion. Equipped with these specialised models and expert tips, you’re ready to explore the potential of AI-generated photorealism. Happy creating!