How to generate realistic people in Stable Diffusion
The tutorial focuses on creating lifelike portrait images using Stable Diffusion. It covers prompts, lighting, facial details, blending faces, poses, and models such as F222 and Hassan Blend 1.4 for realistic results. It also emphasizes using clothing terms to steer away from explicit content and reading model licenses.
In the tutorial on generating realistic people with Stable Diffusion, the focus is on creating lifelike portrait images. The process involves building high-quality prompts, incorporating lighting and camera keywords, and enhancing facial details for a more realistic appearance. Techniques such as blending faces, controlling poses, and inpainting are discussed as ways to refine the generated images. Specific models, including F222, Hassan Blend 1.4, Realistic Vision v2.0, Chillout Mix, Dreamlike Photoreal, and URPM, are introduced for generating realistic images with varying features and styles. The tutorial emphasizes using clothing terms in prompts to avoid explicit content and highlights the need to read and adhere to each model's license. Readers are encouraged to experiment with different models and techniques to achieve the results they want.
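To make the workflow concrete, here is a minimal text-to-image sketch using the diffusers library. The checkpoint id is an assumption (community uploads of Realistic Vision v2.0 exist on Hugging Face under ids like SG161222/Realistic_Vision_V2.0); any photorealistic model from the list above should slot in. The prompt illustrates the kind of lighting, camera, and clothing keywords the tutorial recommends, with the usual negative prompt alongside.

```python
# Minimal text-to-image sketch with diffusers; the checkpoint id is an assumption.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V2.0",  # assumed repo id; any realistic checkpoint works
    torch_dtype=torch.float16,
).to("cuda")

# Lighting/camera keywords for realism; clothing terms to steer away from explicit output.
prompt = (
    "photo of a young woman, business suit, city street, "
    "soft natural lighting, 85mm lens, depth of field, highly detailed face"
)
negative_prompt = "ugly, disfigured, deformed, cartoon, painting, low quality"

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]
image.save("portrait.png")
```

Swapping the checkpoint id is how you move between the models listed above; the prompt and negative prompt typically stay much the same.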
Related
Unique3D: Image-to-3D Generation from a Single Image
The GitHub repository hosts Unique3D, offering efficient 3D mesh generation from a single image. It includes author details, project specifics, setup guides for Linux and Windows, an interactive demo, ComfyUI, tips, acknowledgements, collaborations, and citations.
Show HN: Feedback on Sketch Colourisation
The GitHub repository contains SketchDeco, a project for colorizing black and white sketches without training. It includes setup instructions, usage guidelines, acknowledgments, and future plans. Users can seek support if needed.
SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code
SceneCraft is an advanced Large Language Model (LLM) Agent converting text to 3D scenes in Blender. It excels in spatial planning, asset arrangement, and scene refinement, surpassing other LLM agents in performance and human feedback.
HybridNeRF: Efficient Neural Rendering
HybridNeRF combines surface and volumetric representations for efficient neural rendering, achieving 15-30% error rate improvement over baselines. It enables real-time framerates of 36 FPS at 2K×2K resolutions, outperforming VR-NeRF in quality and speed on various datasets.
Homegrown Rendering with Rust
Embark Studios develops a creative platform for user-generated content, emphasizing gameplay over graphics. They leverage Rust for 3D rendering, introducing the experimental "kajiya" renderer for learning purposes. The team aims to simplify rendering for user-generated content, utilizing Vulkan API and Rust's versatility for GPU programming. They seek to enhance Rust's ecosystem for GPU programming.
Wouldn't these kinds of negative prompts and tweaking break down if I wanted to plug in more varied descriptions of people?
I find it interesting to plug in colorful descriptions of a person's traits from a novel, for example, or of people actually doing something.
Using "ugly", "disfigured" as negative prompt probably wouldn't work then...
For the pictures in the article, my first association is someone generating romance scam profile pictures, not art.
While people have tried going from a base model to a fine-tuned model based on explicit images, I wonder whether anyone is attempting the other way round (train a base model on explicit photographs and other images not involving humans, then fine-tune away the explicit parts), which might lead to better results.
I like how even with all the "please don't make it porn" terms in the prompt, you can easily see (by the choice of dresses, cleavage, poses, facial expressions, etc.) which models "want" to generate porn and are barely held back by the prompt.
When one uses prompts the model hasn't seen in its training data, the results start to look less realistic.
I have even seen adult video logos in generated images.
I strongly suspect AI is not what we think it is.
I've been experimenting these last few months with generating interesting images, trying to make them "artistic" rather than photo-realistic or the usual bland anime tributes.
Eking out something "interesting" is difficult, especially with limited time and low-end hardware. Interesting is highly subjective, of course. I tend towards the more artistic / surrealist style, usually NSFW. Only nudes, no pornography.
I usually pick a "classical" artist which already has nudes in their repertoire, and try to blend their style with some photos I take myself, and with the style of other artists.
Most fall flat, and some come close to what I consider acceptable but still have major flaws. However, given my time and hardware constraints, they're good enough to post. I use fooocus, which is kind of limiting, but after trying and failing to produce satisfactory results with Automatic, fooocus is just what I needed.
I can't really understand why more people don't do the same. Stable Diffusion was trained on a long and diverse list of artists, but most people seem to disregard that and focus only on anime or realistic photographs. The internet is inundated with those. I'm following some people on Mastodon who post more interesting stuff, but even that tends to be same-ish. I try to produce more diverse work, but most of the time it feels like going against the grain.
The women still tend to look like unrealistic supermodels. Sometimes that's what I want; sometimes not, and it takes many tweaks to make them look like normal women, and usually I can't spare the time. Which is unfortunate.
If anyone's interested, I post the somewhat better experiments at:
https://mastodon.social/@TheNudeSurrealist
Warning: Most are NSFW. But are NSFW in the way Titian's Venus, say, is NSFW.
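For anyone curious what this style blending looks like in practice, here is a minimal img2img sketch with diffusers. The model id, photo path, and artist reference are illustrative assumptions, not the commenter's actual setup; the strength parameter controls how far the result drifts from the source photo.

```python
# Minimal img2img style-blending sketch; model id and inputs are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # assumed base model
).to("cuda")

# Start from your own photograph, resized to the model's native resolution.
init = Image.open("my_photo.jpg").convert("RGB").resize((512, 512))

image = pipe(
    prompt="oil painting in the style of Titian, surrealist composition",
    image=init,
    strength=0.6,        # lower keeps more of the photo, higher more of the style
    guidance_scale=7.5,
).images[0]
image.save("blend.png")
```

Blending two artists is usually just a matter of naming both in the prompt and letting the strength setting arbitrate between them and the source photo.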
How come this technology appears to be used exclusively to generate fake pictures of unrealistically good-looking women? And to what end?
I don't know if it was me misconfiguring it, or if the images in the post were really cherry-picked.
You need to simulate poor lighting, dirt, soul, realistic beauty, etc. Perhaps even situations that give a reason for a photo to be taken other than "I'm a basic heteronormative woman who is attractive."
Actually, it is present in not a single image in that blog post.
If you have a trained eye, that is.