SD3-ControlNet-Depth

Demo

import torch
from diffusers import StableDiffusion3ControlNetPipeline
from diffusers.models import SD3ControlNetModel
from diffusers.utils import load_image

# load pipeline
controlnet = SD3ControlNetModel.from_pretrained("InstantX/SD3-Controlnet-Depth")
pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    controlnet=controlnet
)
pipe.to("cuda", torch.float16)

# config
control_image = load_image("https://huggingface.co/InstantX/SD3-Controlnet-Depth/resolve/main/images/depth.jpeg")
prompt = "a panda cub, captured in a close-up, in forest, is perched on a tree trunk. good composition, Photography, the cub's ears, a fluffy black, are tucked behind its head, adding a touch of whimsy to its appearance. a lush tapestry of green leaves in the background. depth of field, National Geographic"
n_prompt = "bad hands, blurry, NSFW, nude, naked, porn, ugly, bad quality, worst quality"

# fixed seed to reproduce the result shown in our example
generator = torch.Generator(device="cpu").manual_seed(4000)
image = pipe(
    prompt, 
    negative_prompt=n_prompt, 
    control_image=control_image, 
    controlnet_conditioning_scale=0.5,
    guidance_scale=7.0,
    generator=generator
).images[0]
image.save('image.jpg')

Limitation

The model was trained only at 1024*1024 pixel resolution, so inference performs best at this size; other sizes yield suboptimal results.
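One way to stay within this limitation is to bring the control image to the training resolution before calling the pipeline. The following is a minimal sketch, not part of the original demo: the helper names `center_square_box` and `to_training_resolution` are hypothetical, and center-cropping to a square is an assumption that discards content at the edges of non-square depth maps.

```python
TRAIN_SIZE = 1024  # resolution used during training, per the limitation above


def center_square_box(width, height):
    """Return the (left, top, right, bottom) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def to_training_resolution(image, size=TRAIN_SIZE):
    """Center-crop a PIL image to a square, then resize it to size x size.

    `image` is assumed to be a PIL.Image.Image, such as the object
    returned by diffusers.utils.load_image.
    """
    box = center_square_box(*image.size)
    return image.crop(box).resize((size, size))
```

With the demo above, this could be applied as `control_image = to_training_resolution(control_image)` before calling `pipe(...)`. The pipeline call also accepts `height` and `width` arguments, which can be set to 1024 to make the output size explicit.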
