wanghaofan committed · verified
Commit 81e59fc · 1 Parent(s): 940382f

Update README.md

Files changed (1)
  1. README.md +87 -3
README.md CHANGED
@@ -1,3 +1,87 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ library_name: diffusers
+ pipeline_tag: image-to-image
+ tags:
+ - Image-to-Image
+ - ControlNet
+ - Diffusers
+ - QwenImageControlNetInpaintPipeline
+ - Qwen-Image
+ base_model: Qwen/Qwen-Image
+ ---
+
+
+ # Qwen-Image-ControlNet-Inpainting
+ This repository provides a ControlNet that supports mask-based image inpainting and outpainting for [Qwen-Image](https://github.com/QwenLM/Qwen-Image).
+
+
+ # Model Card
+ - This ControlNet consists of 6 double blocks copied from the pretrained transformer layers (see the loading sketch after this list).
+ - We train the model from scratch for 65K steps on a dataset of 10M high-quality general and human images.
+ - We train at 1328x1328 resolution in BFloat16 with a batch size of 128 and a learning rate of 4e-5, and set the text drop ratio to 0.10.
+
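+ As a quick check of the architecture described above, the minimal sketch below loads the ControlNet and prints its configuration, which records the block count; the exact config fields vary with the diffusers version, so treat them as illustrative.
+
+ ```python
+ import torch
+ from diffusers import QwenImageControlNetModel
+
+ # Load the ControlNet weights in BFloat16, the precision used for training.
+ controlnet = QwenImageControlNetModel.from_pretrained(
+     "InstantX/Qwen-Image-ControlNet-Inpainting", torch_dtype=torch.bfloat16
+ )
+
+ # The config records the architecture, including the number of transformer
+ # blocks copied from the base model.
+ print(controlnet.config)
+ ```
+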
+ # Showcases
+ <table style="width:100%; table-layout:fixed;">
+   <tr>
+     <th>Image</th>
+     <th>Mask</th>
+     <th>Result</th>
+   </tr>
+   <tr>
+     <td><img src="./assets/images/image1.png" alt="example1 image"></td>
+     <td><img src="./assets/masks/mask1.png" alt="example1 mask"></td>
+     <td><img src="./assets/results/output1.png" alt="example1 result"></td>
+   </tr>
+   <tr>
+     <td><img src="./assets/images/image2.png" alt="example2 image"></td>
+     <td><img src="./assets/masks/mask2.png" alt="example2 mask"></td>
+     <td><img src="./assets/results/output2.png" alt="example2 result"></td>
+   </tr>
+   <tr>
+     <td><img src="./assets/images/image3.png" alt="example3 image"></td>
+     <td><img src="./assets/masks/mask3.png" alt="example3 mask"></td>
+     <td><img src="./assets/results/output3.png" alt="example3 result"></td>
+   </tr>
+ </table>
+
+ # Inference
+ ```python
+ import torch
+ from diffusers.utils import load_image
+
+ # pip install git+https://github.com/huggingface/diffusers
+ from diffusers import QwenImageControlNetModel, QwenImageControlNetInpaintPipeline
+
+ base_model = "Qwen/Qwen-Image"
+ controlnet_model = "InstantX/Qwen-Image-ControlNet-Inpainting"
+
+ # Load the ControlNet and the base pipeline in BFloat16.
+ controlnet = QwenImageControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)
+ pipe = QwenImageControlNetInpaintPipeline.from_pretrained(
+     base_model, controlnet=controlnet, torch_dtype=torch.bfloat16
+ )
+ pipe.to("cuda")
+
+ # The control image is the original picture; the mask marks the region to repaint.
+ control_image = load_image("https://huggingface.co/InstantX/Qwen-Image-ControlNet-Inpainting/resolve/main/assets/images/image1.png")
+ mask_image = load_image("https://huggingface.co/InstantX/Qwen-Image-ControlNet-Inpainting/resolve/main/assets/masks/mask1.png")
+ prompt = "一辆绿色的出租车行驶在路上"  # "A green taxi driving on the road"
+
+ image = pipe(
+     prompt=prompt,
+     negative_prompt=" ",
+     control_image=control_image,
+     control_mask=mask_image,
+     controlnet_conditioning_scale=1.0,  # full ControlNet strength
+     width=control_image.size[0],
+     height=control_image.size[1],
+     num_inference_steps=30,
+     true_cfg_scale=4.0,
+     generator=torch.Generator(device="cuda").manual_seed(42),
+ ).images[0]
+ image.save("qwenimage_cn_inpaint_result.png")
+ ```
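+
+ The same pipeline also covers outpainting: pad the source image and pass a mask that marks the padded border as the region to synthesize. Below is a minimal sketch using PIL; it assumes, as in the inpainting masks above, that white marks the area to generate and black the area to keep.
+
+ ```python
+ from PIL import Image
+
+ # Pad the source image on all sides; the border is what gets outpainted.
+ src = Image.open("image1.png").convert("RGB")
+ pad = 256
+ canvas = Image.new("RGB", (src.width + 2 * pad, src.height + 2 * pad), "white")
+ canvas.paste(src, (pad, pad))
+
+ # Matching mask: white (255) = synthesize the border, black (0) = keep the original.
+ mask = Image.new("L", canvas.size, 255)
+ mask.paste(Image.new("L", src.size, 0), (pad, pad))
+
+ canvas.save("outpaint_image.png")
+ mask.save("outpaint_mask.png")
+ ```
+
+ Feed `canvas` and `mask` to the pipeline as `control_image` and `control_mask`, exactly as in the inpainting example.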
+
+ # Limitations
+ The model is somewhat sensitive to user prompts. We highly recommend detailed prompts that describe the entire image, covering both the inpainted area and the background. Use descriptive prompts rather than instructive ones.
+
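+ For illustration, two hypothetical prompts for the taxi example above; only the first style is recommended:
+
+ ```python
+ # Descriptive: narrates the whole scene, background included (recommended).
+ prompt = "A green taxi driving down a city street, with buildings and traffic in the background, realistic photo"
+
+ # Instructive: asks for an edit (tends to work poorly with this model).
+ prompt = "Replace the car with a green taxi"
+ ```
+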
+ # Acknowledgements
+ This model was developed by the InstantX Team. All rights reserved.