IterComp (ICLR 2025)

Official repository for the paper "IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation".

News🔥🔥🔥

[2025.02] We open-source three composition-aware reward models on Hugging Face. They can be used for preference learning and as new evaluators for image generation.

[2025.02] We enhance IterComp-RPG with LLMs that have strong reasoning capabilities, including DeepSeek-R1, OpenAI o3-mini, and OpenAI o1, to improve compositional image generation under complex prompts.

[2025.01] IterComp has been accepted to ICLR 2025!

[2024.10] Checkpoints of the base diffusion model are publicly available on Hugging Face.

[2024.10] The main code of IterComp is released.

Introduction

IterComp is one of the recent state-of-the-art methods for compositional text-to-image generation. In this repository, we release the model trained from SDXL Base 1.0.

Text-to-Image Usage

from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained("comin/IterComp", torch_dtype=torch.float16, use_safetensors=True)
pipe.to("cuda")
# if using torch < 2.0
# pipe.enable_xformers_memory_efficient_attention()

prompt = "An astronaut riding a green horse"
image = pipe(prompt=prompt).images[0]
image.save("output.png")
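
The snippet above relies on the pipeline defaults and produces a different image on each run. As a minimal sketch, assuming you want reproducible outputs for several compositional prompts, the helper below fixes the seed per prompt and saves each image under a filename derived from the prompt. The function names (`prompt_to_filename`, `generate_batch`), the CPU fallback, and the values for `steps` and `guidance_scale` are illustrative assumptions, not settings recommended by the IterComp authors.

```python
import re


def prompt_to_filename(prompt: str, index: int, max_len: int = 40) -> str:
    """Turn a prompt into a short, filesystem-safe PNG filename (hypothetical helper)."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    slug = slug[:max_len].rstrip("-")
    return f"{index:02d}_{slug}.png"


def generate_batch(prompts, seed=0, steps=30, guidance_scale=7.5,
                   model_id="comin/IterComp"):
    """Generate one image per prompt with a fixed seed and save each to disk."""
    # Imported lazily so the filename helper works without torch/diffusers installed.
    import torch
    from diffusers import DiffusionPipeline

    # Assumption: fall back to CPU with float32 when CUDA is unavailable (slow).
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    dtype = torch.float16 if use_cuda else torch.float32

    pipe = DiffusionPipeline.from_pretrained(
        model_id, torch_dtype=dtype, use_safetensors=True
    )
    pipe.to(device)

    saved = []
    for i, prompt in enumerate(prompts):
        # Re-seed for every prompt so each image is reproducible on its own.
        generator = torch.Generator(device=device).manual_seed(seed)
        image = pipe(
            prompt=prompt,
            num_inference_steps=steps,    # illustrative value
            guidance_scale=guidance_scale,  # illustrative value
            generator=generator,
        ).images[0]
        path = prompt_to_filename(prompt, i)
        image.save(path)
        saved.append(path)
    return saved
```

Calling `generate_batch(["An astronaut riding a green horse"])` would download the checkpoint on first use and write the image to the current directory. Re-running with the same `seed` on the same hardware and library versions should give the same image.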

IterComp can serve as a powerful backbone for various compositional generation methods, such as RPG and Omost. We recommend integrating IterComp into these approaches to achieve more advanced compositional generation results.

Citation

@article{zhang2024itercomp,
  title={IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation},
  author={Zhang, Xinchen and Yang, Ling and Li, Guohao and Cai, Yaqi and Xie, Jiake and Tang, Yong and Yang, Yujiu and Wang, Mengdi and Cui, Bin},
  journal={arXiv preprint arXiv:2410.07171},
  year={2024}
}
