---
license: apache-2.0
---

# IterComp (ICLR 2025)

Official repository of the paper *[IterComp](https://arxiv.org/abs/2410.07171)*.

<p align="left">
  <a href='https://arxiv.org/abs/2410.07171'>
  <img src='https://img.shields.io/badge/Arxiv-2410.07171-A42C25?style=flat&logo=arXiv&logoColor=A42C25'></a>
  <a href='https://github.com/YangLing0818/IterComp'>
  <img src='https://img.shields.io/badge/GitHub-Code-black?style=flat&logo=github&logoColor=white'></a>
</p>

<img src="./itercomp.png" style="zoom:50%;" />

## News🔥🔥🔥

**[2025.02]** We open-sourced three composition-aware reward models in the [HuggingFace Repo](https://huggingface.co/comin/IterComp/tree/main/reward_models). They can be used for preference learning and as **new image generation evaluators**.

**[2025.02]** We enhanced IterComp-RPG with LLMs that have strong reasoning capabilities, including [**DeepSeek-R1**](https://github.com/deepseek-ai/DeepSeek-R1), [**OpenAI o3-mini**](https://openai.com/index/openai-o3-mini/), and [**OpenAI o1**](https://openai.com/index/learning-to-reason-with-llms/), to achieve outstanding compositional image generation under complex prompts.

**[2025.01]** IterComp was accepted to ICLR 2025!

**[2024.10]** Checkpoints of the base diffusion model are publicly available in the [HuggingFace Repo](https://huggingface.co/comin/IterComp).

**[2024.10]** The main code of IterComp is released.

## Introduction

IterComp is one of the new state-of-the-art compositional generation methods. In this repository, we release the model trained from [SDXL Base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).

## Text-to-Image Usage

```python
from diffusers import DiffusionPipeline
import torch

# Load the IterComp checkpoint in half precision and move it to the GPU.
pipe = DiffusionPipeline.from_pretrained("comin/IterComp", torch_dtype=torch.float16, use_safetensors=True)
pipe.to("cuda")
# if using torch < 2.0
# pipe.enable_xformers_memory_efficient_attention()

# Generate one image from a compositional prompt and save it to disk.
prompt = "An astronaut riding a green horse"
image = pipe(prompt=prompt).images[0]
image.save("output.png")
```
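
When comparing compositional prompts, it can help to fix the random seed and state the sampling settings explicitly. The sketch below is one way to do that, not the authors' recommended configuration. The helper names `build_generation_kwargs` and `generate` are hypothetical, and the default values (50 steps, guidance scale 5.0, 1024x1024) are assumptions that mirror the usual SDXL pipeline defaults in Diffusers rather than settings taken from the IterComp paper.

```python
from typing import Any, Dict

MODEL_ID = "comin/IterComp"  # repository named in this model card


def build_generation_kwargs(
    prompt: str,
    num_inference_steps: int = 50,   # assumed default, not tuned for IterComp
    guidance_scale: float = 5.0,     # assumed default, not tuned for IterComp
    height: int = 1024,
    width: int = 1024,
) -> Dict[str, Any]:
    """Collect explicit sampling settings to pass to the pipeline call."""
    # SDXL pipelines in Diffusers reject sizes that are not multiples of 8.
    if height % 8 != 0 or width % 8 != 0:
        raise ValueError("height and width must be multiples of 8")
    return {
        "prompt": prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "height": height,
        "width": width,
    }


def generate(prompt: str, seed: int = 0, device: str = "cuda", **overrides: Any):
    """Generate one image with a fixed seed (hypothetical helper)."""
    # Heavy imports are deferred so the helper above works without torch installed.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, use_safetensors=True
    )
    pipe.to(device)

    # A seeded generator makes repeated runs on the same setup comparable.
    generator = torch.Generator(device=device).manual_seed(seed)
    kwargs = build_generation_kwargs(prompt, **overrides)
    return pipe(generator=generator, **kwargs).images[0]
```

For example, `generate("An astronaut riding a green horse", seed=42, num_inference_steps=30).save("output_seed42.png")` would write one image. Changing only the prompt while keeping the seed and settings fixed gives a fairer side-by-side comparison.
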

IterComp can **serve as a powerful backbone for various compositional generation methods**, such as [RPG](https://github.com/YangLing0818/RPG-DiffusionMaster) and [Omost](https://github.com/lllyasviel/Omost). We recommend integrating IterComp into these approaches to achieve more advanced compositional generation results.

## Citation

```
@article{zhang2024itercomp,
  title={IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation},
  author={Zhang, Xinchen and Yang, Ling and Li, Guohao and Cai, Yaqi and Xie, Jiake and Tang, Yong and Yang, Yujiu and Wang, Mengdi and Cui, Bin},
  journal={arXiv preprint arXiv:2410.07171},
  year={2024}
}
```