DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation

This repository contains the DraCo model, presented in the paper DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation.

DraCo proposes a novel interleaved reasoning paradigm that uses both textual and visual content in the Chain-of-Thought (CoT) to improve planning and verification in text-to-image generation. The method first generates a low-resolution draft image as a preview, which provides concrete visual planning and guidance. It then checks the draft for semantic misalignments with the input prompt and refines it through selective corrections combined with super-resolution.
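
The draft, verify, refine flow described above can be sketched as plain control flow. This is only an illustration, not the official DraCo code (which is not yet released): the names `draft_then_refine`, `DraftResult`, and the three `toy_*` stand-ins are hypothetical, and "images" are represented as strings so the example runs without any model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DraftResult:
    """Intermediate and final outputs of one draft-then-refine pass."""
    draft: str          # stand-in for the low-resolution preview
    issues: List[str]   # misalignments found between draft and prompt
    final: str          # stand-in for the refined, higher-resolution output


def draft_then_refine(
    prompt: str,
    generate_draft: Callable[[str], str],
    verify: Callable[[str, str], List[str]],
    refine: Callable[[str, str, List[str]], str],
) -> DraftResult:
    """Hypothetical orchestration of the three stages (assumed interface)."""
    draft = generate_draft(prompt)         # 1. low-resolution preview
    issues = verify(prompt, draft)         # 2. compare draft against prompt
    final = refine(prompt, draft, issues)  # 3. selective correction + upscaling
    return DraftResult(draft=draft, issues=issues, final=final)


# Toy stand-ins (assumptions for illustration only; no real model is called).
def toy_generate_draft(prompt: str) -> str:
    # Pretend the draft misses the last word of the prompt.
    return " ".join(prompt.split()[:-1])


def toy_verify(prompt: str, draft: str) -> List[str]:
    # Report prompt words that do not appear in the draft.
    drawn = set(draft.split())
    return [word for word in prompt.split() if word not in drawn]


def toy_refine(prompt: str, draft: str, issues: List[str]) -> str:
    # Keep what the draft got right and add only the missing pieces.
    return " ".join(draft.split() + issues)


if __name__ == "__main__":
    result = draft_then_refine(
        "a purple giraffe", toy_generate_draft, toy_verify, toy_refine
    )
    print(result)
```

Passing the stages in as callables keeps the sketch independent of any particular model; in practice a single unified model may play all three roles.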

Code

The official implementation is hosted on GitHub: https://github.com/CaraJ7/DraCo (Note: the GitHub repository currently lists the code as "coming soon.")

Citation

If you find DraCo useful for your research, please consider citing the paper:

@article{jiang2025draco,
      title={DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation},
      author={Jiang, Dongzhi and Zhang, Renrui and Li, Haodong and Zong, Zhuofan and Guo, Ziyu and He, Jun and Guo, Claire and Ye, Junyan and Fang, Rongyao and Li, Weijia and Liu, Rui and Li, Hongsheng},
      year={2025},
      eprint={2512.05112},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.05112},
}
