# Texture-Preserving Multimodal Fashion Image Editing with Diffusion Models
## Overview
TP-MGD is a new method for texture-preserving multimodal fashion image editing based on diffusion models. It enables high-quality fashion image generation and editing with a lightweight architecture while preserving fine-grained texture details.
## TODO
- Release training code
- Release inference code
- Release processed datasets
- Release checkpoints to Hugging Face
- Create comprehensive documentation
## Quick Start

### Installation

```bash
git clone https://github.com/zibingo/TP-MGD.git
cd TP-MGD
```

Requirements:

- Python 3.9+
- PyTorch >= 2.5.0
- CUDA >= 12.4

Install the Python dependencies:

```bash
pip install diffusers accelerate transformers opencv-python einops wandb open_clip_torch
```
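Before training, a quick sanity check like the following (a minimal sketch, not part of the repository) can confirm that the required PyTorch/CUDA versions and the core libraries are visible:

```python
# Environment sanity check (illustrative; not part of the repo).
import torch
import diffusers
import transformers

print("PyTorch:", torch.__version__)             # expected >= 2.5.0
print("CUDA available:", torch.cuda.is_available())
print("CUDA build:", torch.version.cuda)         # expected >= 12.4
print("diffusers:", diffusers.__version__)
print("transformers:", transformers.__version__)
```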
### Download Pre-trained Models

```bash
wget https://huggingface.co/h94/IP-Adapter/resolve/main/models/ip-adapter_sd15.bin
```
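If `wget` is not convenient, the same checkpoint can be fetched with the `huggingface_hub` client (installed as a dependency of `diffusers`); the `local_dir` below is only an example location:

```python
# Alternative download via huggingface_hub (local_dir is illustrative).
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="h94/IP-Adapter",
    filename="models/ip-adapter_sd15.bin",
    local_dir="pretrained",  # adjust to wherever you keep pre-trained weights
)
print(path)
```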
## Dataset Setup

### VITON-HD Dataset

1. Download VITON-HD: get the original dataset from VITON-HD.
2. Download the MGD multimodal data: get the additional data from MGD.
3. Download the preprocessed textures:

   ```bash
   wget https://huggingface.co/zibingo/TP-MGD/resolve/main/vitonhd-texture.zip
   ```

4. Configuration: set `dataroot_path` in the YAML files under the `configs/` directory.
Directory Structure:

```
├── captions.json (from MGD)
├── test/
│   ├── agnostic-mask/
│   ├── agnostic-v3.2/
│   ├── cloth/
│   ├── cloth-mask/
│   ├── cloth-texture/ (from Ours)
│   ├── im_sketch/ (from MGD)
│   ├── im_sketch_unpaired/ (from MGD)
│   ├── image/
│   ├── image-densepose/
│   ├── image-parse-agnostic-v3.2/
│   ├── image-parse-v3/
│   ├── openpose_img/
│   └── openpose_json/
├── test_pairs.txt
├── train/
│   ├── agnostic-mask/
│   ├── agnostic-v3.2/
│   ├── cloth/
│   ├── cloth-mask/
│   ├── cloth-texture/ (from Ours)
│   ├── gt_cloth_warped_mask/
│   ├── im_sketch/ (from MGD)
│   ├── image/
│   ├── image-densepose/
│   ├── image-parse-agnostic-v3.2/
│   ├── image-parse-v3/
│   ├── openpose_img/
│   └── openpose_json/
└── train_pairs.txt
```
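A small check along these lines (purely illustrative; the folder names are taken from the layout above and the path is a placeholder) can confirm that the dataset root is assembled correctly before training:

```python
# Verify the VITON-HD dataroot layout (illustrative helper, not part of the repo).
from pathlib import Path

dataroot = Path("/path/to/vitonhd")  # set this to your dataroot_path
expected = [
    "captions.json", "test_pairs.txt", "train_pairs.txt",
    "test/cloth", "test/cloth-texture", "test/im_sketch", "test/image",
    "train/cloth", "train/cloth-texture", "train/im_sketch", "train/image",
]
missing = [p for p in expected if not (dataroot / p).exists()]
print("Dataset layout looks complete." if not missing else f"Missing entries: {missing}")
```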
### DressCode Dataset

1. Download DressCode: get the original dataset from DressCode.
2. Download the MGD multimodal data: get the additional data from MGD.
3. Download the preprocessed textures:

   ```bash
   wget https://huggingface.co/zibingo/TP-MGD/resolve/main/dresscode-texture.zip
   ```

4. Configuration: set `dataroot_path` in the YAML files under the `configs/` directory.
Directory Structure:

```
├── dresses/
│   ├── dense/
│   ├── dresses_cloth-texture/ (from Ours)
│   ├── im_sketch/ (from MGD)
│   ├── im_sketch_unpaired/ (from MGD)
│   ├── images/
│   ├── keypoints/
│   ├── label_maps/
│   ├── test_pairs_paired.txt
│   ├── test_pairs_unpaired.txt
│   └── train_pairs.txt
├── lower_body/
│   ├── dense/
│   ├── im_sketch/ (from MGD)
│   ├── im_sketch_unpaired/ (from MGD)
│   ├── images/
│   ├── keypoints/
│   ├── label_maps/
│   ├── lower_body_cloth-texture/ (from Ours)
│   ├── test_pairs_paired.txt
│   ├── test_pairs_unpaired.txt
│   └── train_pairs.txt
├── upper_body/
│   ├── dense/
│   ├── im_sketch/ (from MGD)
│   ├── im_sketch_unpaired/ (from MGD)
│   ├── images/
│   ├── keypoints/
│   ├── label_maps/
│   ├── test_pairs_paired.txt
│   ├── test_pairs_unpaired.txt
│   ├── train_pairs.txt
│   └── upper_body_cloth-texture/ (from Ours)
├── coarse_captions.json (from MGD)
├── fine_captions.json (from MGD)
├── multigarment_test_triplets.txt
├── readme.txt
├── test_pairs_paired.txt
├── test_pairs_unpaired.txt
├── test_stitch_map/ (from MGD)
└── train_pairs.txt
```
## Usage

### Training

Single GPU:

```bash
python train_vitonhd.py
python train_dresscode.py
```

Multi-GPU:

```bash
CUDA_VISIBLE_DEVICES=0,1 accelerate launch train_vitonhd.py
CUDA_VISIBLE_DEVICES=0,1 accelerate launch train_dresscode.py
```
### Inference

- Download the pre-trained weights from Hugging Face and place them in the `checkpoints/` directory
- Update the configuration: modify the `resume_state` parameter in the YAML files under the `configs/` directory so that it points to your checkpoint directory

Single GPU:

```bash
python inference_vitonhd.py
python inference_dresscode.py
```

Multi-GPU:

```bash
CUDA_VISIBLE_DEVICES=0,1 accelerate launch inference_vitonhd.py
CUDA_VISIBLE_DEVICES=0,1 accelerate launch inference_dresscode.py
```
## Project Structure

```
TP-MGD/
├── configs/                  # Configuration files
├── checkpoints/              # Pre-trained model weights
├── assets/                   # Sample images
├── train_vitonhd.py          # VITON-HD training script
├── train_dresscode.py        # DressCode training script
├── inference_vitonhd.py      # VITON-HD inference script
├── inference_dresscode.py    # DressCode inference script
├── datasets.py               # Dataset loading utilities
└── attention_processor.py    # Custom attention mechanisms
```
## Configuration

Key configuration parameters in `configs/*.yaml`:

- `dataroot_path`: path to your dataset
- `resume_state`: path to the checkpoint used for inference or for resuming training
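As a rough illustration (the exact file names and schema are defined by the YAML files shipped in `configs/`), the values can be inspected with PyYAML before launching a run:

```python
# Inspect a config before launching (file name is an example; schema follows the repo's YAML).
import yaml

with open("configs/vitonhd.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg.get("dataroot_path"))  # dataset root
print(cfg.get("resume_state"))   # checkpoint for inference or for resuming training
```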
## Acknowledgments
- Our code is based on Diffusers
- We use Stable Diffusion v1.5 inpainting as the base model
- Thanks to VITON-HD, DressCode, and MGD for providing the public datasets
## Contact
For questions and support, please open an issue on GitHub or contact the authors.
If you find this project helpful, please give it a star!