Papers
arxiv:2405.14430

PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference

Published on May 23, 2024
Authors:
,
,
,
,

Abstract

PipeFusion is a parallel strategy for diffusion transformer models that partitions images and model layers across GPUs to reduce latency and communication costs, improving memory efficiency and performance.

AI-generated summary

This paper presents PipeFusion, an innovative parallel methodology to tackle the high latency issues associated with generating high-resolution images using diffusion transformers (DiTs) models. PipeFusion partitions images into patches and the model layers across multiple GPUs. It employs a patch-level pipeline parallel strategy to orchestrate communication and computation efficiently. By capitalizing on the high similarity between inputs from successive diffusion steps, PipeFusion reuses one-step stale feature maps to provide context for the current pipeline step. This approach notably reduces communication costs compared to existing DiTs inference parallelism, including tensor parallel, sequence parallel and DistriFusion. PipeFusion enhances memory efficiency through parameter distribution across devices, ideal for large DiTs like Flux.1. Experimental results demonstrate that PipeFusion achieves state-of-the-art performance on 8timesL40 PCIe GPUs for Pixart, Stable-Diffusion 3, and Flux.1 models. Our source code is available at https://github.com/xdit-project/xDiT.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2405.14430 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2405.14430 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2405.14430 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.