atedm.github.io - AT-EDM: Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models

Description: We introduce the Attention-driven Training-free Efficient Diffusion Model (AT-EDM), a framework that leverages attention maps to perform run-time pruning of redundant tokens during inference without retraining.



Diffusion models (DMs) have exhibited superior performance in generating high-quality and diverse images. However, this exceptional performance comes at the cost of computationally expensive architectures, particularly the attention modules used heavily in leading models. Existing works mainly rely on retraining to improve efficiency, which is computationally expensive and less scalable.

To this end, we introduce the Attention-driven Training-free Efficient Diffusion Model (AT-EDM), a framework that leverages attention maps to perform run-time pruning of redundant tokens during inference without retraining.
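The core idea of attention-driven pruning can be illustrated with a minimal sketch: score each token by how much attention it receives, then keep only the top-scoring tokens. This is a simplified stand-in for the framework's actual scoring, and the function name, `keep_ratio` parameter, and column-sum heuristic are illustrative assumptions, not the paper's method.

```python
import numpy as np

def prune_tokens(x, attn, keep_ratio=0.5):
    """Drop low-importance tokens using column sums of an attention map.

    x:    (N, D) token features
    attn: (N, N) row-stochastic attention map (softmax over keys)

    Illustrative heuristic: a token that receives little attention from
    other tokens is treated as a pruning candidate.
    """
    importance = attn.sum(axis=0)                # total attention received per token
    k = max(1, int(keep_ratio * x.shape[0]))
    keep_idx = np.argsort(importance)[::-1][:k]  # indices of the top-k tokens
    keep_idx.sort()                              # preserve original token order
    return x[keep_idx], keep_idx
```

Because this runs on the attention maps already computed during inference, it adds no trainable parameters, which is what makes the approach training-free.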

Specifically, we develop a single-denoising-step pruning strategy, Generalized Weighted Page Rank (G-WPR), to identify redundant tokens, along with a similarity-based recovery method to restore token features for convolution operations. Additionally, we propose Denoising-Steps-Aware Pruning (DSAP), which adjusts the pruning budget across denoising timesteps for better generation quality.
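The two components above can be sketched as follows. The first function runs a weighted-PageRank-style iteration over the attention map to rank tokens, in the spirit of G-WPR; the second fills pruned positions with the most cosine-similar kept token so that subsequent convolutions see a full token grid. The iteration count, damping factor, and function signatures are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def g_wpr_scores(attn, n_iters=10, damping=0.85):
    """Weighted-PageRank-style token importance from an attention map.

    Treats attn[i, j] as an edge from query token i to key token j and
    propagates rank mass along it. (n_iters and damping are illustrative
    choices, not the paper's settings.)
    """
    n = attn.shape[0]
    rank = np.full(n, 1.0 / n)
    for _ in range(n_iters):
        rank = (1 - damping) / n + damping * (attn.T @ rank)
        rank /= rank.sum()  # keep the scores normalized
    return rank

def recover_tokens(x_full, x_kept, keep_idx):
    """Similarity-based recovery sketch: fill each pruned slot with the
    most cosine-similar kept token so convolution layers see a full grid."""
    out = np.array(x_full, copy=True)
    kept_norm = x_kept / np.linalg.norm(x_kept, axis=1, keepdims=True)
    pruned_idx = np.setdiff1d(np.arange(x_full.shape[0]), keep_idx)
    for i in pruned_idx:
        v = x_full[i] / np.linalg.norm(x_full[i])
        out[i] = x_kept[np.argmax(kept_norm @ v)]   # copy nearest kept token
    out[keep_idx] = x_kept                           # kept tokens pass through
    return out
```

A DSAP-style schedule would then vary the `keep_ratio` fed to the pruning step across timesteps, pruning more aggressively in some denoising steps than others to protect generation quality.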
