Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search
Diffusion models have recently shown remarkable generation ability, achieving state-of-the-art performance in many tasks. However, the high computational cost is still a troubling problem for diffusion models. To tackle this problem, we propose to automatically remove the structural redundancy in diffusion models with our proposed Diffusion Distillation-based Block-wise Neural Architecture Search (DiffNAS). Specifically, given a larger pretrained teacher, we leverage DiffNAS to search for the smallest architecture which can achieve on-par or even better performance than the teacher. Considering current diffusion models are based on UNet which naturally has a block-wise structure, we perform neural architecture search independently in each block, which largely reduces the search space. Different from previous block-wise NAS methods, DiffNAS contains a block-wise local search strategy and a retraining strategy with a joint dynamic loss. Concretely, during the search process, we block-wisely select the best subnet to avoid the unfairness brought by the global search strategy used in previous works. When retraining the searched architecture, we adopt a dynamic joint loss to maintain the consistency between supernet training and subnet retraining, which also provides informative objectives for each block and shortens the paths of gradient propagation. We demonstrate this joint loss can effectively improve model performance. We also prove the necessity of the dynamic adjustment of this loss. The experiments show that our method can achieve significant computational reduction, especially on latent diffusion models with about 50% MACs and Parameter reduction.
updated: Wed Nov 15 2023 07:37:28 GMT+0000 (UTC)
published: Wed Nov 08 2023 12:56:59 GMT+0000 (UTC)