Hybrid SD: Edge-Cloud Collaborative Inference for Stable Diffusion Models
Introduction
Hybrid SD is a novel framework designed for edge-cloud collaborative inference of Stable Diffusion Models. By integrating superior large models on cloud servers with efficient small models on edge devices, Hybrid SD achieves state-of-the-art parameter efficiency on edge devices while maintaining competitive visual quality.
Installation
Pretrained Models
We provide a number of pretrained models as follows:
Hybrid Inference
SD Models
To use Hybrid SD for inference, launch scripts/hybrid_sd/hybird_sd.sh and specify the large and small models. For hybrid inference with SDXL models, refer to scripts/hybrid_sd/hybird_sdxl.sh accordingly.
Optional arguments
PATH_MODEL_LARGE: the large model path.
PATH_MODEL_SMALL: the small model path.
--step: how denoising steps are split between the models (e.g., "10,15" runs the first 10 steps on the large model and the last 15 steps on the small model).
--seed: the random seed.
--img_sz: the image size.
--prompts_file: the .txt file containing the prompts.
--output_dir: the output directory for saving generated images.
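The --step schedule above can be sketched as follows. This is a minimal illustration of the scheduling idea, not the repository's actual code; parse_step_split and run_hybrid are hypothetical names.

```python
# Sketch of the --step schedule: "10,15" runs the first 10 denoising steps
# on the large (cloud) model and the remaining 15 on the small (edge) model.
# Function and model names are illustrative, not the repo's API.

def parse_step_split(step_arg: str):
    """Parse a --step value like "10,15" into (large_steps, small_steps)."""
    large, small = (int(s) for s in step_arg.split(","))
    return large, small

def run_hybrid(latent, large_model, small_model, step_arg="10,15"):
    """Denoise `latent`, handing off from the large to the small model."""
    large_steps, small_steps = parse_step_split(step_arg)
    for t in range(large_steps + small_steps):
        model = large_model if t < large_steps else small_model
        latent = model(latent, t)
    return latent

# Toy "models": each just records that it was called.
calls = []
big = lambda x, t: calls.append("L") or x
tiny = lambda x, t: calls.append("S") or x
run_hybrid(0.0, big, tiny, "2,3")
print("".join(calls))  # LLSSS
```

In a real pipeline the handoff point trades quality for edge-side compute: more early steps on the large model preserve global structure, while the small model refines cheaply.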
Latent Consistency Models (LCMs)
To use Hybrid SD for LCMs, launch scripts/hybrid_sd/hybird_lcm.sh and specify the large and small models. You also need to pass TEACHER_MODEL_PATH to load the VAE, tokenizer, and text encoder.
Evaluation on MS-COCO Benchmark
Evaluate hybrid inference with SD Models on MS-COCO 2014 30K.
bash scripts/hybrid_sd/generate_dpm_eval.sh
Evaluate hybrid inference with LCMs on MS-COCO 2014 30K.
bash scripts/hybrid_sd/generate_lcm_eval.sh
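FID, the metric typically reported on this benchmark, compares Gaussian statistics of Inception features from generated and reference images. Below is a minimal NumPy/SciPy sketch of the FID formula only, assuming feature vectors have already been extracted; the evaluation scripts above handle the full pipeline.

```python
import numpy as np
from scipy import linalg

def fid(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Frechet distance between Gaussians fit to two (N, D) feature sets."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = linalg.sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):   # discard tiny imaginary parts from sqrtm
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

rng = np.random.default_rng(0)
x = rng.normal(size=(512, 8))
y = rng.normal(loc=1.0, size=(512, 8))
print(abs(fid(x, x)) < 1e-4)  # identical feature sets -> FID near 0
print(fid(x, y) > 0.5)        # shifted distribution -> clearly positive
```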
Training
Pruning U-Net
# Prune the U-Net using significance scores.
bash scripts/prune_sd/prune_tiny.sh
# Fine-tune the pruned U-Net.
bash scripts/prune_sd/kd_finetune_tiny.sh
Following BK-SDM, we use the preprocessed_212k dataset.
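Significance-score pruning can be sketched as follows. The scoring criterion here (mean absolute weight magnitude) and the function names are illustrative assumptions; the actual criterion used by prune_tiny.sh may differ.

```python
import numpy as np

def significance(weight: np.ndarray) -> float:
    """Illustrative block score: mean absolute weight magnitude."""
    return float(np.abs(weight).mean())

def prune_blocks(blocks: dict, keep_ratio: float = 0.5) -> dict:
    """Keep the highest-scoring fraction of U-Net blocks."""
    ranked = sorted(blocks, key=lambda n: significance(blocks[n]), reverse=True)
    keep = ranked[: max(1, int(len(ranked) * keep_ratio))]
    return {name: blocks[name] for name in keep}

# Four toy "blocks" with clearly different weight magnitudes.
blocks = {
    "a": np.full((4, 4), 0.01),
    "b": np.full((4, 4), 1.0),
    "c": np.full((4, 4), 0.5),
    "d": np.full((4, 4), 0.1),
}
kept = prune_blocks(blocks, keep_ratio=0.5)
print(sorted(kept))  # ['b', 'c'] -- the two highest-magnitude blocks survive
```

The pruned model is then fine-tuned (here via knowledge distillation, as in kd_finetune_tiny.sh) to recover quality lost by removing blocks.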
Training our lightweight VAE
We optimize the VAE with LPIPS loss and adversarial loss. We adopt the discriminator from StyleGAN-T along with several data augmentation and degradation techniques to enhance the VAE.
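In sketch form, the combined objective pairs a perceptual (LPIPS) term with an adversarial term. The hinge-style GAN loss below is one common formulation and an assumption on our part; the exact losses and weights in the training script may differ.

```python
import numpy as np

def hinge_d_loss(real_logits: np.ndarray, fake_logits: np.ndarray) -> float:
    """Discriminator hinge loss: push real logits above +1, fake below -1."""
    return float(np.mean(np.maximum(0.0, 1.0 - real_logits)) +
                 np.mean(np.maximum(0.0, 1.0 + fake_logits)))

def generator_loss(fake_logits: np.ndarray, lpips_dist: float,
                   adv_weight: float = 0.5) -> float:
    """VAE objective: perceptual (LPIPS) distance + weighted adversarial term."""
    adv = float(-np.mean(fake_logits))  # reward fooling the discriminator
    return lpips_dist + adv_weight * adv

real = np.array([2.0, 1.5])    # discriminator confident on real images
fake = np.array([-2.0, -1.0])  # and on reconstructions
print(hinge_d_loss(real, fake))                           # 0.0: margins satisfied
print(round(generator_loss(fake, lpips_dist=0.3), 2))     # 0.3 + 0.5 * 1.5 = 1.05
```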
Training LCMs
Train accelerated Latent Consistency Models (LCMs) using the following scripts.
Distilling SD models to LCMs
bash scripts/hybrid_sd/lcm_t2i_sd.sh
Distilling Pruned SD models to LCMs
bash scripts/hybrid_sd/lcm_t2i_tiny.sh
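At the core of LCM distillation is a consistency loss: the student's prediction at step t_{n+1} is pulled toward the prediction of an EMA (target) copy at t_n. The scalar toy below only illustrates that loss shape and the EMA update; all names and values are illustrative, not the training scripts' internals.

```python
def ema_update(target: float, online: float, decay: float = 0.95) -> float:
    """EMA (target) parameters slowly track the online student."""
    return decay * target + (1.0 - decay) * online

def consistency_loss(f_student, f_target, x_next, t_next, x_prev, t_prev):
    """Squared gap between student at t_{n+1} and EMA target at t_n."""
    return (f_student(x_next, t_next) - f_target(x_prev, t_prev)) ** 2

# Toy consistency function f(x, t) = w * x with one scalar parameter w.
w_student, w_target = 1.5, 1.5
f_s = lambda x, t: w_student * x
f_t = lambda x, t: w_target * x
loss = consistency_loss(f_s, f_t, x_next=0.8, t_next=5, x_prev=0.9, t_prev=4)
print(round(loss, 4))  # (1.5*0.8 - 1.5*0.9)^2 = 0.0225
w_target = ema_update(w_target, w_student)  # target follows the student
```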
Results
Hybrid SDXL Inference
VAEs
Our tiny VAE vs. TAESD
Our VAE shows better visual quality and finer detail than TAESD, and also achieves better FID scores than TAESD on the MS-COCO 2017 5K dataset.
Our small VAE vs. Baseline
Acknowledgments
Citation
If you find our work helpful, please cite it!
@article{yan2024hybrid,
title={Hybrid SD: Edge-Cloud Collaborative Inference for Stable Diffusion Models},
author={Yan, Chenqian and Liu, Songwei and Liu, Hongjian and Peng, Xurui and Wang, Xiaojian and Chen, Fangming and Fu, Lean and Mei, Xing},
journal={arXiv preprint arXiv:2408.06646},
year={2024}
}
License
This project is licensed under the Apache-2.0 License.