Jittor Style Transfer Image Generation Competition DreamBooth

Introduction

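This project contains the code for the style transfer image generation competition of the 4th Jittor Challenge. It is built on Stable Diffusion 2.1 and DreamBooth, and experiments with several methods, including Mask Attention, SNR, Prior Loss, Prompt Engineering, BNN, Dropout, and StyleAligned. It reached a total score of 0.4796.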

Environment Setup

First, install the required dependencies by following the JDiffusion installation guide. The peft library is also required.

Both sets of dependencies are collected in requirements.txt and can be installed directly with the following command:

pip install -r requirements.txt
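
After installation, a quick import check can confirm that the environment is usable. The sketch below is only an illustration and is not part of the repository: `missing_modules` is a hypothetical helper, and the module names `jittor`, `JDiffusion`, and `peft` are assumptions about the import names these packages expose.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed import names; adjust them if the installed packages differ.
    required = ["jittor", "JDiffusion", "peft"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If any module is reported as missing, rerunning the pip command above is the first thing to try.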

Training

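First, download the dataset from the competition cloud drive;
In train_all.sh, set BASE_INSTANCE_DIR to the dataset directory, GPU_COUNT to the number of available GPUs, and MAX_NUM to the number of styles in the dataset;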

The prompt file under each style must be edited to follow a format like this:

      {
          "style": "<style_xx>some description for the style</style_xx>",
          "targets": {
              "0": {
                  "name": "Library",
                  "prompt": "A grand library with bookshelves and reading tables"
              },
              "1": {
                  "name": "Tomato",
                  "prompt": "Ripe red tomato on a vine"
              }
          }
      }

Here, style is the description of the style, and targets holds the descriptions of the corresponding targets.
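
Because each prompt file is edited by hand, it can help to check its structure before training. The following is a minimal sketch under the assumption that the layout shown above is the complete schema; `validate_prompt_file` and `load_prompt_file` are hypothetical helpers and are not part of this repository.

```python
import json


def validate_prompt_file(data):
    """Return a list of problems found in a parsed prompt file (empty if none)."""
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]
    problems = []
    if not isinstance(data.get("style"), str):
        problems.append('"style" must be a string')
    targets = data.get("targets")
    if not isinstance(targets, dict):
        problems.append('"targets" must be an object')
        return problems
    for key, target in targets.items():
        if not isinstance(target, dict):
            problems.append(f'target "{key}" must be an object')
            continue
        # Each target is expected to carry a name and a prompt string
        for field in ("name", "prompt"):
            if not isinstance(target.get(field), str):
                problems.append(f'target "{key}" is missing a string "{field}"')
    return problems


def load_prompt_file(path):
    """Load a prompt file and raise ValueError if its structure is off."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    problems = validate_prompt_file(data)
    if problems:
        raise ValueError("; ".join(problems))
    return data
```

Calling `load_prompt_file` on the prompt file of each style before launching training surfaces formatting mistakes early, rather than partway through a run.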

Then run bash train_all.sh to start training.

Inference

In run_all.py, set dataset_root to the dataset directory and max_num to the number of styles in the dataset;
Run python run_all.py to perform inference. The generated images are written to the output folder.

Acknowledgements

This project is built on the DreamBooth framework.

@inproceedings{ruiz2023dreambooth,
  title={Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation},
  author={Ruiz, Nataniel and Li, Yuanzhen and Jampani, Varun and Pritch, Yael and Rubinstein, Michael and Aberman, Kfir},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}