AdjointDPM is a method that can both finetune the parameters of DPMs, including network parameters and text embeddings, and perform guided sampling with accurate gradient guidance, based on a differentiable metric defined on the generated content.
Several experiments demonstrate the effectiveness of AdjointDPM. The finetuning tasks, including stylization and text embedding inversion, are implemented based on adjoint. Guided sampling is implemented in img_guided_sampling. For security auditing under an ImageNet classifier, the code is based heavily on the dpm-solver codebase; see ddpm_and_guided-diffusion.
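As a rough illustration of the gradient computation behind this kind of method, the sketch below applies the adjoint sensitivity method to a toy scalar ODE. It is not the repository's code: the drift `f`, the helper names (`rk4`, `generate`, `loss`, `adjoint_gradients`), and the quadratic loss are all assumptions made for illustration, with the scalar ODE standing in for the sampling ODE of a DPM. The sketch solves the ODE forward to get a "generated" value, then integrates an augmented ODE backward in time to obtain the gradient of the loss with respect to both the initial state and the parameter.

```python
import math

# Hypothetical toy drift standing in for the sampling ODE's right-hand side.
def f(x, t, theta):
    return -theta * x + math.sin(t)

def df_dx(x, t, theta):      # partial derivative of f w.r.t. the state
    return -theta

def df_dtheta(x, t, theta):  # partial derivative of f w.r.t. the parameter
    return -x

def rk4(func, y0, t0, t1, steps=200):
    """Integrate dy/dt = func(y, t) from t0 to t1 (t1 < t0 runs backward)."""
    h = (t1 - t0) / steps
    y, t = list(y0), t0
    for _ in range(steps):
        k1 = func(y, t)
        k2 = func([a + 0.5 * h * b for a, b in zip(y, k1)], t + 0.5 * h)
        k3 = func([a + 0.5 * h * b for a, b in zip(y, k2)], t + 0.5 * h)
        k4 = func([a + h * b for a, b in zip(y, k3)], t + h)
        y = [a + h / 6.0 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        t += h
    return y

def generate(x0, theta, T=1.0):
    """Forward solve: map the initial state x0 to the final state x(T)."""
    return rk4(lambda y, t: [f(y[0], t, theta)], [x0], 0.0, T)[0]

def loss(x0, theta, target, T=1.0):
    """Differentiable metric on the generated value (assumed quadratic)."""
    return 0.5 * (generate(x0, theta, T) - target) ** 2

def adjoint_gradients(x0, theta, target, T=1.0):
    """Return (dL/dx0, dL/dtheta) via a backward solve of the adjoint ODE."""
    x_T = generate(x0, theta, T)
    lam_T = x_T - target  # adjoint state at t = T, i.e. dL/dx(T)

    def augmented(y, t):
        x, lam, _ = y
        return [f(x, t, theta),                  # re-integrate the state
                -lam * df_dx(x, t, theta),       # adjoint dynamics
                -lam * df_dtheta(x, t, theta)]   # accumulates dL/dtheta

    _, grad_x0, grad_theta = rk4(augmented, [x_T, lam_T, 0.0], T, 0.0)
    return grad_x0, grad_theta

if __name__ == "__main__":
    g_x0, g_theta = adjoint_gradients(x0=1.5, theta=0.7, target=0.2)
    print("dL/dx0 =", g_x0, " dL/dtheta =", g_theta)
```

In this toy setting, `grad_x0` plays the role of a guidance signal on the initial noise and `grad_theta` the role of a finetuning gradient. Because the backward pass re-integrates the state alongside the adjoint, the forward trajectory does not need to be stored.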
First, download and set up the repo:
git clone https://github.com/HanshuYAN/AdjointDPM.git
cd AdjointDPM
We provide an environment.yml file that can be used to create a Conda environment.
conda env create -f environment.yml
conda activate adjoint
We propose Symplectic Adjoint Guidance (SAG), a training-free guided diffusion process that supports various image and video generation tasks, including style-guided image generation, aesthetic improvement, personalization, and video stylization. We illustrate the framework of training-free guided generation through symplectic adjoint guidance using a stylization example. As Gaussian noise is denoised into an image over multiple steps, we can add guidance (usually defined as gradients of a loss function on the estimate of
To run guided sampling using the symplectic adjoint method, we first need to install the symplectic torchdiffeq:
cd img_guided_sampling
cd symplectic-adjoint-method-beta
python setup.py install
One can use SAG to do stylization in both image and video generation.
One can improve the aesthetic quality of generated outputs by using SAG.
One can also do personalization by using SAG.
One can perform adversarial attacks using our proposed AdjointDPM. We show some adversarial examples against the ImageNet classifier. On the left are the originally generated images with their class names; these images are correctly classified by ResNet50. On the right are the corresponding adversarial images, which successfully mislead the classifier.