Constrained Diffusion with Trust Sampling

Stanford University
Teaser figure.

Trust sampling is a training-free, loss-guided diffusion method that achieves state-of-the-art constraint following in image and human motion tasks.

Abstract

Diffusion models have demonstrated significant promise in various generative tasks; however, they often struggle to satisfy challenging constraints. Our approach addresses this limitation by rethinking training-free loss-guided diffusion from an optimization perspective. We formulate a series of constrained optimizations throughout the inference process of a diffusion model. In each optimization, we allow the sample to take multiple steps along the gradient of the proxy constraint function until we can no longer trust the proxy, according to the variance at each diffusion level. Additionally, we estimate the state manifold of the diffusion model to allow for early termination when the sample starts to wander away from the state manifold at each diffusion step. Trust sampling effectively balances following the unconditional diffusion model against adhering to the loss guidance, enabling more flexible and accurate constrained generation. We demonstrate the efficacy of our method through extensive experiments on complex tasks in two drastically different domains, image generation and 3D motion generation, showing significant improvements over existing methods in terms of generation quality. Our implementation is available at https://github.com/will-s-h/trust-sampling.
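To make the inner optimization described in the abstract more concrete, here is a minimal NumPy sketch of one diffusion level. It is an illustration under stated assumptions, not the released implementation: the function names (`trust_budget`, `trust_guided_steps`), the step-budget rule, and the use of the predicted-noise norm as a stand-in for the state-manifold check are all hypothetical choices made for this example.

```python
import numpy as np

def trust_budget(sigma, max_budget):
    """Hypothetical trust schedule: allow fewer gradient steps at noisier levels.

    The 1 / (1 + sigma^2) rule is an assumption for illustration only.
    """
    return max(1, int(max_budget / (1.0 + sigma ** 2)))

def trust_guided_steps(x, loss_grad, noise_pred, max_steps, step_size, noise_norm_max):
    """Take up to `max_steps` gradient steps on a proxy constraint loss.

    Stops early once the norm of the predicted noise exceeds `noise_norm_max`,
    which this sketch uses as an assumed proxy for leaving the state manifold.
    Returns the updated sample and the number of steps actually taken.
    """
    steps_taken = 0
    for _ in range(max_steps):
        if np.linalg.norm(noise_pred(x)) > noise_norm_max:
            break  # sample is drifting off the (estimated) manifold
        x = x - step_size * loss_grad(x)
        steps_taken += 1
    return x, steps_taken

# Toy setup (all hypothetical): constrain the first coordinate to equal 1.
def toy_loss_grad(x):
    grad = np.zeros_like(x)
    grad[0] = x[0] - 1.0  # gradient of 0.5 * (x[0] - 1)^2
    return grad

def toy_noise_pred(x):
    return x  # stand-in for a learned noise predictor

x0 = np.zeros(2)
budget = trust_budget(sigma=1.0, max_budget=8)
x_new, n_steps = trust_guided_steps(
    x0, toy_loss_grad, toy_noise_pred,
    max_steps=budget, step_size=0.5, noise_norm_max=0.6,
)
```

In this toy run the sample moves toward the constraint but stops before the budget is exhausted, because the manifold check fires first. In a real sampler, `loss_grad` would be obtained by differentiating the constraint loss through the denoiser's clean-sample estimate, and both thresholds would be tuned per task.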

Image Results

Figure: Comparison of Trust Sampling with other loss-guided, constraint-following diffusion methods, including DPS, DSG, and LGD-MC.
Figure: 20 Gaussian deblurring samples across FFHQ and ImageNet.
Figure: 20 inpainting samples across FFHQ and ImageNet.
Figure: 20 super-resolution samples across FFHQ and ImageNet.

Human Motion Results

Figure: Keyframes of human motions constrained by Trust Sampling.

BibTeX

@article{huang2024trust,
  author    = {Huang, William and Jiang, Yifeng and Van Wouwe, Tom and Liu, C Karen},
  title     = {Constrained Diffusion with Trust Sampling},
  journal   = {NeurIPS},
  year      = {2024},
}