Stable Diffusion: Latent Diffusion Models

References:

1. https://zhuanlan.zhihu.com/p/573984443

2. https://zhangzhenhu.github.io/blog/aigc

3. https://zhuanlan.zhihu.com/p/599160988

Diffusion probabilistic models

1. Diffusion probabilistic model

2. Denoising diffusion probabilistic model (DDPM)

3. Score-based interpretation (Score-based DDPM)

4. Three equivalent representations of diffusion models

5. Improved Denoising Diffusion Probabilistic Models (IDDPM)

6. References

Jascha Sohl-Dickstein, Eric A. Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. 2015. arXiv:1503.03585.

Calvin Luo. Understanding diffusion models: a unified perspective. 2022. arXiv:2208.11970.

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. 2020. arXiv:2006.11239.

Diederik P. Kingma, Tim Salimans, Ben Poole, and Jonathan Ho. Variational diffusion models. 2022. arXiv:2107.00630.

Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. 2019. arXiv:1907.05600.

Denoising Diffusion Implicit Models (DDIM)

Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. 2022. arXiv:2010.02502.

Score-based generative models

Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. 2019. arXiv:1907.05600.

Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. 2021. arXiv:2011.13456.

Aapo Hyvärinen. Estimation of non-normalized statistical models by score matching. Journal of Machine Learning Research, 2005.

Yang Song and Stefano Ermon. Improved techniques for training score-based generative models. 2020. arXiv:2006.09011.

Conditional diffusion models

Prafulla Dhariwal and Alex Nichol. Diffusion models beat GANs on image synthesis. 2021. arXiv:2105.05233.

Calvin Luo. Understanding diffusion models: a unified perspective. 2022. arXiv:2208.11970.

Jonathan Ho and Tim Salimans. Classifier-free diffusion guidance. 2022. arXiv:2207.12598.

Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. GLIDE: towards photorealistic image generation and editing with text-guided diffusion models. 2022. arXiv:2112.10741.

Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. Hierarchical text-conditional image generation with CLIP latents. 2022. arXiv:2204.06125.

Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S. Sara Mahdavi, Rapha Gontijo Lopes, Tim Salimans, Jonathan Ho, David J Fleet, and Mohammad Norouzi. Photorealistic text-to-image diffusion models with deep language understanding. 2022. arXiv:2205.11487.

Stable diffusion model

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. 2021. arXiv:2112.10752.

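DDPM already generates images of very high quality, but it has one drawback: the variable it operates on has the same size as the image, with its elements in one-to-one correspondence with the image's pixels. This is why DDPM is called a generative model in pixel space. Because the tensor grows with the image size, generating a high-resolution image means working with a very large tensor, which demands substantial GPU (hardware) resources, both compute and memory. Training is expensive for the same reason. This high cost has severely limited the use of DDPM in consumer-grade settings.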
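To make the size argument concrete, the following minimal sketch counts the elements of a pixel-space tensor at several resolutions. It assumes square RGB images (3 channels) stored as 32-bit floats. The helper names `tensor_elements` and `memory_mib` are hypothetical and introduced only for this illustration. The figures cover a single image tensor, not the activations of the denoising network, which are much larger.

```python
def tensor_elements(height, width, channels=3):
    """Number of elements in one pixel-space image tensor (assumes RGB by default)."""
    return height * width * channels


def memory_mib(num_elements, bytes_per_element=4):
    """Memory in MiB for that many elements (assumes float32, 4 bytes each)."""
    return num_elements * bytes_per_element / (1024 ** 2)


if __name__ == "__main__":
    # Doubling the side length quadruples the tensor size.
    for side in (64, 256, 512, 1024):
        n = tensor_elements(side, side)
        print(f"{side}x{side}x3: {n} elements, {memory_mib(n):.2f} MiB as float32")
```

Since a pixel-space model has to process a tensor of this size at every denoising step, the cost of sampling and training grows quickly with resolution.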

Latent diffusion model

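In 2021, the Computer Vision and Learning research group at Ludwig Maximilian University of Munich (formerly the Computer Vision Group at Heidelberg University), known as CompVis, published the paper High-Resolution Image Synthesis with Latent Diffusion Models [1], which addresses this problem. Its main improvements are: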

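1. Introduce an autoencoder that first compresses the original input into an encoding; the diffusion model is then applied to the encoded vector rather than to the pixels.
2. Handle conditioning variables by adding an attention mechanism to the U-Net.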
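The two improvements above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the paper's implementation. The downsampling factor of 8 and the 4 latent channels are assumed values chosen for the example. The learned query/key/value projections that a real cross-attention layer applies are omitted. The names `latent_shape`, `softmax`, and `cross_attention` are hypothetical.

```python
import math


def latent_shape(height, width, factor=8, latent_channels=4):
    """Shape of the latent from an encoder that downsamples each side by `factor`.

    `factor` and `latent_channels` are assumed values for illustration.
    """
    if height % factor or width % factor:
        raise ValueError("height and width must be divisible by factor")
    return (height // factor, width // factor, latent_channels)


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention without learned projections.

    queries: one vector per U-Net feature position.
    keys, values: one vector per condition token (e.g. encoded text).
    Each output row is a weighted average of the value vectors.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


if __name__ == "__main__":
    h, w, c = latent_shape(512, 512)
    pixel_elems = 512 * 512 * 3
    latent_elems = h * w * c
    print("latent shape:", (h, w, c))
    print("pixel/latent element ratio:", pixel_elems / latent_elems)

    # One feature position attending over two condition tokens.
    print(cross_attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[2.0], [4.0]]))
```

Under these assumptions the diffusion model works on a much smaller tensor than the image itself. The condition enters through the keys and values, so every position in the U-Net feature map can draw on every condition token.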

