Diffusion Models for Medical Image Computing: A Survey

Yaqing Shi; Abudukelimu Abulizi; Hao Wang; Ke Feng; Nihemaiti Abudukelimu; Youli Su; Halidanmu Abudukelimu

doi:10.26599/TST.2024.9010047

Open Access

Yaqing Shi¹, Abudukelimu Abulizi¹, Hao Wang¹, Ke Feng¹, Nihemaiti Abudukelimu², Youli Su¹, Halidanmu Abudukelimu¹

¹ School of Information Management, Xinjiang University of Finance and Economics, Urumqi 830012, China

² Yili Friendship Hospital, Yining 835000, China

Abstract

Diffusion models are a class of generative deep learning models that can process medical images more efficiently than traditional generative models, and they have been applied to several medical image computing tasks. This paper aims to help researchers understand the progress of diffusion models in medical image computing. It first describes the fundamental principles, sampling methods, and architecture of diffusion models. It then discusses their application to five medical image computing tasks: image generation, modality conversion, image segmentation, image denoising, and anomaly detection. In addition, the paper fine-tunes a large model for image generation and runs comparative experiments between diffusion models and traditional generative models across these five tasks. The evaluation of the fine-tuned large model shows its potential for clinical applications. The comparative experiments show that diffusion models have a distinct advantage in image generation, modality conversion, and image denoising, but need further optimization in image segmentation and anomaly detection to match the efficacy of traditional models. Our code is publicly available at https://github.com/hiahub/CodeForDiffusion.

Keywords

diffusion models; generative models; medical image; large model
