In medical X-ray images, multiple abnormalities frequently occur together. However, existing report generation methods cannot efficiently extract all abnormal features, which leads to incomplete disease diagnoses in the generated reports. In real clinical scenarios, there are co-occurrence relations among multiple diseases. If such co-occurrence relations are mined and integrated into the feature extraction process, the problem of missing abnormal features can be alleviated. Motivated by this observation, we propose a novel method that improves the extraction of abnormal image features through joint probability graph reasoning. Specifically, to reveal the co-occurrence relations among diseases, we conduct a statistical analysis of the dataset and encode the disease relationships as a probability graph. We then devise a graph reasoning network that performs correlation-based reasoning over medical image features, enabling the model to capture more abnormal features. Furthermore, we introduce a gating mechanism for cross-modal feature fusion into the text generation model, which substantially improves the model's ability to learn and fuse information from the two modalities, medical images and text. Experimental results on the IU-X-Ray and MIMIC-CXR datasets demonstrate that our approach outperforms previous state-of-the-art methods and generates higher-quality medical image reports.
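The abstract names two concrete components: a disease co-occurrence probability graph estimated from dataset statistics, and a sigmoid-gated cross-modal fusion step. Since the abstract gives no implementation details, the Python sketch below is only a minimal illustration of these two ideas; the function cooccurrence_probabilities, the GatedFusion module, and all tensor shapes are hypothetical and not taken from the paper.

import torch
import torch.nn as nn

def cooccurrence_probabilities(labels: torch.Tensor) -> torch.Tensor:
    """Estimate P(disease j | disease i) from multi-hot report labels.

    labels: (num_reports, num_diseases) binary matrix (1 = disease present).
    Returns a (num_diseases, num_diseases) matrix whose entry (i, j)
    approximates the conditional probability P(j | i).
    """
    counts = labels.T @ labels                    # joint occurrence counts
    per_disease = counts.diagonal().clamp(min=1)  # occurrences of each disease
    return counts / per_disease.unsqueeze(1)      # row i: P(j | i)

class GatedFusion(nn.Module):
    """Sigmoid-gated fusion of visual and textual features (hypothetical sketch)."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, visual: torch.Tensor, textual: torch.Tensor) -> torch.Tensor:
        # The gate decides, per feature dimension, how much to weight each modality.
        g = torch.sigmoid(self.gate(torch.cat([visual, textual], dim=-1)))
        return g * visual + (1 - g) * textual

# Toy usage: 4 reports annotated with 3 diseases, and 8-dim features.
labels = torch.tensor([[1., 1., 0.], [1., 0., 0.], [1., 1., 1.], [0., 0., 1.]])
print(cooccurrence_probabilities(labels))
fusion = GatedFusion(dim=8)
fused = fusion(torch.randn(2, 8), torch.randn(2, 8))  # (batch, dim)

In the actual method, the probability graph would presumably condition a graph reasoning network over image region features rather than being used directly as above; this sketch only shows how such a graph can be estimated and how a gate can mix two modalities.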
Tsinghua Science and Technology 2025, 30(4): 1685-1699
Published: 03 March 2025