Status: Published
Visual-linguistic Diagnostic Semantic Enhancement for medical report generation
Chen, Jiahong (1); Huang, Guoheng (1); Yuan, Xiaochen (2); Zhong, Guo (3); Tan, Zhe (1); Pun, Chi Man (4); Yang, Qi (5)
2025
Source Publication: Journal of Biomedical Informatics
ISSN: 1532-0464
Volume: 161
Abstract: Generative methods are currently popular for medical report generation, as they automatically generate professional reports from input images, assisting physicians in making faster and more accurate decisions. However, current methods face significant challenges: 1) Lesion areas in medical images are often difficult for models to capture accurately, and 2) even when captured, these areas are frequently not described using precise clinical diagnostic terms. To address these problems, we propose a Visual-Linguistic Diagnostic Semantic Enhancement model (VLDSE) to generate high-quality reports. Our approach employs supervised contrastive learning in the Image and Report Semantic Consistency (IRSC) module to bridge the semantic gap between visual and linguistic features. Additionally, we design the Visual Semantic Qualification and Quantification (VSQQ) module and the Post-hoc Semantic Correction (PSC) module to enhance visual semantics and inter-word relationships, respectively. Experiments demonstrate that our model achieves promising performance on the publicly available IU X-RAY and MIMIC-MV datasets. Specifically, on the IU X-RAY dataset, our model achieves a BLEU-4 score of 18.6%, improving the baseline by 12.7%. On the MIMIC-MV dataset, our model improves the BLEU-1 score by 10.7% over the baseline. These results demonstrate the ability of our model to generate accurate and fluent descriptions of lesion areas.
Keywords: Contrastive learning; Medical report generation; Semantic consistency; Semantic enhancement
DOI: 10.1016/j.jbi.2024.104764
Language: English
Scopus ID: 2-s2.0-85214292237
Document Type: Journal article
Collection: DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Affiliation:
1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, China
2. Faculty of Applied Sciences, Macao Polytechnic University, 999078, Macao
3. School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, 510006, China
4. Department of Computer and Information Science, University of Macau, 999078, Macao
5. Department of Nasopharyngeal Carcinoma, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangdong Provincial Clinical Research Center for Cancer, Guangzhou, 510060, China
Recommended Citation
GB/T 7714: Chen, Jiahong, Huang, Guoheng, Yuan, Xiaochen, et al. Visual-linguistic Diagnostic Semantic Enhancement for medical report generation[J]. Journal of Biomedical Informatics, 2025, 161.
APA: Chen, J., Huang, G., Yuan, X., Zhong, G., Tan, Z., Pun, C. M., & Yang, Q. (2025). Visual-linguistic Diagnostic Semantic Enhancement for medical report generation. Journal of Biomedical Informatics, 161.
MLA: Chen, Jiahong, et al. "Visual-linguistic Diagnostic Semantic Enhancement for medical report generation." Journal of Biomedical Informatics 161 (2025).
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.