Abstract

In this paper, we propose a context-aware knowledge graph transformer framework for improving the captioning of chest X-ray images. Normally, a radiologist interprets the chest X-ray or MRI image and writes a detailed summary of the observed findings in a report. To generate a detailed summary of the image automatically, the proposed framework is divided into three steps. The first step captures the visual features of the images using computer vision models such as ResNet-50 and AlexNet. The second step employs a knowledge graph layer that calculates the similarity between tokens based on angle (cosine similarity) and token overlap (Jaccard similarity) to derive a context-aware meaning for each token. The third step uses a transformer-based decoder to generate the detailed caption. The performance of the proposed model is compared against existing baselines, including LSTM, CONV2D, and Bi-LSTM architectures. The proposed model outperforms the baseline models, achieving scores of 63% (BLEU-1), 61% (BLEU-4), 79% (RIBES), 85% (precision), 82% (recall), 82% (SPICE), and 79% (METEOR), demonstrating its effectiveness in medical text summarization.
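
The two token-similarity measures named in the second step can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy inputs, and in particular the weighted combination with `alpha` are assumptions made here for illustration, since the abstract does not state how the angle-based and overlap-based scores are combined.

```python
import math


def cosine_similarity(u, v):
    """Angle-based similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def jaccard_similarity(tokens_a, tokens_b):
    """Overlap-based similarity: |A intersect B| / |A union B| over token sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(set_a & set_b) / len(union)


def combined_similarity(vec_a, vec_b, ctx_a, ctx_b, alpha=0.5):
    """Hypothetical weighted mix of the two scores (alpha is an assumption)."""
    cos = cosine_similarity(vec_a, vec_b)
    jac = jaccard_similarity(ctx_a, ctx_b)
    return alpha * cos + (1.0 - alpha) * jac


if __name__ == "__main__":
    # Toy embeddings and context-token lists, invented for this example.
    score = combined_similarity(
        [0.2, 0.7, 0.1], [0.3, 0.6, 0.2],
        ["pleural", "effusion", "left"], ["pleural", "effusion", "right"],
    )
    print(round(score, 3))
```

In a graph setting, a score like this could serve as the edge weight between two token nodes; whether the framework uses it that way is not stated in the abstract.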

Keywords

Medical Image Captioning, Knowledge Graph, Transformers, Cosine Similarity, Jaccard Similarity
