Abstract

Cardiovascular diseases are the leading cause of death worldwide and are responsible for about one third of all deaths. Detecting heart problems early and accurately, before serious damage occurs, is therefore important. Recordings of heart sounds, known as phonocardiograms (PCGs), offer a non-invasive and inexpensive means of diagnosis. Nevertheless, the inherent non-stationarity, noise, and variability of PCG signals remain a major challenge for conventional deep learning methods. Although convolutional neural networks (CNNs) can effectively model local features of spectrograms, they struggle to model long-term dependencies in spectral feature maps. Transformer-based models, on the other hand, can model temporal relationships but are less able to localize fine-grained clinical patterns. To address these problems, this paper introduces a Hybrid Transformer-CNN architecture with Temporal Attention (HTCTA). The model combines CNNs to capture localized time-frequency features, temporal attention to highlight diagnostically significant cardiac segments (e.g., systole or diastole), and Transformer encoders to aggregate long-range dependencies across the cardiac cycle. Mel-spectrograms computed from heart sound recordings are passed through the hybrid model for classification. The proposed HTCTA model was evaluated on two benchmark datasets, PhysioNet CinC Challenge 2016 and CirCor DigiScope 2022. It achieved a classification accuracy, precision, recall, and F1-score of 94.70%, 94.20%, 95.15%, and 94.67%, respectively, outperforming several state-of-the-art models, including Whisper-based and CRNN architectures. The model is also robust to noise and to variability across auscultation positions. Given the negligible difference between the reference and restored data, and because it is accurate, interpretable, and efficient, HTCTA has potential for real-time clinical diagnosis and medical applications. Future work will incorporate multimodal inputs and perform cross-patient validation to improve generalization.
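
The temporal attention stage described above can be illustrated with a minimal sketch. The abstract does not specify how attention scores are computed, so the dot-product scoring, the function name `temporal_attention`, the `score_vector` parameter, and the toy feature values below are all assumptions made for illustration and are not the authors' implementation. The sketch only shows the general idea: each time frame of CNN features receives a softmax weight, and frames with higher weights contribute more to the sequence passed on to the Transformer encoder.

```python
import math


def temporal_attention(frames, score_vector):
    """Reweight per-frame feature vectors by softmax attention over time.

    frames: list of T feature vectors (each a list of D floats), standing in
            for CNN outputs at successive time steps of a mel-spectrogram.
    score_vector: hypothetical learned vector of D floats used to score frames.
    Returns (weights, weighted_frames).
    """
    # One scalar relevance score per time frame (dot product; an assumption).
    scores = [sum(f * w for f, w in zip(frame, score_vector)) for frame in frames]
    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Scale each frame by its attention weight.
    weighted = [[a * f for f in frame] for a, frame in zip(weights, frames)]
    return weights, weighted


# Toy input: 4 time frames with 3 features each (made-up numbers).
frames = [
    [0.1, 0.2, 0.0],
    [0.9, 0.8, 0.7],
    [0.2, 0.1, 0.1],
    [0.0, 0.1, 0.0],
]
score_vector = [1.0, 1.0, 1.0]  # hypothetical; would be learned in practice

weights, weighted = temporal_attention(frames, score_vector)
```

In this toy input the second frame has the largest feature values and therefore receives the largest weight. In a trained model the scoring parameters would be learned jointly with the CNN and Transformer components.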

Keywords

Cardiovascular Disease (CVD), Heart Sound Classification, Phonocardiogram (PCG), Temporal Attention, Transformer Neural Network, Convolutional Neural Networks (CNNs), Hybrid Deep Learning
