Abstract

The liver is a vital organ that sustains metabolism and supports health throughout the body. Effective treatment and better patient outcomes depend on the early and accurate diagnosis of liver disease. This study applies six Machine Learning (ML) algorithms, namely Random Forest (RF), Categorical Boosting (CB), Adaptive Boosting (AB), Light Gradient Boosting Machine (LGBM), Support Vector Classification (SVC), and Logistic Regression (LR), to Liver Disease (LD) prediction. The classification performance of each model was assessed using accuracy, precision, recall, and F1 score, and cross-validation was carried out to check the reliability and generalizability of each model. Our multi-algorithm strategy improves prediction stability by providing a precise analysis of each algorithm's predictive capabilities and weaknesses in LD prediction. CB produced the best results among the algorithms analyzed, owing to its high accuracy and robustness. Its ability to handle categorical variables effectively, with little need for intensive preprocessing, contributed greatly to its advantage over conventional models.
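As an illustration of the evaluation metrics named in the abstract, the sketch below computes accuracy, precision, recall, and F1 score from binary labels and averages them across cross-validation folds. It is a minimal sketch, not the authors' implementation: the function names `binary_metrics` and `mean_over_folds`, the label convention (1 = liver disease, 0 = no liver disease), and the sample labels are all assumptions made for this example.

```python
from typing import Dict, List, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels.

    Assumed convention (not from the paper): 1 = liver disease, 0 = no disease.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def mean_over_folds(fold_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric over the folds of a cross-validation run."""
    if not fold_scores:
        raise ValueError("fold_scores must contain at least one fold")
    return {name: sum(fold[name] for fold in fold_scores) / len(fold_scores)
            for name in fold_scores[0]}


# Hypothetical labels for one validation fold (illustrative only).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
scores = binary_metrics(y_true, y_pred)
```

In a k-fold setting, `binary_metrics` would be called once per held-out fold and the results passed to `mean_over_folds`, so that each classifier is compared on its average score rather than on a single split.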

Keywords

Liver Disease, Disease Prediction, Machine Learning, Algorithm, Classifying
