Machine Learning Algorithms for Predicting CKD Progression: A Real-World Hospital Dataset Analysis

Authors

  • Edwina Jospiene M
  • Nimithap S
  • Dr Subhadip Bag
  • Dr. S. Elavarasan
  • Dr. Prolay Ghosh
  • Sathiyamoorthy M
  • S.T. Gopukumar

DOI:

https://doi.org/10.65327/kidneys.v15i1.605

Keywords:

Chronic Kidney Disease, Machine Learning, Random Forest, SHAP, Clinical Decision Support.

Abstract

Background. Chronic Kidney Disease (CKD) is a progressive condition associated with substantial global morbidity and mortality. Early detection remains critical for reducing complications and slowing progression to end-stage kidney disease. Traditional diagnostic approaches depend on laboratory markers that may not fully capture nonlinear interactions among clinical parameters. Machine learning offers promising capabilities for improving early identification and supporting clinical decision-making.

Methods. This study developed an end-to-end machine learning framework for CKD prediction using the Early Stage CKD dataset. The workflow included rigorous data preprocessing, exploratory data analysis, and feature engineering prior to model development. A Random Forest classifier was trained using an 80/20 stratified split, and performance was assessed using accuracy, precision, recall, F1-score, confusion matrix, and ROC–AUC. To enhance transparency, SHAP (SHapley Additive exPlanations) analysis was applied to interpret feature contributions and validate clinical relevance.

Results. The Random Forest model demonstrated excellent predictive performance, achieving an accuracy of 96.25% and a ROC–AUC of 1.00. The confusion matrix indicated zero false positives and only three false negatives, reflecting strong diagnostic reliability. SHAP analysis identified hemoglobin, serum creatinine, packed cell volume, and specific gravity as the most influential predictors, aligning with established CKD biomarkers.

Conclusion. The proposed machine learning framework offers a robust, interpretable approach for early CKD prediction. Its strong performance and explainability make it suitable for integration into real-world clinical decision-support systems, particularly in resource-limited healthcare settings.
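
As a rough illustration of the workflow described in the Methods, the following Python sketch trains a Random Forest on an 80/20 stratified split and computes the same evaluation metrics. It is not the authors' code: the data are synthetic (generated with scikit-learn's `make_classification`), and the sample size of 400, the 24 features, the class balance, and the hyperparameters are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a small clinical table (assumption: 400 rows,
# 24 features, roughly 62.5% positive class). This is NOT the study's data.
X, y = make_classification(n_samples=400, n_features=24, n_informative=6,
                           weights=[0.375, 0.625], random_state=0)

# 80/20 stratified split, as described in the Methods.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)

# Random Forest with illustrative hyperparameters (not the authors' settings).
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # probability of the positive class

# For binary labels, ravel() yields the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
}

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With 400 rows, the held-out set contains 80 samples, so three misclassifications would give 77/80 = 96.25% accuracy, which is consistent with the reported figure if the study's dataset has 400 records (an assumption). For the interpretability step, the `shap` package's `TreeExplainer` could be applied to the fitted `model`; it is omitted here to keep the sketch light on dependencies.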

 

 


Author Biographies

Edwina Jospiene M

Research Scholar, Department of Biochemistry, Biochemistry/Biotechnology, Regenix Super Speciality Laboratories Pvt. Ltd., affiliated to University of Madras, Chennai 94, Email ID: edwina18@ymail.com, ORCID: 0000-0002-6523-6087

Nimithap S

Junior Resident, General Medicine, Department of General Medicine, Saveetha Medical College and Hospitals, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Tamil Nadu, Chennai-602105, India, Email ID: nimithap0001@gmail.com, ORCID: 0009-0002-1990-0124

Dr Subhadip Bag

Assistant Professor, Hi-Tech Medical College & Hospital, Rourkela, MBBS, MD Community Medicine, Odisha University of Health Sciences, Rourkela-769004, India, Email ID: dr.subhadip07@gmail.com, ORCID: 0000-0002-8064-429X

Dr. S. Elavarasan

Associate Professor, Department of Community Medicine, Specialization in Research Methodology & Biostatistics, Sri Sairam Homoeopathy Medical College & Research Center, West Tambaram, Chennai-600 044, Email ID: dr.s.elavarasan@gmail.com, ORCID: 0000-0001-7317-4309

Dr. Prolay Ghosh

Assistant Professor, Department of Information Technology, JIS College of Engineering Kalyani, Nadia, West Bengal-741235, India, Email ID: prolay.ghosh@jiscollege.ac.in, ORCID: https://orcid.org/0000-0001-9267-5766

Sathiyamoorthy M

Assistant Professor, Department of Computer Science and Engineering, Computer Science and Engineering; Artificial Intelligence; Machine Learning, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences (SIMATS), Chennai – 602105, Tamil Nadu, India, Email: sathiyamoorthym.sse@saveetha.com, Orcid ID: https://orcid.org/0009-0002-2190-1230

S.T. Gopukumar

Nanobioinformatics Unit, Helix Research Studio, Department of General Surgery, Saveetha Medical College and Hospital, Saveetha Institute of Medical and Technical Sciences (SIMATS), Saveetha University, Chennai – 602 105, Tamil Nadu, India, Email: gopukumars.smc@saveetha.com, ORCID: 0000-0001-8160-2414

 

Published

2026-01-23

How to Cite

Edwina Jospiene M, Nimithap S, Dr Subhadip Bag, Dr. S. Elavarasan, Dr. Prolay Ghosh, Sathiyamoorthy M, & S.T. Gopukumar. (2026). Machine Learning Algorithms for Predicting CKD Progression: A Real-World Hospital Dataset Analysis. KIDNEYS, 15(1), 31–42. https://doi.org/10.65327/kidneys.v15i1.605

Issue

Section

Review