An Ensemble Machine Learning and Data Mining Approach to Enhance Stroke Prediction
Wijaya, Richard and Saeed, Faisal and Samimi, Parnia and Albarrak, Abdullah M. and Qasem, Sultan Noman (2024) An Ensemble Machine Learning and Data Mining Approach to Enhance Stroke Prediction. Bioengineering, 11 (7). p. 672. ISSN 2306-5354
bioengineering-11-00672-v2.pdf - Published Version. Available under License Creative Commons Attribution.
Abstract
Stroke poses a significant health threat, affecting millions annually. Early and precise prediction is crucial to providing effective preventive healthcare interventions. This study applied an ensemble machine learning and data mining approach to enhance the effectiveness of stroke prediction. Following the cross-industry standard process for data mining (CRISP-DM) methodology, various techniques, including random forest, ExtraTrees, XGBoost, artificial neural network (ANN), and genetic algorithm with ANN (GANN), were applied to two benchmark datasets to predict stroke based on several parameters, such as gender, age, various diseases, smoking status, BMI, HighCol, physical activity, hypertension, heart disease, lifestyle, and others. Because the datasets were imbalanced, the Synthetic Minority Oversampling Technique (SMOTE) was applied to them. The models' hyperparameters were tuned via grid search and randomized search cross-validation. The evaluation metrics included accuracy, precision, recall, F1-score, and area under the curve (AUC). The experimental results show that the ensemble ExtraTrees classifier achieved the highest accuracy (98.24%) and AUC (98.24%). Random forest also performed well, achieving 98.03% in both accuracy and AUC. Comparisons with state-of-the-art stroke prediction methods showed that the proposed approach achieves superior performance, indicating its potential as a promising method for stroke prediction and offering substantial benefits to healthcare.
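The abstract names SMOTE and the evaluation metrics without spelling them out. The sketch below is illustrative only and is not the authors' code. It assumes binary labels (1 = stroke, 0 = no stroke) and shows the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours, together with the standard definitions of accuracy, precision, recall, and F1-score. The function names `smote_like_oversample` and `classification_metrics`, the parameter defaults, and the toy data are hypothetical; in practice a library implementation would normally be used instead.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE's core idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (made up for illustration, not taken from the study).
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_like_oversample(minority_points, n_new=5)

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(len(new_points), metrics)
```

Oversampling of this kind is usually applied to the training split only, so that synthetic samples do not leak into the data used for evaluation; whether the study did so is not stated in the abstract.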
Item Type: | Article |
---|---|
Identification Number: | 10.3390/bioengineering11070672 |
Dates: | 20 June 2024 (Accepted); 2 July 2024 (Published Online) |
Uncontrolled Keywords: | stroke, prediction model, machine learning, ensemble learning |
Subjects: | CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science |
Divisions: | Faculty of Computing, Engineering and the Built Environment > College of Computing |
Depositing User: | Gemma Tonks |
Date Deposited: | 12 Aug 2024 13:26 |
Last Modified: | 12 Aug 2024 13:26 |
URI: | https://www.open-access.bcu.ac.uk/id/eprint/15706 |