
🫀 About this demo

Predict heart disease risk from patient data with optimized ML models trained on the Cleveland dataset.

Dataset: Cleveland Heart Disease · Models: Decision Tree, k-NN, Naive Bayes, Random Forest, AdaBoost, Gradient Boosting, XGBoost

⚠️ Educational Use Only

This interactive heart disease prediction demo is provided strictly for educational purposes. It is not intended for clinical use and must not be relied upon for medical advice, diagnosis, treatment, or decision-making. Always consult a qualified healthcare professional.

🫀 How to Use: Enter patient features → Run prediction → View ensemble results!

📋 Notes

Models are trained at launch on data/cleveland.csv with a customizable train/validation split (default 80/20).
Target is binarized automatically (0 = no disease, >0 = disease).
Retraining: adjust the split ratio and click "🔄 Retrain Models" to see how the amount of training data affects performance.
Seven optimized models are compared: Decision Tree, k-NN, Naive Bayes, Random Forest, AdaBoost, Gradient Boosting, and XGBoost.
Hyperparameters are tuned for this heart disease prediction task; the main choices are listed under "Optimization highlights".
Ensemble uses weighted soft voting with optimized weights based on model performance.
The best-performing model on the validation set is highlighted with 🏆 in the validation metrics table.
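
The weighted soft voting mentioned in the notes can be sketched in plain Python. This is a minimal illustration, not the demo's own code: the model names, probabilities, and weights below are hypothetical, and it assumes each model outputs a probability of disease and is weighted by some validation score.

```python
def weighted_soft_vote(probas, weights):
    """Combine per-model P(disease) values into one weighted average."""
    if probas.keys() != weights.keys():
        raise ValueError("every model needs exactly one weight")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(probas[name] * weights[name] for name in probas) / total


# Hypothetical per-model probabilities of disease for one patient.
probas = {"random_forest": 0.80, "knn": 0.60, "naive_bayes": 0.40}
# Hypothetical weights, e.g. each model's validation accuracy.
weights = {"random_forest": 0.90, "knn": 0.80, "naive_bayes": 0.70}

p_disease = weighted_soft_vote(probas, weights)
label = int(p_disease >= 0.5)  # 1 = disease, 0 = no disease
```

Models with higher weights pull the averaged probability toward their own estimate. The 0.5 decision threshold is an assumption of this sketch.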
Optimization highlights:
- Decision Tree: entropy criterion, balanced classes, optimal depth
- k-NN: distance weighting, Manhattan metric, optimized neighbors
- Random Forest: 200 trees, class balancing, feature sampling
- Gradient Boosting: regularization, subsampling, lower learning rate
- AdaBoost: SAMME algorithm, increased estimators
- XGBoost: L1/L2 regularization, optimal depth and learning rate
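
To make one of these choices concrete, the k-NN settings (Manhattan metric, distance weighting) can be sketched without any library. This is a toy illustration with hypothetical two-feature data, not the demo's implementation; a real pipeline would also scale the features first.

```python
from collections import defaultdict


def manhattan(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3):
    """Distance-weighted k-NN: closer neighbors get larger votes."""
    # Sort training points by Manhattan distance to the query and keep k.
    neighbors = sorted(
        (manhattan(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]
    votes = defaultdict(float)
    for dist, label in neighbors:
        if dist == 0:
            return label  # an exact match decides the prediction
        votes[label] += 1.0 / dist  # inverse-distance weight
    return max(votes, key=votes.get)


# Hypothetical toy data: one class-0 point near the query, two class-1 far away.
train_X = [(0, 0), (5, 5), (6, 6)]
train_y = [0, 1, 1]

prediction = knn_predict(train_X, train_y, query=(1, 0), k=3)
```

With uniform weights the two class-1 neighbors would outvote the single class-0 neighbor; with inverse-distance weights the nearby class-0 point wins, which is the effect distance weighting is meant to have.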
Feature descriptions:
- age: Patient age in years
- sex: Patient sex (0=female, 1=male)
- cp: Chest pain type (1-4)
- trestbps: Resting blood pressure (mmHg)
- chol: Serum cholesterol (mg/dl)
- fbs: Fasting blood sugar >120 mg/dl (1=true, 0=false)
- restecg: Resting ECG results (0-2)
- thalach: Maximum heart rate achieved
- exang: Exercise induced angina (1=yes, 0=no)
- oldpeak: ST depression induced by exercise
- slope: Slope of peak exercise ST segment (1-3)
- ca: Number of major vessels colored by fluoroscopy (0-3)
- thal: Thallium stress test result (3=normal, 6=fixed defect, 7=reversible defect)
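
The encodings above, together with the target binarization rule from the notes, can be checked with a small sketch. The helper names (`validate_patient`, `binarize_target`) and the example record are hypothetical, and the requirement that numeric features be non-negative is an assumption of this sketch rather than something the demo states.

```python
# Allowed codes for the categorical features, taken from the list above.
ALLOWED_CODES = {
    "sex": {0, 1},
    "cp": {1, 2, 3, 4},
    "fbs": {0, 1},
    "restecg": {0, 1, 2},
    "exang": {0, 1},
    "slope": {1, 2, 3},
    "ca": {0, 1, 2, 3},
    "thal": {3, 6, 7},
}
NUMERIC_FEATURES = ("age", "trestbps", "chol", "thalach", "oldpeak")


def validate_patient(record):
    """Return a list of problems; an empty list means the record is usable."""
    problems = []
    for name in NUMERIC_FEATURES:
        value = record.get(name)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value < 0:
            problems.append(f"{name}: expected a non-negative number")
    for name, allowed in ALLOWED_CODES.items():
        if record.get(name) not in allowed:
            problems.append(f"{name}: expected one of {sorted(allowed)}")
    return problems


def binarize_target(raw):
    """Map the raw target to 0 (no disease) or 1 (disease, raw > 0)."""
    return 1 if raw > 0 else 0


# Illustrative patient record using the 13 features described above.
patient = {
    "age": 63, "sex": 1, "cp": 1, "trestbps": 145, "chol": 233,
    "fbs": 1, "restecg": 2, "thalach": 150, "exang": 0,
    "oldpeak": 2.3, "slope": 3, "ca": 0, "thal": 6,
}
problems = validate_patient(patient)
labels = [binarize_target(t) for t in [0, 1, 2, 3, 4]]
```

Returning a list of problems instead of raising on the first one lets a form report every invalid field at once.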

Heart Disease Diagnosis Project

AIO2025: Module 03.

