Libraries: scikit-learn, matplotlib, seaborn, XGBoost, TensorFlow, SHAP
Dataset: Cell2Cell Churn Dataset 
 
In this project, we explore the Cell2Cell dataset and build three supervised machine learning models - Decision Tree, XGBoost, and a Neural Network (TensorFlow) - to predict customer churn risk. The workflow includes feature engineering, model tuning, performance evaluation, and SHAP-based interpretability.
- Feature engineering with tenure, billing, call usage, and derived behavioural indicators
- Categorical Optimisation via native support in XGBoost and embedding layers in the neural network
- Model Evaluation using accuracy, precision, recall, and $F_1$ score on both validation and test sets
- Global Interpretability through SHAP to compare feature importance across architectures
- Confusion Matrix Analysis to diagnose prediction trade-offs and guide threshold tuning
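
To make the evaluation metrics listed above concrete, here is a minimal, dependency-free sketch of how accuracy, precision, recall, and $F_1$ follow from the four confusion-matrix counts. The function and class names (`confusion_counts`, `binary_metrics`, `BinaryMetrics`) and the label convention (1 = churn) are assumptions made for illustration; they are not taken from the project code, which may well rely on scikit-learn's metric functions instead.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), assuming binary labels where 1 = churn."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(y_true, y_pred):
    """Compute the four headline metrics from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    # Guard against empty denominators (e.g. no positive predictions).
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the Cell2Cell dataset.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    print(binary_metrics(y_true, y_pred))
```

Running the same function on both the validation and the test predictions gives directly comparable numbers for each model.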
 
- XGBoost emerged as the top-performing model, with validation recall of 75.1% and a stable $F_1$ score of 0.495 on test data
- Neural Network achieved higher validation accuracy but struggled with recall, indicating overfitting to the majority class
- SHAP analysis revealed consistently influential features across models
- Confusion matrix analysis highlighted a recall-focused strategy: most churners were correctly flagged, though false positives remained substantial
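
The recall-focused trade-off described above can be illustrated with a simple threshold sweep: lowering the decision threshold flags more churners but also raises the false-positive count. The sketch below is an assumption about how such tuning might be done, not the project's actual procedure; the helper names, the candidate thresholds, and the toy scores are all hypothetical.

```python
def recall_precision_at(y_true, scores, threshold):
    """Return (recall, precision, false_positives) at a given threshold.

    Assumes binary labels (1 = churn) and scores in [0, 1], where a score
    at or above the threshold is treated as a churn prediction.
    """
    tp = fp = fn = 0
    for truth, score in zip(y_true, scores):
        predicted_churn = score >= threshold
        if predicted_churn and truth == 1:
            tp += 1
        elif predicted_churn and truth == 0:
            fp += 1
        elif not predicted_churn and truth == 1:
            fn += 1
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision, fp


def highest_threshold_meeting_recall(y_true, scores, target_recall, candidates):
    """Pick the largest candidate threshold whose recall reaches the target.

    Scanning from high to low keeps false positives as low as possible
    while still satisfying the recall requirement. Returns None if no
    candidate reaches the target.
    """
    for threshold in sorted(candidates, reverse=True):
        recall, precision, fp = recall_precision_at(y_true, scores, threshold)
        if recall >= target_recall:
            return threshold, recall, precision, fp
    return None


if __name__ == "__main__":
    # Toy scores for illustration only; not model output from this project.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    scores = [0.9, 0.7, 0.45, 0.2, 0.6, 0.4, 0.35, 0.3, 0.1, 0.05]
    for t in (0.7, 0.5, 0.3):
        print(t, recall_precision_at(y_true, scores, t))
    print(highest_threshold_meeting_recall(y_true, scores, 0.75, [0.3, 0.5, 0.7]))
```

In the toy data, reaching 75% recall requires dropping the threshold to 0.3, at which point false positives outnumber true positives, which mirrors the pattern reported above.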
 
- Refine the Neural Network with dropout, class weighting, and early stopping to improve recall
- Extend SHAP analysis
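
As one possible starting point for the class-weighting step, the sketch below computes "balanced" class weights, where each class is weighted by `n_samples / (n_classes * class_count)` so that the minority (churn) class contributes more to the loss. The function name `balanced_class_weights` and the toy label split are assumptions for illustration; the project's actual class ratio is not stated here.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} using the 'balanced' heuristic.

    weight_c = n_samples / (n_classes * count_c), so rarer classes
    receive proportionally larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical 70/30 split between non-churners (0) and churners (1).
    labels = [0] * 7 + [1] * 3
    print(balanced_class_weights(labels))
```

A dictionary of this shape can be passed as the `class_weight` argument of Keras `Model.fit`, alongside `Dropout` layers and an `EarlyStopping` callback; whether this combination actually improves recall on this dataset would need to be checked empirically.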
 

