This project predicts the type of forest cover using cartographic and environmental features from the Covertype dataset (UCI). By applying tree-based multi-class classification models, it demonstrates how data-driven approaches can support ecological research and resource management.
- Explored dataset distribution and target class balance
- Performed data cleaning and preprocessing (including categorical handling)
- Trained Random Forest and XGBoost models
- Evaluated models using Accuracy, Confusion Matrix, and Feature Importance
- Compared Random Forest vs XGBoost
- Performed hyperparameter tuning for improved performance
- Source: UCI Covertype Dataset
- Features: Cartographic and environmental attributes (elevation, slope, soil type, aspect, etc.)
- Target: Forest cover type (7 classes)
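
The class-balance check mentioned in the steps above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code: `class_distribution` and `toy_labels` are hypothetical names, and the label counts are made up rather than taken from the Covertype data.

```python
from collections import Counter


def class_distribution(labels):
    """Return {class: (count, share)} sorted by class label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: (n, n / total) for cls, n in sorted(counts.items())}


# Hypothetical toy labels standing in for the 7 cover-type classes (1..7).
# These are NOT the real Covertype class counts.
toy_labels = [1] * 40 + [2] * 50 + [3] * 4 + [4] * 1 + [5] * 2 + [6] * 2 + [7] * 1

dist = class_distribution(toy_labels)
for cls, (n, share) in dist.items():
    print(f"class {cls}: {n:3d} samples ({share:.1%})")

# Ratio between the largest and smallest class: a quick imbalance indicator.
sizes = [n for n, _ in dist.values()]
imbalance_ratio = max(sizes) / min(sizes)
print(f"largest/smallest class ratio: {imbalance_ratio:.1f}")
```

A large ratio suggests that accuracy alone can hide weak performance on rare classes, which is one reason to report per-class precision and recall as well.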
| Model | Accuracy | Precision | Recall | F1-score | Notes |
|---|---|---|---|---|---|
| Random Forest (untuned) | 0.9551 | 0.96 | 0.96 | 0.95 | Strong baseline |
| Random Forest (tuned) | 0.9550 | 0.96 | 0.96 | 0.95 | No gain after tuning |
| XGBoost (untuned) | 0.9533 | 0.95 | 0.95 | 0.95 | Strong initial performance |
| XGBoost (tuned) | 0.9601 | 0.96 | 0.96 | 0.96 | Best model with ~96% accuracy |
✅ XGBoost (tuned) achieved the best overall performance (accuracy 0.9601, F1-score 0.96), with the most balanced precision and recall across the forest cover types.
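
For readers who want to see how the accuracy, precision, recall, and F1 columns relate to a confusion matrix, here is a small pure-Python sketch. It rests on assumptions: the function names and the 3-class `toy_cm` are hypothetical, and the table does not state which averaging was used, so macro averaging is shown only as one possibility.

```python
def per_class_metrics(cm):
    """Per-class (precision, recall, f1); cm[i][j] = true class i predicted as j."""
    k = len(cm)
    metrics = []
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(k)) - tp  # predicted c, but wrong
        fn = sum(cm[c]) - tp                       # true c, but missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


def accuracy(cm):
    """Share of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def macro_average(metrics):
    """Unweighted mean of precision, recall, and f1 over classes (an assumption)."""
    k = len(metrics)
    return tuple(sum(m[i] for m in metrics) / k for i in range(3))


# Hypothetical 3-class confusion matrix; the real task has 7 classes.
toy_cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 1, 9],
]

metrics = per_class_metrics(toy_cm)
macro_p, macro_r, macro_f1 = macro_average(metrics)
print(f"accuracy={accuracy(toy_cm):.4f} macro P={macro_p:.2f} R={macro_r:.2f} F1={macro_f1:.2f}")
```

With imbalanced classes, macro and support-weighted averages can differ noticeably, so it helps to state which one a results table reports.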
git clone https://github.com/Minahil-Abid/ForestCoverType-Multi-Class-Classification.git
cd ForestCoverType-Multi-Class-Classification
pip install -r requirements.txt
jupyter notebook Forest_CoverType_Classification.ipynb

MIT License – free to use and share.