This repository contains R scripts and data for substrate characterization and biological efficiency (EB%) modeling in Pleurotus pulmonarius.
- `01_pipeline_EB.R`: main pipeline, which runs:
  - Random Forest models (initial vs. consumed substrate composition)
  - Decision Trees (pruned to ~8 leaves, evaluated with 10-fold CV out-of-fold predictions)
  - Bootstrap stability analysis and heatmaps
  - Bland–Altman comparison against new data
- `data/`: example input data (if shared)
- `outputs/`: generated results (PNG, CSV, TXT)
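
For readers unfamiliar with the Bland–Altman step, the sketch below shows the usual calculation: the mean difference between predicted and observed values (the bias) and the 95% limits of agreement at bias ± 1.96 SD. The function name `bland_altman` and the numbers are hypothetical and only illustrate the idea; the pipeline script may compute or plot this differently.

```r
# Hypothetical helper (not taken from 01_pipeline_EB.R): summarises agreement
# between observed and predicted EB% values.
bland_altman <- function(observed, predicted) {
  stopifnot(length(observed) == length(predicted))
  diffs <- predicted - observed   # per-sample difference
  bias  <- mean(diffs)            # systematic bias (mean difference)
  sd_d  <- sd(diffs)              # spread of the differences
  list(
    mean_pair = (observed + predicted) / 2,  # x-axis of a Bland-Altman plot
    diff      = diffs,                       # y-axis of a Bland-Altman plot
    bias      = bias,
    loa_lower = bias - 1.96 * sd_d,          # lower 95% limit of agreement
    loa_upper = bias + 1.96 * sd_d           # upper 95% limit of agreement
  )
}

# Made-up values for illustration only (not from the repository's data)
observed  <- c(50, 60, 70, 80)
predicted <- c(52, 59, 73, 78)
ba <- bland_altman(observed, predicted)
ba$bias
```

A bias close to zero with narrow limits suggests the model predictions agree well with the new measurements.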
Running `01_pipeline_EB.R` generates the following files. In the file names, `INICIAL` means initial composition, `CONSUMO` means consumed composition, `Predicciones` means predictions, and `nuevos_datos` means new data:
- `RF_EB_INICIAL_metrics_split.csv` | `RF_EB_CONSUMO_metrics_split.csv`
- `RF_EB_INICIAL_metrics_cv10.csv` | `RF_EB_CONSUMO_metrics_cv10.csv`
- `RF_EB_INICIAL_importance_bootstrap.csv` | `RF_EB_CONSUMO_importance_bootstrap.csv`
- `RF_EB_INICIAL_heatmap_top2.png` | `RF_EB_CONSUMO_heatmap_top2.png`
- `DT_EB_INICIAL_8leaves.png` | `DT_EB_CONSUMO_8leaves.png`
- `DT_EB_INICIAL_rules.txt` | `DT_EB_CONSUMO_rules.txt`
- `Predicciones_EB_nuevos_datos.csv`
- `BA_RF_inicial.png` | `BA_RF_consumo.png` | `BA_DT_inicial.png` | `BA_DT_consumo.png`
- `BlandAltman_summary_nuevos_datos.csv`
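
One way to confirm that a run completed is to check that every expected file exists. The sketch below assumes the files are written to `outputs/` relative to the repo root; that location is an assumption, so adjust `out_dir` if the script writes elsewhere.

```r
# Assumption: results are written to outputs/ under the repo root.
out_dir <- "outputs"
sets <- c("INICIAL", "CONSUMO")

# Build the expected file names from the output list in this README.
expected <- c(
  as.vector(outer(
    paste0("RF_EB_", sets, "_"),
    c("metrics_split.csv", "metrics_cv10.csv",
      "importance_bootstrap.csv", "heatmap_top2.png"),
    paste0
  )),
  paste0("DT_EB_", sets, "_8leaves.png"),
  paste0("DT_EB_", sets, "_rules.txt"),
  "Predicciones_EB_nuevos_datos.csv",
  paste0("BA_", c("RF_inicial", "RF_consumo", "DT_inicial", "DT_consumo"), ".png"),
  "BlandAltman_summary_nuevos_datos.csv"
)

# Report any files that were not produced.
missing_files <- expected[!file.exists(file.path(out_dir, expected))]
if (length(missing_files) == 0) {
  message("All ", length(expected), " expected outputs found.")
} else {
  message("Missing outputs:\n", paste(" -", missing_files, collapse = "\n"))
}
```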
## Requirements
- R >= 4.2
- Packages: `ranger`, `caret`, `dplyr`, `ggplot2`, `viridisLite`, `readr`, `tidyr`, `rpart`, `rpart.plot`, `scales`
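
The following sketch installs whichever of the required packages are missing. It assumes installation from CRAN via the `https://cloud.r-project.org` mirror and uses illustrative variable names (`required_pkgs`, `missing_pkgs`); it is a convenience snippet, not part of the pipeline script.

```r
# Warn if the running R version is older than the stated requirement.
if (getRversion() < "4.2.0") {
  warning("R >= 4.2 is required; found ", as.character(getRversion()))
}

# Packages listed in this README.
required_pkgs <- c("ranger", "caret", "dplyr", "ggplot2", "viridisLite",
                   "readr", "tidyr", "rpart", "rpart.plot", "scales")

# Keep only the packages that are not yet installed.
installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!installed]

# Assumption: packages are installed from this CRAN mirror.
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}
```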
## How to run
1) Clone the repo, install the dependencies listed above, and run the full pipeline:
```r
# from the repo root in R/RStudio
source("scripts/main.R")
source("01_pipeline_EB.R")