Description
Hi,
I've noticed that I can get inconsistent results on identical runs of PLN on the same data, even when setting a seed in R. There appear to be two different, but possibly related problems:
- The penalty grid and bootstrapped vhat matrix can differ between runs of the same code on the same CPU architecture.
- On other CPU architectures, the results are consistent across runs, but they differ from one architecture to another.
I suspect that the bug is happening somewhere inside the C++ code since I am setting a random seed in R and using a single thread.
I ran the following code to reproduce the example vignettes almost exactly:
library(PLNmodels)
set.seed(1)
RhpcBLASctl::omp_set_num_threads(1)
RhpcBLASctl::blas_set_num_threads(1)
data(trichoptera)
trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)
network_models <- PLNnetwork(Abundance ~ 1 + offset(log(Offset)), data = trichoptera)
myPLN <- PLN(
  Abundance ~ 1,
  trichoptera,
  control = PLNmodels::PLN_param(
    config_post = list(
      jackknife = FALSE,
      bootstrap = 30,
      variational_var = FALSE,
      sandwich_var = FALSE,
      rsquared = TRUE
    )
  )
)
vhat <- coef(myPLN, type = "main")
vhat <- methods::as(vhat, "dgCMatrix")
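One way to record results such as `vhat` for later comparison across machines is sketched below. This is a minimal base-R sketch, not part of the code I ran: `write_fingerprint()` and `same_fingerprint()` are hypothetical helper names, and the toy vectors stand in for real model output. It relies on `sprintf("%a", ...)`, which prints a double in hexadecimal, so the text file is an exact record of the stored values rather than a rounded decimal.

```r
# Hypothetical helpers for comparing a numeric result across machines.
# "%a" prints each double in hexadecimal, so the text is an exact
# representation of the stored value (no decimal rounding).
write_fingerprint <- function(x, path) {
  writeLines(sprintf("%a", as.numeric(x)), path)
}

same_fingerprint <- function(path_a, path_b) {
  identical(readLines(path_a), readLines(path_b))
}

# Toy demonstration: two vectors whose second elements are different
# doubles, although they look alike at R's default print precision.
a <- c(3.627, 0.1 + 0.2)
b <- c(3.627, 0.3)
file_a <- tempfile(fileext = ".txt")
file_b <- tempfile(fileext = ".txt")
write_fingerprint(a, file_a)
write_fingerprint(b, file_b)

print(format(a) == format(b))            # comparison at default print precision
print(same_fingerprint(file_a, file_b))  # exact comparison of the stored doubles
```

On each device, something like `write_fingerprint(as.matrix(vhat), "vhat_device1.txt")` (file name chosen here only for illustration) would give files that can be diffed directly, without relying on printed, rounded values.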
I ran the code five times on each of three devices:
1. An AMD Opteron 6380 (about a decade old)
2. An AMD EPYC 947VF (very new)
3. An Intel Xeon Silver 4316
On device 1, 4 of the 5 runs gave a maximum penalty of 3.627 in network_models$criteria and 1 run gave 3.681. Neither value matches the 3.643 shown in the vignette. On device 2, all 5 runs gave a maximum penalty of 3.639, which also does not match the vignette. On device 3, all 5 runs match the vignette. Comparing the vhat matrices with all.equal() showed the same pattern.
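The run-to-run comparison described above can be sketched as a small base-R helper. This is an illustration under stated assumptions, not the exact procedure used for the numbers in this report: `compare_runs()`, `fit_fun`, and `toy_fit()` are hypothetical names, and the toy function stands in for a call such as `PLNnetwork(...)` followed by extracting the quantity of interest. Each run is checked against the first one both with `identical()` (exact equality) and with `all.equal()` (equality up to its default numeric tolerance).

```r
# Hypothetical helper: run `fit_fun` several times with the same seed and
# compare every result against the first run.
compare_runs <- function(fit_fun, n_runs = 5, seed = 1) {
  results <- lapply(seq_len(n_runs), function(i) {
    set.seed(seed)  # reset the R-level RNG before each run
    fit_fun()
  })
  data.frame(
    run = seq_len(n_runs),
    # exact equality with run 1
    identical = vapply(results, function(x) identical(x, results[[1]]), logical(1)),
    # equality up to all.equal()'s default numeric tolerance
    all_equal = vapply(results, function(x) isTRUE(all.equal(x, results[[1]])), logical(1))
  )
}

# Toy stand-in for a model fit; it depends only on R's RNG, so every run
# should reproduce run 1 exactly.
toy_fit <- function() cumsum(rnorm(10))
report <- compare_runs(toy_fit)
print(report)
```

For the case in this report, `fit_fun` could wrap the `PLNnetwork()` call and return the maximum penalty from `network_models$criteria`. A run that fails `identical()` but passes `all.equal()` would suggest differences at the level of floating-point rounding, while a run that fails both (such as 3.627 versus 3.681) indicates a larger discrepancy.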
The consequence is that the best-fit models are not the same because the penalty grids are not the same.
Here is my session info:
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
##
## Matrix products: default
## BLAS/LAPACK: FlexiBLAS OPENBLASOPENMP; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
## [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
## [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/Los_Angeles
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] lubridate_1.9.4 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4
## [5] purrr_1.0.4 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
## [9] ggplot2_3.5.2 tidyverse_2.0.0 PLNmodels_1.2.2
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 xfun_0.51 bslib_0.9.0
## [4] corrplot_0.95 processx_3.8.6 lattice_0.22-7
## [7] callr_3.7.6 tzdb_0.5.0 vctrs_0.6.5
## [10] tools_4.4.1 ps_1.9.0 generics_0.1.3
## [13] parallel_4.4.1 pkgconfig_2.0.3 torch_0.14.2
## [16] Matrix_1.7-3 RColorBrewer_1.1-3 lifecycle_1.0.4
## [19] compiler_4.4.1 farver_2.1.2 codetools_0.2-20
## [22] argparse_2.2.4 htmltools_0.5.8.1 sass_0.4.10
## [25] yaml_2.3.10 crayon_1.5.3 pillar_1.10.1
## [28] nloptr_2.2.1 jquerylib_0.1.4 MASS_7.3-65
## [31] cachem_1.1.0 glassoFast_1.0.1 parallelly_1.43.0
## [34] tidyselect_1.2.1 digest_0.6.37 stringi_1.8.7
## [37] future_1.40.0 listenv_0.9.1 fastmap_1.2.0
## [40] grid_4.4.1 archive_1.1.12 cli_3.6.4
## [43] magrittr_2.0.3 utf8_1.2.5 dichromat_2.0-0.1
## [46] future.apply_1.11.3 withr_3.0.2 scales_1.4.0
## [49] bit64_4.6.0-1 timechange_0.3.0 rmarkdown_2.29
## [52] globals_0.17.0 igraph_2.1.4 bit_4.5.0.1
## [55] gridExtra_2.3 findpython_1.0.9 hms_1.1.3
## [58] evaluate_1.0.3 knitr_1.49 pscl_1.5.9
## [61] rlang_1.1.6 Rcpp_1.0.14 glue_1.8.0
## [64] coro_1.1.0 vroom_1.6.5 jsonlite_1.9.1
## [67] R6_2.6.1