Edits round 1 by tgerke · Pull Request #12 · stopsack/risks

tgerke · 2023-05-28T18:12:59Z

Generally making my way through the code base. I'll open the PR and do this iteratively, so that you can review changes per file as we go.

stopsack · 2023-06-02T12:08:51Z

Changed base branch to dev, which is now up to speed with master again.

stopsack

Hi @tgerke - thanks a lot for the initial rounds of edits! See comments inline.

stopsack · 2023-06-02T12:10:48Z

 library(risks)  # provides riskratio(), riskdiff(), postestimation functions
 library(dplyr)  # For data handling
 library(broom)  # For tidy() model summaries
-data(breastcancer)


Should we say somewhere what package breastcancer comes from?

stopsack · 2023-06-02T12:12:16Z

 ```{r allmodels2}
-tidy(fit_all) %>%
-  select(-statistic, -p.value) %>%
+tidy(fit_all) |> 


I am not sure about introducing a dependency on R 4.1. I still see scientists use 4.0 and earlier.
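
For context, a minimal sketch of code that avoids any pipe and therefore also runs on R 4.0 and earlier, where the native pipe `|>` is a syntax error. The `fit_summary` data frame is a made-up stand-in for the output of `tidy(fit_all)`, not actual package output.

```r
# Hypothetical stand-in for tidy(fit_all); the values are invented for illustration.
fit_summary <- data.frame(
  term      = c("(Intercept)", "exposure"),
  estimate  = c(-1.2, 0.4),
  statistic = c(-6.0, 2.1),
  p.value   = c(0.001, 0.04)
)

# Drop two columns without `|>`, `%>%`, or dplyr::select().
drop_cols <- c("statistic", "p.value")
fit_small <- fit_summary[, !(names(fit_summary) %in% drop_cols), drop = FALSE]
```

Keeping the magrittr pipe `%>%` that the vignette used before would be another way to avoid the R 4.1 requirement.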

stopsack · 2023-06-02T12:15:59Z

+  data <- data |>
+    dplyr::mutate(.clusterid = dplyr::row_number()) |>
+    dplyr::rename(outc = dplyr::one_of(!!yvar)) |>
+    tidyr::uncount(outc + 1) |>


I see the elegance in this approach! However, in the spirit of my comment in #10, it would be nice not to rely on tidyverse functions down the road. Let us focus on testing and bug fixes for now.
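
A rough sketch of how the same row duplication might look in base R only. The toy data and the column name `death` are assumptions for illustration; this is not the code in the PR, and it assumes the outcome is already coded 0/1.

```r
# Toy data; `death` plays the role of the 0/1 outcome named by `yvar`.
data <- data.frame(death = c(0, 1, 1), stage = c("I", "II", "III"))
yvar <- "death"

# Row identifier, analogous to dplyr::mutate(.clusterid = dplyr::row_number()).
data$.clusterid <- seq_len(nrow(data))

# Rename the outcome column to `outc`, analogous to dplyr::rename().
names(data)[names(data) == yvar] <- "outc"

# Repeat each row outc + 1 times, analogous to tidyr::uncount(outc + 1).
expanded <- data[rep(seq_len(nrow(data)), times = data$outc + 1), , drop = FALSE]
rownames(expanded) <- NULL
```

Rows with `outc == 1` appear twice and rows with `outc == 0` once, while `.clusterid` still identifies the original row.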

stopsack · 2023-06-02T12:16:14Z

-                             dplyr::mutate(outc = 0) %>%
-                             dplyr::rename(!!yvar := "outc"))
+
+  data <- data |>


See the earlier comment about the base R pipe.

stopsack · 2023-06-02T12:19:04Z

  yvar <- as.character(all.vars(formula)[1])

+  #TODO the following assumes that outc is coded as 0/1. We should throw an
+  # error when this is not true


100%! How about here:

risks/R/estimate_risk.R

Lines 216 to 217 in f48f0a0

...) {
  implausible <- 0.99999
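
One possible shape for such a check, as a sketch only. The helper name `check_binary_outcome()` is hypothetical and not part of the package; treating logical outcomes as acceptable and ignoring missing values are assumptions as well.

```r
# Hypothetical helper (not part of the risks package): stop early unless the
# outcome column contains only 0/1 (or TRUE/FALSE); missing values are ignored.
check_binary_outcome <- function(data, yvar) {
  y <- data[[yvar]]
  if (is.null(y)) {
    stop("Outcome variable '", yvar, "' not found in data.")
  }
  ok <- is.logical(y) || (is.numeric(y) && all(y[!is.na(y)] %in% c(0, 1)))
  if (!ok) {
    stop("Outcome variable '", yvar, "' must be coded as 0/1 or TRUE/FALSE.")
  }
  invisible(TRUE)
}
```

Calling it right after `yvar` is extracted from the formula would surface a miscoded outcome before any model is fit.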

stopsack · 2023-06-02T12:30:21Z

+          ...
+        )
+      #TODO seeing a pattern: let's make returning a non-converged object
+      # an internal function that can be used in all cases like this


Right - such a function just exists under different names because it returns slightly different object types depending on what model was supposed to be fit.

risks/R/zestimate_risk-utils.R

Lines 68 to 117 in f48f0a0

# Exception handlers
possibly_estimate_poisson <- ext_possibly(
  .f = estimate_poisson,
  otherwise = return_failure(family = list(family = "poisson"),
                             classname = "robpoisson"))

possibly_estimate_duplicate <- ext_possibly(
  .f = estimate_duplicate,
  otherwise = return_failure(family = list(family = "binomial"),
                             classname = "duplicate"))

possibly_estimate_glm <- ext_possibly(
  .f = estimate_glm,
  otherwise = return_failure(family = list(family = "binomial"),
                             classname = NULL))

possibly_estimate_glm_startp <- ext_possibly(
  .f = estimate_glm,
  otherwise = return_failure(family = list(family = "binomial"),
                             classname = "glm_startp"))

possibly_estimate_glm_startd <- ext_possibly(
  .f = estimate_glm,
  otherwise = return_failure(family = list(family = "binomial"),
                             classname = "glm_startd"))

possibly_estimate_logbin <- ext_possibly(
  .f = estimate_logbin,
  otherwise = return_failure(family = list(family = "binomial", link = "log"),
                             classname = "logbin"))

possibly_estimate_addreg <- ext_possibly(
  .f = estimate_addreg,
  otherwise = return_failure(family = list(family = "binomial", link = "identity"),
                             classname = "addreg"))

possibly_estimate_logistic <- ext_possibly(
  .f = estimate_logistic,
  otherwise = return_failure(family = list(family = "binomial", link = "logit"),
                             classname = "logistic"))

possibly_estimate_margstd_boot <- ext_possibly(
  .f = estimate_margstd_boot,
  otherwise = return_failure(family = list(family = "binomial", link = "logit"),
                             classname = "margstd_boot"))

possibly_estimate_margstd_delta <- ext_possibly(
  .f = estimate_margstd_delta,
  otherwise = return_failure(family = list(family = "binomial", link = "logit"),
                             classname = "margstd_delta"))
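
To illustrate the general pattern behind these wrappers, here is a self-contained sketch built on `tryCatch()`. The functions `possibly()` and `failure_object()` are hypothetical stand-ins; the package's `ext_possibly()` and `return_failure()` may well behave differently.

```r
# Hypothetical stand-ins; ext_possibly() and return_failure() in the package
# may behave differently.
failure_object <- function(family, classname = NULL) {
  structure(
    list(family = family, converged = FALSE),
    class = c(classname, "failed_fit")
  )
}

# Wrap a fitting function so that an error yields `otherwise` instead.
possibly <- function(.f, otherwise) {
  function(...) tryCatch(.f(...), error = function(e) otherwise)
}

# Example: a fitting function that always fails.
always_fails <- function(...) stop("did not converge")
safe_fit <- possibly(
  .f = always_fails,
  otherwise = failure_object(family = list(family = "binomial", link = "log"),
                             classname = "logbin")
)
result <- safe_fit()
```

Here `result` is a placeholder object carrying the class of the model that was supposed to be fit, rather than an error that would stop the caller.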

stopsack · 2023-06-02T12:31:22Z

+    # TODO the below else is a great example of why they are best avoided
+    # in favor of explicit "if" conditions. I needed to jump way up to find
+    # that this is the "else" that happens when estimand is not equal to "rr".
+    # What else can it be? Let's just spell it clearly here instead of else.


Good point. Either "rr" or "rd".
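
A small sketch of what spelling out both cases could look like. The function name `link_for_estimand()` is hypothetical; the mapping of "rr" to the log link and "rd" to the identity link follows the links used elsewhere in this thread.

```r
# Hypothetical helper: one explicit branch per estimand, and an error for
# anything else, instead of a catch-all `else`.
link_for_estimand <- function(estimand) {
  if (estimand == "rr") {
    return("log")
  }
  if (estimand == "rd") {
    return("identity")
  }
  stop("estimand must be 'rr' or 'rd', not '", estimand, "'.")
}
```

With this shape, an unexpected estimand fails loudly rather than silently taking the "rd" path.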

stopsack · 2023-06-02T12:32:36Z

+    )
+  }
+
+  #TODO I think duplicate is only valid for RRs? If so we should add a check


Happening here already:

risks/R/estimate_risk.R

Lines 182 to 195 in f48f0a0

riskdiff <- function(
  formula,
  data,
  approach = c(
    "auto",
    "all",
    "robpoisson",
    "glm",
    "glm_startp",
    "glm_cem",
    "glm_cem_startp",
    "margstd_boot",
    "margstd_delta",
    "legacy"),

and

risks/R/estimate_risk.R

Lines 223 to 234 in f48f0a0

if(link == "log")
  possible_approaches <- as.character(as.list(
    args(risks::riskratio))$approach)[-1]
else
  possible_approaches <- as.character(as.list(
    args(risks::riskdiff))$approach)[-1]
if(!(approach[1] %in% possible_approaches))
  stop(paste0(
    "Approach '", approach[1], "' is not implemented. ",
    "Available are: ",
    paste(possible_approaches, sep = ", ", collapse = ", "),
    "."))

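The check relies on reading the allowed values from a function's default argument. A sketch of why the `[-1]` is needed, using a hypothetical function `demo_fit()` in place of `riskratio()` or `riskdiff()`:

```r
# Hypothetical function with a vector default, mirroring the `approach` argument.
demo_fit <- function(formula, data, approach = c("auto", "all", "glm")) NULL

# The unevaluated default is the call c("auto", "all", "glm").
default_call <- as.list(args(demo_fit))$approach

# as.character() on a call returns the function name first, then the arguments.
as_text <- as.character(default_call)

# Dropping the first element ("c") leaves the allowed values.
possible_approaches <- as_text[-1]
```

Because "duplicate" is absent from the defaults of `riskdiff()` quoted in this comment, the same lookup would make the `stop()` call fire for a risk difference model.
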
stopsack · 2023-06-02T12:34:01Z

+  #TODO I think duplicate is only valid for RRs? If so we should add a check
+  if(approach[1] == "duplicate") {
+    #TODO I think this is supposed to be assigning to the object "fit" or did I
+    # lose something in the refactor?


risks/R/estimate_risk.R

Lines 236 to 237 in f48f0a0

fit <- switch(
  EXPR = approach[1],

stopsack · 2023-06-02T12:35:19Z

+  if(approach[1] == "glm_cem") {
+#TODO why not just require logbin and addreg as imports? They seem pretty
+# fundamental to key functions in this package. Then these error messages could
+# be removed


tgerke added 5 commits May 28, 2023 09:24

rm .DS_Store (da7b11a)

light edits and linting (a4986fd)

wrangle duplicate cases in a single call (8176a5f)

lint, minor edits (92dea8a)

big refactor; move most switch and if/else to explicit if blocks (720df64)

stopsack changed the base branch from master to dev June 2, 2023 12:07

stopsack requested changes Jun 2, 2023


fixes stopsack#17

02656d3


tgerke wants to merge 6 commits into stopsack:dev from tgerke:t-edits


