Skip to content

Edits round 1#12

Open
tgerke wants to merge 6 commits into
stopsack:devfrom
tgerke:t-edits
Open

Edits round 1#12
tgerke wants to merge 6 commits into
stopsack:devfrom
tgerke:t-edits

Conversation

@tgerke
Copy link
Copy Markdown
Contributor

@tgerke tgerke commented May 28, 2023

Generally making way through the code base. I'll open the PR and do this iteratively, so that you can review changes per file as we go.

@stopsack stopsack changed the base branch from master to dev June 2, 2023 12:07
@stopsack
Copy link
Copy Markdown
Owner

stopsack commented Jun 2, 2023

Changed base branch to dev, which is now up to speed with master again.

Copy link
Copy Markdown
Owner

@stopsack stopsack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @tgerke - thanks a lot for the initial rounds of edits! See comments inline.

Comment thread vignettes/margstd.Rmd
library(risks) # provides riskratio(), riskdiff(), postestimation functions
library(dplyr) # For data handling
library(broom) # For tidy() model summaries
data(breastcancer)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we say somewhere what package breastcancer comes from?

Comment thread vignettes/models.Rmd
```{r allmodels2}
tidy(fit_all) %>%
select(-statistic, -p.value) %>%
tidy(fit_all) |>
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about introducing a dependency on R 4.1. I still see scientists use 4.0 and earlier.

Comment thread R/caseduplicate.R
data <- data |>
dplyr::mutate(.clusterid = dplyr::row_number()) |>
dplyr::rename(outc = dplyr::one_of(!!yvar)) |>
tidyr::uncount(outc + 1) |>
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The see the elegance in this approach! However, in the spirit of my comment in #10, it would be nice not to rely on tidyverse functions down the road. Let us focus on testing and bug fixes for now.

Comment thread R/caseduplicate.R
dplyr::mutate(outc = 0) %>%
dplyr::rename(!!yvar := "outc"))

data <- data |>
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see earlier comment about base R pipe

Comment thread R/caseduplicate.R
yvar <- as.character(all.vars(formula)[1])

#TODO the following assumes that outc is coded as 0/1. We should throw an
# error when this is not true
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100%! How about here:

risks/R/estimate_risk.R

Lines 216 to 217 in f48f0a0

...) {
implausible <- 0.99999

Comment thread R/estimate_risk.R
...
)
#TODO seeing a pattern: let's make returning a non-converged object
# an internal function that can be used in all cases like this
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - such a function just exists under different names because it returns slightly different object types depending on what model was supposed to be fit.

# Exception handlers
possibly_estimate_poisson <- ext_possibly(
.f = estimate_poisson,
otherwise = return_failure(family = list(family = "poisson"),
classname = "robpoisson"))
possibly_estimate_duplicate <- ext_possibly(
.f = estimate_duplicate,
otherwise = return_failure(family = list(family = "binomial"),
classname = "duplicate"))
possibly_estimate_glm <- ext_possibly(
.f = estimate_glm,
otherwise = return_failure(family = list(family = "binomial"),
classname = NULL))
possibly_estimate_glm_startp <- ext_possibly(
.f = estimate_glm,
otherwise = return_failure(family = list(family = "binomial"),
classname = "glm_startp"))
possibly_estimate_glm_startd <- ext_possibly(
.f = estimate_glm,
otherwise = return_failure(family = list(family = "binomial"),
classname = "glm_startd"))
possibly_estimate_logbin <- ext_possibly(
.f = estimate_logbin,
otherwise = return_failure(family = list(family = "binomial", link = "log"),
classname = "logbin"))
possibly_estimate_addreg <- ext_possibly(
.f = estimate_addreg,
otherwise = return_failure(family = list(family = "binomial", link = "identity"),
classname = "addreg"))
possibly_estimate_logistic <- ext_possibly(
.f = estimate_logistic,
otherwise = return_failure(family = list(family = "binomial", link = "logit"),
classname = "logistic"))
possibly_estimate_margstd_boot <- ext_possibly(
.f = estimate_margstd_boot,
otherwise = return_failure(family = list(family = "binomial", link = "logit"),
classname = "margstd_boot"))
possibly_estimate_margstd_delta <- ext_possibly(
.f = estimate_margstd_delta,
otherwise = return_failure(family = list(family = "binomial", link = "logit"),
classname = "margstd_delta"))

Comment thread R/estimate_risk.R
# TODO the below else is a great example of why they are best avoided
# in favor of explicit "if" conditions. I needed to jump way up to find
# that this is the "else" that happens when estimand is not equal to "rr".
# What else can it be? Let's just spell it clearly here instead of else.
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Either "rr" or "rd".

Comment thread R/estimate_risk.R
)
}

#TODO I think duplicate is only valid for RRs? If so we should add a check
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happening here already:

risks/R/estimate_risk.R

Lines 182 to 195 in f48f0a0

riskdiff <- function(
formula,
data,
approach = c(
"auto",
"all",
"robpoisson",
"glm",
"glm_startp",
"glm_cem",
"glm_cem_startp",
"margstd_boot",
"margstd_delta",
"legacy"),

and

risks/R/estimate_risk.R

Lines 223 to 234 in f48f0a0

if(link == "log")
possible_approaches <- as.character(as.list(
args(risks::riskratio))$approach)[-1]
else
possible_approaches <- as.character(as.list(
args(risks::riskdiff))$approach)[-1]
if(!(approach[1] %in% possible_approaches))
stop(paste0(
"Approach '", approach[1], "' is not implemented. ",
"Available are: ",
paste(possible_approaches, sep = ", ", collapse = ", "),
"."))

Comment thread R/estimate_risk.R
#TODO I think duplicate is only valid for RRs? If so we should add a check
if(approach[1] == "duplicate") {
#TODO I think this is supposed to be assigning to the object "fit" or did I
# lose something in the refactor?
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

risks/R/estimate_risk.R

Lines 236 to 237 in f48f0a0

fit <- switch(
EXPR = approach[1],

Comment thread R/estimate_risk.R
if(approach[1] == "glm_cem") {
#TODO why not just require logbin and addreg as imports? They seem pretty
# fundamental to key functions in this package. Then these error messages could
# be removed
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants