Error in Overall Weights Calculation and Logistic Regression Error in "level" + "eventstudy" #11

Description

@nina-straub

Hi there,

First of all, many thanks for this exciting extension of the DiD framework to continuous treatment, and for providing the contdid package!

I am currently trying to implement cont_did in a staggered adoption setup with a continuous treatment where realised doses for treated units range roughly from 1 to 94. In the process, I encountered two issues that might be worth flagging.


Issue 1: Error in Overall Weights Calculation

Error message:

Error in overall_weights(att_gt, ...) : something's going wrong calculating overall weights

Occurs in:

  • target_parameter = "slope" with aggregation = "dose"
  • target_parameter = "level" with aggregation = "dose"

All other parameters were left at their default values.

Note:
This may be related to numerical instability (e.g., floating point precision) in the sum(out_weight) != 1 condition. The issue has already been discussed and illustrated with an example in this previous report: #6
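
To illustrate the suspected mechanism, here is a minimal base-R sketch of how an exact comparison against 1 can fail on weights that sum to 1 only up to rounding error. The weight vector and the helper `sums_to_one` are hypothetical; this does not reproduce the package's internal code, only the kind of check named in the error.

```r
# Binary floating point cannot represent most decimal fractions exactly
decimal_mismatch <- (0.1 + 0.2) != 0.3        # TRUE

# Illustrative weights whose sum is off by a tiny amount,
# standing in for accumulated rounding error (assumed values)
w <- c(0.5, 0.5 + 1e-12)
exact_check_fails <- sum(w) != 1              # TRUE

# Hypothetical tolerance-based alternative to the exact comparison
sums_to_one <- function(w, tol = sqrt(.Machine$double.eps)) {
  abs(sum(w) - 1) < tol
}
tolerant_check_passes <- sums_to_one(w)       # TRUE
```

A tolerance-based comparison of this kind (or `isTRUE(all.equal(sum(w), 1))`) would accept weights that differ from 1 only by rounding noise.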


Issue 2: Logistic Regression Error in "level" + "eventstudy"

Error message:

Error in eval(family$initialize) : y values must be 0 <= y <= 1

Occurs in:

  • target_parameter = "level"
  • aggregation = "eventstudy"

Note:
This configuration is the only one that calls ptetools::did_attgt internally, which made me think the error is related to DRDID::drdid_panel trying to estimate the propensity score with a binary GLM on a treatment dose that lies outside the [0, 1] interval.
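
The error message itself can be reproduced in base R, independently of the package. The sketch below only shows that a binomial GLM rejects a response with values above 1; the variable names are made up for illustration, and it does not confirm that the dose is what actually reaches the GLM inside DRDID::drdid_panel.

```r
set.seed(1)
n <- 200
x <- rnorm(n)

# A genuinely binary response fits without error
treated <- rbinom(n, size = 1, prob = 0.5)
fit_ok <- glm(treated ~ x, family = binomial())

# A dose-like response with values above 1 (hypothetical stand-in)
dose <- runif(n, 0, 1.1)
dose[1] <- 1.05  # guarantee at least one value above 1

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    glm(dose ~ x, family = binomial())
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The captured message matches the one reported above, which is consistent with (but does not prove) the hypothesis that a non-binary variable is passed as the response.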

Minimal Reproducible Example:
Indeed, I was able to reproduce the error by modifying the simulate_contdid_data function so that the treatment dose can take values above 1, and then running the package example with this modified setup:

# Change D <- runif(n, 0, 1) to D <- runif(n, 0, 1.1)
sim_contdid_dat_changed <- function(
    n = 5000,
    num_time_periods = 4,
    num_groups = num_time_periods,
    pg = rep(1 / num_groups, num_groups - 1),
    pu = 1 / (num_groups),
    dose_linear_effect = 0,
    dose_quadratic_effect = 0) {
  if (!requireNamespace("tidyr", quietly = TRUE)) {
    stop("Package 'tidyr' is required for this function but is not installed.
         Please install it with install.packages('tidyr').", call. = FALSE)
  }
  time_periods <- 1:num_time_periods
  groups <- c(0, time_periods[-1])
  p <- c(pu, pg)
  G <- sample(groups, n, replace = TRUE, prob = p)
  D <- runif(n, 0, 1.1)
  eta <- rnorm(n, mean = G)
  time_effects <- 1:num_time_periods
  Y0t <- sapply(1:num_time_periods, function(tp) {
    time_effects[tp] + eta + rnorm(n)
  })
  Y1t <- sapply(1:num_time_periods, function(tp) {
    dose_linear_effect * D + dose_quadratic_effect * D^2 + time_effects[tp] + eta + rnorm(n)
  })
  post_mat <- sapply(1:num_time_periods, function(tp) {
    1 * ((G <= tp) & G != 0)
  })
  Y <- post_mat * Y1t + (1 - post_mat) * Y0t
  df <- as.data.frame(Y)
  colnames(df) <- paste0("Y_", 1:num_time_periods)
  df$id <- 1:n
  df$G <- G
  df$D <- D
  df2 <- tidyr::pivot_longer(df,
                             cols = tidyr::starts_with("Y"),
                             names_to = "time_period",
                             names_prefix = "Y_",
                             names_transform = list(time_period = as.numeric),
                             values_to = "Y"
  ) |> as.data.frame()
  df2$D[df2$G == 0] <- 0
  df2
}

set.seed(1234)
# Simulate data (same setup as in example)
df <- sim_contdid_dat_changed(
  n = 5000,
  num_time_periods = 4,
  num_groups = 4,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)

# Try to estimate level - eventstudy setting (same as in example)
cd_res <- cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df,
  gname = "G",
  target_parameter = "level",
  aggregation = "eventstudy",
  treatment_type = "continuous",
  control_group = "notyettreated",
  biters = 100,
  cband = TRUE,
  num_knots = 1,
  degree = 3,
)

This leads to the error above.
However, rescaling the treatment back into [0,1] resolves the issue:

# Rescale D into [0, 1) (dplyr is needed for %>% and mutate)
library(dplyr)
df <- df %>% mutate(D_bin = D / (1 + D))

# Re-run cont_did with level + eventstudy
cd_res <- cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D_bin",
  data = df,
  gname = "G",
  target_parameter = "level",
  aggregation = "eventstudy",
  treatment_type = "continuous",
  control_group = "notyettreated",
  biters = 100,
  cband = TRUE,
  num_knots = 1,
  degree = 3,
)

After this transformation, the function runs without errors. This suggests that in the "level" + "eventstudy" setup only doses between 0 and 1 are supported.
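
For anyone hitting the same error, a quick pre-check and a linear rescaling might look like the sketch below. Both helper names are hypothetical and the dose values are illustrative only. Dividing by the maximum dose is swapped in here instead of D / (1 + D) because it keeps the dose scale linear; whether any rescaling is statistically appropriate for the estimator is not something this example establishes.

```r
# Hypothetical helper: TRUE if any dose lies outside [0, 1]
dose_outside_unit_interval <- function(d) {
  any(d < 0 | d > 1, na.rm = TRUE)
}

# Hypothetical helper: linear rescaling by the largest observed dose,
# which keeps untreated units (dose 0) at exactly 0
rescale_dose <- function(d) {
  d / max(d, na.rm = TRUE)
}

dose <- c(0, 0, 1, 12.5, 47, 94)          # illustrative values only
needs_rescaling <- dose_outside_unit_interval(dose)
dose_scaled <- rescale_dose(dose)
range(dose_scaled)
```

With this rescaling, results along the dose axis would be expressed as a fraction of the maximum observed dose and need to be multiplied back by that maximum for interpretation.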

I apologise if this is a known limitation or if I've missed something in the documentation - in that case, a small note on the expected scale of the dose variable would be very helpful for future users.


Minor Code Detail


In any case, I hope this is of some use, and thanks again for all the work that has gone into this package!
