A research project comparing the impact of misspecified delay distributions on epidemic forecasting using EpiNow2. This repository contains simulation studies and real-world case studies examining how incorrect assumptions about generation intervals, incubation periods, and reporting delays affect real-time reproduction number (Rt) estimates and case forecasts.
This project investigates how misspecification of epidemiological delay distributions affects forecast accuracy across different epidemic scenarios. The analysis uses the EpiNow2 R package to:
- Simulate epidemic data with known delay distributions
- Estimate infections and Rt under various delay misspecification scenarios
- Compare forecast performance across different diseases and epidemic trajectories
- Evaluate the impact of prior weighting on delay estimation
delayscompare/
├── R/ # Core functions
│ ├── funcs_data.R # Data processing functions
│ ├── funcs_plots.R # Plotting functions
│ ├── funcs_rtraj.R # R trajectory simulation functions
│ ├── generate_scores_func.R # Scoring metrics
│ └── scenario_loop.R # Main scenario simulation wrapper
├── scripts/ # Analysis scripts
│ ├── 01_packages.R # Package dependencies
│ ├── *_simulatedata.R # Data simulation scripts
│ ├── *_scenariorun.R # Scenario execution scripts
│ └── 07_resultsprocessing_*.R # Results processing
├── data/ # Input data files
├── results/ # Analysis outputs
├── scenario_loop.R # Main scenario functions
├── plots_baseline.R # Baseline plotting script
└── plot_runtimes.R # Runtime analysis
The analysis includes three epidemic case studies:
-
COVID-19 (England, 2021 Delta wave)
-
Ebola (Sierra Leone, 2014)
-
Cholera (Yemen, 2016)
Each case study examines delay misspecification scenarios:
- No delay: All delays set to zero (Fixed(0))
- Very low: 0.25× correct delay mean
- Low: 0.8× correct delay mean
- Correct: 1× correct delay mean (ground truth)
- High: 1.25× correct delay mean
- Very high: 2× correct delay mean
Additional scenarios tested:
- Different Rt projection methods (
rt_opts: "latest" vs "project") - Under-reporting (observation scale: 0.3 vs 1.0)
- Prior weight specifications for delay distributions
- EpiNow2
- incidence2
- readxl
- dplyr, tidyr, purrr
- here
- ggplot2
- scoringutils
- viridis, RColorBrewer, cowplot
Install packages with:
pkgs <- c("EpiNow2", "incidence2", "readxl", "dplyr", "tidyr",
"purrr", "here", "ggplot2", "scoringutils", "viridis",
"RColorBrewer", "cowplot")
install.packages(setdiff(pkgs, rownames(installed.packages())),
repos = c("https://epiforecasts.r-universe.dev",
getOption("repos")))Basic scenario execution:
source("scripts/01_packages.R")
source("R/scenario_loop.R")
# Run COVID-19 scenario (example)
res <- sim_scenarios(
case_data = covid_data,
var = 1, # Generation time scenario
inc = 4, # Incubation period scenario (1-6)
gen_mean = 3.6,
gen_sd = 3.1,
gen_max = 15,
inc_mean = 5.2,
inc_sd = 1.52,
inc_max = 21,
rep_mean = 4.4,
rep_sd = 5.6,
rep_max = 18,
freq_fc = 4, # Forecast frequency (every 4 weeks)
weeks_inc = 12, # Use 12 weeks of data
rt_opts_choice = "latest",
obs_scale = 1
)For uncertainty in delay distributions:
res <- sim_weightprior(
case_data = data,
var = 1,
gen_mean_mean = 3.6,
gen_mean_sd = 0.5,
gen_sd_mean = 3.1,
gen_sd_sd = 0.3,
gen_max = 30,
# ... other parameters
weight_prior = TRUE
)Scripts accept command line arguments for parallel execution:
Rscript scripts/casestudy_covid_scenariorun.R 1Results are saved as RDS files in the results/ directory:
*_samples*.rds: Posterior samples for case predictions*_R*.rds: Posterior samples for Rt estimates*_id*.rds: Scenario identifiers*_summary*.rds: Summary statistics*_warnings*.rds: Model warnings
Plots are generated showing:
- Impact of delay misspecification on CRPS for Rt and case forecasts
- Timeseries of best- and worst-performing forecasts
- Runtime comparisons
The analysis uses 8 timepoints per case study to balance computational efficiency with temporal coverage. Timepoints are spaced every 4 weeks throughout the epidemic period.
MCMC settings:
- Samples: 3000
- Adapt delta: 0.99
- Max treedepth: 20
- Forecast horizon: 14 days
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.