Test unconfoundedness with optional transport weighting (auto/KS/energy)

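Compares the marginal treatment effect ω estimated in an RCT-like dataset with the same estimand estimated in an observational dataset. A difference Δ that is distinguishable from zero is evidence against unconfoundedness in the observational data. Supports: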

Estimators: IPW or AIPW (doubly robust).
Inference: bootstrap CI for Δ = ω_obs − ω_rct; Wald-style z test of Δ = 0.
Transport weighting:
- "none": no reweighting;
- "rct_to_obs": always reweight the RCT sample to the OBS covariate distribution using density-ratio weights from a logistic model;
- "auto": run shift tests (KS/energy) and reweight only if shift is detected.
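
The "auto" option can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the function's actual code: the helper name transport_weights is hypothetical, the shift test is a per-covariate KS test compared directly against auto_alpha (the real rule for combining tests may differ), and the weights are the odds from a logistic model of dataset membership, rescaled by the sample-size ratio and normalized to mean 1.

```r
# Sketch only: names and the decision rule are assumptions.
set.seed(1)
rct <- data.frame(X1 = rnorm(400))
obs <- data.frame(X1 = rnorm(600, mean = 0.5))  # shifted covariate

# Step 1: two-sample KS test per covariate; flag shift if any p-value < auto_alpha.
auto_alpha <- 0.01
ks_p <- vapply(names(rct),
               function(v) ks.test(rct[[v]], obs[[v]])$p.value,
               numeric(1))
shift_detected <- any(ks_p < auto_alpha)

# Step 2: logistic density-ratio weights that move the RCT toward OBS.
transport_weights <- function(rct, obs) {
  stacked <- rbind(cbind(rct, S = 0), cbind(obs, S = 1))  # S = 1 marks OBS rows
  fit <- glm(S ~ ., data = stacked, family = binomial())
  p_obs <- predict(fit, newdata = rct, type = "response")
  # p(x | OBS) / p(x | RCT) = odds(S = 1 | x) * n_rct / n_obs
  w <- (p_obs / (1 - p_obs)) * (nrow(rct) / nrow(obs))
  w / mean(w)  # normalize to mean 1
}

w <- if (shift_detected) transport_weights(rct, obs) else rep(1, nrow(rct))
```

With these weights, weighted.mean(rct$X1, w) moves from the RCT mean toward the OBS mean, which is the intended effect of transporting the RCT estimate.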

Usage

unconfoundedness_test(
  data_rct,
  data_obs,
  formula,
  estimator = c("aipw", "ipw"),
  stabilize = TRUE,
  trim = c(0.01, 0.99),
  family_y = c("gaussian", "binomial"),
  transport = c("none", "rct_to_obs", "auto"),
  auto_method = c("both", "ks", "energy"),
  auto_alpha = 0.01,
  auto_energy_R = 199L,
  B = 1000L,
  alpha = 0.05,
  seed = NULL
)

Arguments

data_rct, data_obs: data frames with the same outcome (Y), treatment (A), and covariate definitions.
formula: model formula of the form Y ~ A + X1 + X2. A must be binary and must be the first term on the right-hand side.
estimator: "aipw" (default) or "ipw".
stabilize: logical; stabilized IPW (default TRUE).
trim: length-2 numeric with values in [0, 1] for weight trimming (e.g., c(0.01, 0.99)); NULL disables trimming.
family_y: "gaussian" or "binomial" (for AIPW outcome regression and checks).
transport: "none", "rct_to_obs", or "auto" (default "none").
auto_method: which tests to use under transport="auto": "both" (default), "ks", or "energy".
auto_alpha: significance level for shift tests (default 0.01).
auto_energy_R: number of permutations for the energy test (default 199). Ignored if the energy package is not installed.
B: bootstrap replicates for Δ CI (default 1000).
alpha: significance level for the CI; the interval has nominal coverage 1 − alpha (default 0.05 for a 95% CI).
seed: RNG seed or NULL.
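
To make the estimator, stabilize, trim, and family_y arguments concrete, here is a rough sketch of an IPW/AIPW estimate of the marginal effect. It is illustrative only: ate_sketch is a hypothetical helper, the covariates X1 and X2 are hard-coded, trim is assumed to clip the estimated propensity scores (the function may instead trim the weights themselves), and "stabilized" is taken to mean normalized (Hajek) weights.

```r
# Hypothetical helper; not the package's implementation.
ate_sketch <- function(data, estimator = "aipw", stabilize = TRUE,
                       trim = c(0.01, 0.99), family_y = gaussian()) {
  # Propensity score e(X) = P(A = 1 | X) from a logistic model
  ps <- fitted(glm(A ~ X1 + X2, data = data, family = binomial()))
  # Assumption: trimming clips propensities to [trim[1], trim[2]]
  if (!is.null(trim)) ps <- pmin(pmax(ps, trim[1]), trim[2])
  A <- data$A
  Y <- data$Y
  if (estimator == "ipw") {
    w1 <- A / ps
    w0 <- (1 - A) / (1 - ps)
    if (stabilize) {
      # Normalized weights: each arm's weights sum to 1
      return(sum(w1 * Y) / sum(w1) - sum(w0 * Y) / sum(w0))
    }
    return(mean(w1 * Y) - mean(w0 * Y))
  }
  # AIPW: outcome-model prediction plus inverse-probability-weighted residual
  fit_y <- glm(Y ~ A + X1 + X2, data = data, family = family_y)
  m1 <- predict(fit_y, newdata = transform(data, A = 1), type = "response")
  m0 <- predict(fit_y, newdata = transform(data, A = 0), type = "response")
  mean(m1 - m0 + A * (Y - m1) / ps - (1 - A) * (Y - m0) / (1 - ps))
}

# Simulated data with confounding through X1, X2 and a true effect of 1
set.seed(1)
n  <- 5000
X1 <- rnorm(n); X2 <- rbinom(n, 1, 0.4)
A  <- rbinom(n, 1, plogis(-0.2 + 0.8 * X1 + 0.6 * X2))
Y  <- 0.5 + 0.5 * X1 + 0.3 * X2 + 1.0 * A + rnorm(n)
d  <- data.frame(Y, A, X1, X2)
est_aipw <- ate_sketch(d, "aipw")
est_ipw  <- ate_sketch(d, "ipw")
```

Both estimates should land near the simulated effect of 1. AIPW remains consistent if either the propensity model or the outcome model is correctly specified, which is why it is described as doubly robust.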

Value

list with estimates, CI, p-values, settings, and diagnostics (incl. auto decision).
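
The CI and p-value in the returned list could be produced along these lines. This is a sketch under assumptions rather than the documented procedure: delta_inference and its est argument are hypothetical, the two datasets are resampled independently, the CI is a percentile interval, and the z statistic uses the bootstrap standard deviation as its standard error.

```r
# Hypothetical sketch; `est` is any function mapping a data frame to a scalar effect.
delta_inference <- function(d_rct, d_obs, est, B = 200L, alpha = 0.05, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  delta_hat <- est(d_obs) - est(d_rct)  # Δ = ω_obs − ω_rct
  boot <- replicate(B, {
    # Resample rows with replacement, separately within each dataset
    r <- d_rct[sample.int(nrow(d_rct), replace = TRUE), , drop = FALSE]
    o <- d_obs[sample.int(nrow(d_obs), replace = TRUE), , drop = FALSE]
    est(o) - est(r)
  })
  z <- delta_hat / sd(boot)  # Wald-style statistic with bootstrap SE
  list(delta   = delta_hat,
       ci      = unname(quantile(boot, c(alpha / 2, 1 - alpha / 2))),
       z       = z,
       p_value = 2 * pnorm(-abs(z)))
}

# Toy usage with a difference-in-means estimator on two randomized samples
diff_means <- function(d) mean(d$Y[d$A == 1]) - mean(d$Y[d$A == 0])
mk <- function(n) {
  A <- rbinom(n, 1, 0.5)
  data.frame(A = A, Y = 1.0 * A + rnorm(n))
}
set.seed(2)
res <- delta_inference(mk(500), mk(500), diff_means, B = 200L, seed = 42)
```

A small p_value (or a CI that excludes 0) would suggest the two estimates disagree. In this toy usage both samples are randomized with the same effect, so no systematic difference is built in.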

Examples

if (FALSE) { # \dontrun{
set.seed(1)
gen_rct <- function(n){
  X1 <- rnorm(n); X2 <- rbinom(n,1,0.4)
  A  <- rbinom(n,1,0.5)
  Y0 <- 0.5 + 0.5*X1 + 0.3*X2 + rnorm(n)
  Y1 <- Y0 + 1.0
  data.frame(Y=ifelse(A==1,Y1,Y0),A,X1,X2)
}
gen_obs <- function(n){
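  X1 <- rnorm(n, 0.4); X2 <- rbinom(n,1,0.7)  # covariate shift relative to the RCT
  A  <- rbinom(n,1,plogis(-0.2 + 0.8*X1 + 0.6*X2))  # treatment depends on X1, X2 only
  U  <- rnorm(n)  # unmeasured noise: affects Y but not A, so it does not confound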
  Y0 <- 0.5 + 0.5*X1 + 0.3*X2 + 0.3*U + rnorm(n)
  Y1 <- Y0 + 1.0
  data.frame(Y=ifelse(A==1,Y1,Y0),A,X1,X2)
}
d_rct <- gen_rct(800); d_obs <- gen_obs(2000)
out <- unconfoundedness_test(d_rct, d_obs, Y ~ A + X1 + X2,
  estimator="aipw", family_y="gaussian",
  transport="auto", auto_method="both", auto_alpha=0.01,
  B=500, seed=42)
print(out); out$diagnostics$auto
} # }