Horvitz-Thompson estimator of treatment effects

horvitz_thompson(formula, data, condition_prs, blocks, clusters,
condition_pr_mat = NULL, declaration = NULL, subset,
se_type = c("youngs", "constant"), collapsed = FALSE, alpha = 0.05,
condition1 = NULL, condition2 = NULL)

Arguments

formula             An object of class "formula", such as Y ~ Z.

data                A data.frame.

condition_prs       An optional bare (unquoted) name of the variable with the condition 2 (treatment) probabilities.

blocks              An optional bare (unquoted) name of the block variable. Use for blocked designs only.

clusters            An optional bare (unquoted) name of the variable that corresponds to the clusters in the data; used for cluster randomized designs. For blocked designs, clusters must be within blocks.

condition_pr_mat    An optional 2n * 2n matrix of marginal and joint probabilities of all units in condition1 and condition2. Can be used in place of condition_prs. See details.

declaration         An object of class "ra_declaration", from the randomizr package, that is an alternative way of specifying the design. Cannot be used along with any of condition_prs, blocks, clusters, or condition_pr_mat. See details.

subset              An optional bare (unquoted) expression specifying a subset of observations to be used.

se_type             Can be one of c("youngs", "constant"). Corresponds to estimating the standard errors using Young's inequality (default, conservative) or the constant effects assumption.

collapsed           A boolean used to collapse clusters to their cluster totals for variance estimation, FALSE by default.

alpha               The significance level, 0.05 by default.

condition1, condition2    Names of the conditions to be compared. Effects are estimated with condition1 as control and condition2 as treatment. If unspecified, condition1 is the "first" condition and condition2 is the "second" according to R defaults.
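The condition_pr_mat argument is described as a 2n * 2n matrix of marginal and joint probabilities. The base R sketch below shows one way such a matrix can be built from a set of equally likely assignment vectors. The toy design, the variable names, and the quadrant ordering (condition1 rows first, then condition2 rows) are assumptions made for illustration; the package helpers used in the Examples section remain the supported way to construct this matrix.

```r
# Hypothetical toy design: 4 units, 3 equally likely assignment vectors
# (one per column; 1 = condition2/treatment, 0 = condition1/control).
perms <- cbind(
  c(1, 1, 0, 0),
  c(0, 1, 1, 0),
  c(1, 0, 0, 1)
)
n <- nrow(perms)

# Stack control indicators on top of treatment indicators (2n rows), then
# average the outer products over assignments. Entry [i, j] is the share of
# assignments in which both indicator i and indicator j equal 1.
# Assumed layout: rows/columns 1..n are control, n+1..2n are treatment.
stacked <- rbind(1 - perms, perms)
pr_mat <- tcrossprod(stacked) / ncol(perms)

dim(pr_mat)                     # 2n x 2n
diag(pr_mat)[(n + 1):(2 * n)]   # marginal treatment probabilities
rowMeans(perms)                 # the same values, computed directly
```

The diagonal holds the marginal probabilities and the off-diagonal entries hold the joint probabilities. A unit can never be in both conditions at once, so the entries pairing a unit's control row with its own treatment column are zero.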

Details

This function implements the Horvitz-Thompson estimator for treatment effects.
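As a rough illustration of the idea, the textbook Horvitz-Thompson point estimate weights each observed outcome by the inverse of the probability of the condition the unit ended up in, then takes the difference of the weighted totals divided by the number of units. The helper below, ht_point_estimate, is a hypothetical name written for this sketch and is not part of the package; it covers the point estimate only and says nothing about how the package computes standard errors.

```r
# Hand-rolled Horvitz-Thompson point estimate (illustrative sketch only).
# y: observed outcomes
# z: 0/1 indicator (1 = condition2/treatment, 0 = condition1/control)
# p: each unit's probability of being in condition2
ht_point_estimate <- function(y, z, p) {
  n <- length(y)
  # Inverse-probability-weighted totals for each condition
  treated_total <- sum(y[z == 1] / p[z == 1])
  control_total <- sum(y[z == 0] / (1 - p[z == 0]))
  (treated_total - control_total) / n
}

# Hypothetical toy data: 4 units, each treated with probability 0.5
y <- c(2, 4, 6, 8)
z <- c(1, 0, 1, 0)
p <- rep(0.5, 4)

ht_point_estimate(y, z, p)
#> [1] -2
```

With equal probabilities and equally sized groups, as in this toy data, the result coincides with the simple difference in means (4 - 6 = -2).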

Examples


# Set seed
set.seed(42)

# Simulate data
n <- 10
dat <- data.frame(y = rnorm(n))

#----------
# Simple random assignment
#----------
dat$p <- 0.5
dat$z <- rbinom(n, size = 1, prob = dat$p)

# If you only pass condition_prs, we assume simple random sampling
horvitz_thompson(y ~ z, data = dat, condition_prs = p)
#>   coefficient_name        est       se         p  ci_lower ci_upper df
#> 1                z -0.2532128 0.609167 0.6885769 -1.657954 1.151529  8

# Assume constant effects instead
horvitz_thompson(y ~ z, data = dat, condition_prs = p, se_type = "constant")
#>   coefficient_name        est        se         p  ci_lower ci_upper df
#> 1                z -0.2532128 0.6038814 0.6860232 -1.645766  1.13934  8

# Also can use randomizr to pass a declaration
srs_declaration <- randomizr::declare_ra(N = nrow(dat), prob = 0.5, simple = TRUE)
horvitz_thompson(y ~ z, data = dat, declaration = srs_declaration)
#>   coefficient_name        est       se         p  ci_lower ci_upper df
#> 1                z -0.2532128 0.609167 0.6885769 -1.657954 1.151529  8

#----------
# Complete random assignment
#----------

dat$z <- sample(rep(0:1, each = n/2))
# Can use a declaration
crs_declaration <- randomizr::declare_ra(N = nrow(dat), prob = 0.5, simple = FALSE)
horvitz_thompson(y ~ z, data = dat, declaration = crs_declaration)
#>   coefficient_name       est        se         p  ci_lower ci_upper df
#> 1                z -0.247794 0.5729701 0.6768192 -1.569065 1.073477  8

# Can precompute condition_pr_mat and pass it
# (faster for multiple runs with same condition probability matrix)
crs_pr_mat <- declaration_to_condition_pr_mat(crs_declaration)
horvitz_thompson(y ~ z, data = dat, condition_pr_mat = crs_pr_mat)
#>   coefficient_name       est        se         p  ci_lower ci_upper df
#> 1                z -0.247794 0.5729701 0.6768192 -1.569065 1.073477  8

#----------
# More complicated assignment
#----------

# arbitrary permutation matrix
possible_treats <- cbind(
c(1, 1, 0, 1, 0, 0, 0, 1, 1, 0),
c(0, 1, 1, 0, 1, 1, 0, 1, 0, 1),
c(1, 0, 1, 1, 1, 1, 1, 0, 0, 0)
)
arb_pr_mat <- permutations_to_condition_pr_mat(possible_treats)
# Simulating a column to be realized treatment
dat$z <- possible_treats[, sample(ncol(possible_treats), size = 1)]
horvitz_thompson(y ~ z, data = dat, condition_pr_mat = arb_pr_mat)
#>   coefficient_name       est       se         p  ci_lower ci_upper df
#> 1                z -1.368389 1.033353 0.2220105 -3.751306 1.014527  8

# Clustered treatment, complete random assignment
# Simulating data
dat$cl <- rep(1:4, times = c(2, 2, 3, 3))
clust_crs_decl <- randomizr::declare_ra(N = nrow(dat), clust_var = dat$cl, prob = 0.5)
dat$z <- randomizr::conduct_ra(clust_crs_decl)
# Regular SE using Young's inequality
horvitz_thompson(y ~ z, data = dat, declaration = clust_crs_decl)
#>   coefficient_name        est        se         p   ci_lower  ci_upper df
#> 1                z 0.02766919 0.2671404 0.9200557 -0.5883577 0.6436961  8

# SE using collapsed cluster totals and Young's inequality
horvitz_thompson(y ~ z, data = dat, declaration = clust_crs_decl, collapsed = TRUE)
#>   coefficient_name        est        se         p   ci_lower  ci_upper df
#> 1                z 0.02766919 0.2311512 0.9076709 -0.5053664 0.5607048  8