Compute the synthetic exposure offset for each area
Source: R/mrm_primitives_synthetic_exposure.R
mrm_synthetic_area_exposure.Rd
Fit logistic \(P(\text{trait} \mid \text{covariates})\) on the survey microdata. Apply fitted coefficients to area-level marginals from area_df. Multiply predicted rate by area population to obtain the synthetic "population at risk" exposure offset.
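The three steps in the description can be sketched in base R. This is only an illustration using stats::glm() on made-up data, not the package's own implementation; the data frames `survey` and `area` and their column names are hypothetical.

```r
# Minimal sketch of the three steps using stats::glm(); not the package's code.
set.seed(1)

# Toy survey microdata and area-level marginals (hypothetical data).
survey <- data.frame(x1 = rnorm(400), x2 = rnorm(400))
survey$trait <- rbinom(400, 1, plogis(-1 + 0.5 * survey$x1 - 0.3 * survey$x2))
area <- data.frame(
  x1  = c(-0.5, 0, 0.5),
  x2  = c(0.2, 0, -0.2),
  pop = c(1000, 1200, 900)
)

# Step 1: fit logistic P(trait | covariates) on the survey microdata.
fit <- glm(trait ~ x1 + x2, data = survey, family = binomial())

# Step 2: apply the fitted coefficients to the area-level covariate values.
rate <- predict(fit, newdata = area, type = "response")

# Step 3: scale the predicted rate by area population to get the exposure.
exposure <- rate * area$pop
```

Each entry of `exposure` is a predicted rate in (0, 1) multiplied by that area's population, so it is always smaller than the population itself.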
Usage
mrm_synthetic_area_exposure(
survey_df,
survey_trait_col,
survey_covariate_cols,
area_df,
area_population_col,
fit_callable = NULL,
return_per_area_rate = FALSE
)
Arguments
- survey_df
A data.frame of survey microdata (one row per respondent), carrying survey_trait_col (0/1 or logical) and survey_covariate_cols.
- survey_trait_col
Character. Name of the binary trait column.
- survey_covariate_cols
Character vector of covariates that are present in BOTH the survey and the area dataset.
- area_df
A data.frame with one row per area (tract, precinct, etc.); must carry the same covariate columns as area-level proportions / means, plus area_population_col.
- area_population_col
Character. Adult-population column in area_df.
- fit_callable
Optional function with signature function(X, y) -> coef, returning a coefficient vector of length length(survey_covariate_cols) + 1L (intercept first). Defaults to a base-R Newton-IRLS logistic fit.
- return_per_area_rate
Logical; default FALSE. If TRUE the result list also carries predicted_rate.
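To illustrate the shape a custom fit_callable might take, here is a minimal Newton-IRLS logistic fit in base R. It is a sketch, not the package's default fitter. The name `irls_logistic` and the arguments `max_iter` and `tol` are hypothetical, and the sketch assumes that X arrives as a covariate matrix without an intercept column; the documentation above does not say whether that is the case, so check the package source first.

```r
# Hypothetical fitter with the documented function(X, y) -> coef shape.
# ASSUMPTION: X holds covariates only (no intercept column), so a column
# of ones is prepended here and the intercept comes first in the result.
irls_logistic <- function(X, y, max_iter = 25L, tol = 1e-8) {
  X <- cbind(`(Intercept)` = 1, as.matrix(X))
  y <- as.numeric(y)                 # accepts 0/1 or logical
  beta <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    mu <- plogis(drop(X %*% beta))   # fitted probabilities
    w  <- mu * (1 - mu)              # IRLS weights
    # Newton step: solve (X' W X) delta = X' (y - mu)
    delta <- solve(crossprod(X, X * w), crossprod(X, y - mu))
    beta  <- beta + drop(delta)
    if (max(abs(delta)) < tol) break
  }
  stats::setNames(beta, colnames(X))
}

# Compare against stats::glm() on simulated data.
set.seed(3)
X <- cbind(x1 = rnorm(300), x2 = rnorm(300))
y <- rbinom(300, 1, plogis(-1 + 0.8 * X[, "x1"]))
coef_irls <- irls_logistic(X, y)
coef_glm  <- coef(glm(y ~ X, family = binomial()))
```

If the assumption about X holds, such a function could be supplied as `fit_callable = irls_logistic`. The returned vector has length `ncol(X) + 1`, matching the documented `length(survey_covariate_cols) + 1L`.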
Value
A named list with classes morie_mrm_result, morie_rich_result, list. Carries exposure (named numeric vector, one entry per area row), predicted_rate (when requested), coef (the fitted logistic coefficient vector), plus interpretation and warnings.
Examples
set.seed(2)
n_survey <- 500
x1 <- rnorm(n_survey); x2 <- rnorm(n_survey)
p <- 1 / (1 + exp(-(-2 + 0.6 * x1 - 0.4 * x2)))
y <- rbinom(n_survey, 1, p)
survey <- data.frame(trait = y, x1 = x1, x2 = x2)
area <- data.frame(
  x1 = rnorm(20), x2 = rnorm(20),
  pop = sample(800:1500, 20, replace = TRUE)
)
rownames(area) <- paste0("area_", seq_len(20))
res <- mrm_synthetic_area_exposure(
  survey_df = survey,
  survey_trait_col = "trait",
  survey_covariate_cols = c("x1", "x2"),
  area_df = area,
  area_population_col = "pop"
)
head(res$exposure)
#> area_1 area_2 area_3 area_4 area_5 area_6
#> 81.02290 45.79443 32.31382 513.26164 59.79157 51.19841