Synthetic small-area-estimated exposure offset (MRM primitive)

Mirrors the Python module morie.mrm_primitives.synthetic_exposure, adapted from Laniyonu & Goff (2021) BMC Psychiatry 21(1):500.

Details

The idea: when you need a rate per hidden subpopulation (force per person with serious mental illness (PwSMI), contact per undocumented resident, contact per homeless person) and no administrative census of that subpopulation exists, you can build a synthetic denominator in three steps:

1. Fit \(P(\text{trait} \mid \text{covariates})\) on a national probability sample (NCS-R for SMI; an ACS-style survey for other traits) using ONLY covariates that are also available at the area level.
2. Apply the fitted coefficients to area-level marginals from the ACS / census to predict \(P(\text{trait})\) per area.
3. Multiply by the area-level adult population to get a synthetic "population at risk" denominator.
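The three steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code of morie.mrm_primitives.synthetic_exposure: the survey cells, covariate names, area marginals, and helper functions (sigmoid, linear_predictor, fit_logistic) are all hypothetical, and the toy fit ignores survey weights that a real national probability sample would require.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear_predictor(beta, x):
    # beta[0] is the intercept; beta[1:] pair up with the covariates in x.
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def fit_logistic(cells, lr=1.0, n_iter=5000):
    """Grouped logistic regression fitted by plain gradient ascent.

    cells: list of (covariates, n_respondents, n_with_trait).
    """
    n_cov = len(cells[0][0])
    beta = [0.0] * (n_cov + 1)
    total = sum(n for _, n, _ in cells)
    for _ in range(n_iter):
        grad = [0.0] * (n_cov + 1)
        for x, n, k in cells:
            resid = k - n * sigmoid(linear_predictor(beta, x))
            grad[0] += resid
            for j, xi in enumerate(x):
                grad[j + 1] += resid * xi
        beta = [b + lr * g / total for b, g in zip(beta, grad)]
    return beta

# Step 1: toy "national survey", aggregated by covariate pattern.
# Covariates (hypothetical): (low_income, under_35), both 0/1.
survey_cells = [
    ((0, 0), 100, 4),
    ((1, 0), 100, 10),
    ((0, 1), 100, 6),
    ((1, 1), 100, 14),
]
beta = fit_logistic(survey_cells)

# Step 2: area-level marginals (shares of adults with each covariate),
# as they might be read from census tables. Values are made up.
areas = {
    "area_A": {"marginals": (0.10, 0.30), "adult_pop": 50_000},
    "area_B": {"marginals": (0.35, 0.45), "adult_pop": 20_000},
}

# Step 3: predicted prevalence times adult population gives the
# synthetic "population at risk" denominator for each area.
exposure = {}
for name, area in areas.items():
    prevalence = sigmoid(linear_predictor(beta, area["marginals"]))
    exposure[name] = prevalence * area["adult_pop"]
```

Plugging area-level shares directly into the fitted linear predictor follows the description above; it is an approximation, since the logistic function is non-linear in the covariates.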

The approach generalises far beyond Laniyonu & Goff's SMI application to any "rate per hidden subpopulation" estimand, for example police use-of-force rates against people experiencing homelessness, stop-and-frisk rates among LGBTQ people, or ICE-contact rates among undocumented immigrants.

The returned offset is suitable for use as the offset = log(exposure) argument in a Poisson or negative-binomial GLM that counts trait-specific events, so that the model coefficients describe log rates per person at risk.
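To show what the offset does in such a model, here is a small self-contained sketch of a Poisson regression with a log(exposure) offset, fitted by Newton-Raphson. It assumes the exposure is on the count scale and is logged before use; the data, the binary area covariate x, and the function fit_poisson_offset are hypothetical and are not part of the module.

```python
import math

def fit_poisson_offset(y, x, log_exposure, n_iter=25):
    """Newton-Raphson fit of log E[y] = log_exposure + b0 + b1 * x."""
    # Start at the pooled log rate, which keeps the first steps well behaved.
    b0 = math.log(sum(y) / sum(math.exp(o) for o in log_exposure))
    b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(o + b0 + b1 * xi) for o, xi in zip(log_exposure, x)]
        # Score vector.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Information matrix (2 x 2), inverted by hand.
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Synthetic "population at risk" per area (made-up values) and event counts.
exposure = [1200.0, 800.0, 1500.0, 500.0]
events = [6, 4, 15, 5]
x = [0, 0, 1, 1]  # hypothetical binary area characteristic

log_exposure = [math.log(e) for e in exposure]
b0, b1 = fit_poisson_offset(events, x, log_exposure)

# With a single binary covariate the MLE has a closed form:
# group rates are 10/2000 and 20/2000, so exp(b0) = 0.005 and b1 = log(2).
baseline_rate = math.exp(b0)
rate_ratio = math.exp(b1)
```

Because the offset enters the linear predictor with a fixed coefficient of 1, exp(b0) is an event rate per person at risk rather than a raw count, and exp(b1) is a rate ratio. In practice the same model would normally be fitted with a GLM library that accepts an offset argument rather than a hand-written solver.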