Replication of Laniyonu & Goff (2021) — Police force vs SMI disparity
Source: R/laniyonu_smi_force_disparity.R
morie_laniyonu_smi_force_disparity.Rd
R port of morie.laniyonu.smi_force_disparity. Estimates a
hierarchical negative-binomial model with a synthetic area-exposure
(SAE) offset for persons-with-serious-mental-illness (PwSMI), with
year fixed effects and an area random intercept.
Composes a synthetic-area-exposure (SAE) step (base-R logistic on survey microdata, predicted at ACS tract marginals) into a negative-binomial GLM with year fixed effects and an area random intercept approximated by ridge-penalised area dummies.
Usage
morie_laniyonu_smi_force_disparity(
  df,
  survey_df,
  survey_trait_col = "smi",
  survey_covariate_cols,
  area_covariate_cols = NULL,
  force_count_col = "force_events",
  non_smi_count_col = NULL,
  geog_col = "tract_id",
  year_col = "year",
  population_col = "pop_18plus",
  baseline_year = NULL,
  include_year_fe = TRUE,
  include_area_re = TRUE,
  max_iter = 500L,
  tol = 1e-06,
  return_design = FALSE
)
Arguments
- df
Force-event panel, one row per (area, year).
- survey_df
Survey microdata for fitting P(SMI | covariates).
- survey_trait_col
Binary column in survey_df.
- survey_covariate_cols
Covariates available in BOTH survey_df and df.
- area_covariate_cols
Optional rename map for df.
- force_count_col
Count of force events against PwSMI per (area, year).
- non_smi_count_col
Count of force events against non-SMI persons per (area, year). If
NULL, df must contain total_force_events and the non-SMI count is computed as total minus PwSMI.
- geog_col, year_col, population_col
Column names.
- baseline_year
Year to drop as the reference (default = min).
- include_year_fe, include_area_re
Toggle the year FE / area RE blocks.
- max_iter, tol
Optimiser controls.
- return_design
Attach X, y, offset to exposure_summary for hand-off to brms / rstanarm.
Details
The trick: there is no administrative census of who has SMI at the tract level, so the denominator is built by:
1. Fitting P(SMI | age, sex, race, income, ...) on a national survey using only covariates also tabulated at the tract level by the ACS.
2. Applying those coefficients to ACS tract marginals to get a per-tract predicted P(SMI).
3. Multiplying by adult population for a synthetic exposure denominator \(n_{vti}\).
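The three steps above can be sketched in base R as follows. This is a minimal illustration and not the package's internal code: the simulated survey, the covariate names (age_18_34, female, low_income), the tract table, and the non-SMI exposure taken as the complement of the SMI exposure are all assumptions made for the example.

```r
set.seed(1)

# Hypothetical survey microdata: a binary SMI flag plus covariates that
# are also tabulated at the tract level (all names are illustrative).
n <- 2000
survey_df <- data.frame(
  age_18_34  = rbinom(n, 1, 0.30),
  female     = rbinom(n, 1, 0.50),
  low_income = rbinom(n, 1, 0.25)
)
lin <- -3 + 0.4 * survey_df$age_18_34 + 0.3 * survey_df$female +
  0.8 * survey_df$low_income
survey_df$smi <- rbinom(n, 1, plogis(lin))

# Step 1: fit P(SMI | covariates) on the survey with a base-R logistic GLM.
fit_smi <- glm(smi ~ age_18_34 + female + low_income,
               family = binomial(), data = survey_df)

# Hypothetical tract table: each covariate is a tract-level share (marginal).
tracts <- data.frame(
  tract_id   = c("A", "B", "C"),
  age_18_34  = c(0.25, 0.40, 0.30),
  female     = c(0.52, 0.49, 0.51),
  low_income = c(0.10, 0.35, 0.20),
  pop_18plus = c(3200, 4100, 2800)
)

# Step 2: apply the survey coefficients to the tract marginals.
tracts$p_smi <- predict(fit_smi, newdata = tracts, type = "response")

# Step 3: multiply by adult population to get synthetic exposures.
# The non-SMI exposure as the complement is an assumption of this sketch.
tracts$exposure_smi     <- tracts$p_smi * tracts$pop_18plus
tracts$exposure_non_smi <- (1 - tracts$p_smi) * tracts$pop_18plus

print(tracts[, c("tract_id", "p_smi", "exposure_smi", "exposure_non_smi")])
```

Plugging tract shares into a model fitted on individuals is itself an approximation, since the logistic link is nonlinear; the sketch only shows the mechanics of the hand-off from survey to tract.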
The count model is
$$y_{vti} \sim \mathrm{NegBin}\left(n_{vti} \exp(\mu + \alpha_v + \delta_t + \beta_i),\ \phi\right)$$
with \(v\) = group (PwSMI vs non-SMI), \(t\) = year, \(i\) = area. The headline coefficient \(\alpha_v\) is the log relative risk of police use of force against PwSMI vs non-SMI.
Headline estimates reported in the paper: relative risk for PwSMI of 11.6x (tract level) and 10.2x (precinct level).
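As a rough illustration of the count model, the sketch below fits a negative-binomial GLM with a log-exposure offset and year fixed effects using MASS::glm.nb on simulated data. Several things here are assumptions: the data are in long format with one row per (group, area, year), the group is coded as a 0/1 indicator smi with non-SMI as the reference, the area intercept is left out for brevity, and every column name and simulated value is hypothetical.

```r
library(MASS)  # provides glm.nb

set.seed(42)

# Hypothetical long-format panel: one row per (group, tract, year).
panel <- expand.grid(
  tract_id = paste0("T", 1:40),
  year     = 2015:2018,
  smi      = c(0, 1),
  stringsAsFactors = FALSE
)

# Synthetic exposure: 5% of 4000 adults assumed to be PwSMI in each tract.
p_smi <- 0.05
pop   <- 4000
panel$exposure <- ifelse(panel$smi == 1, p_smi * pop, (1 - p_smi) * pop)

# Simulate counts with a true relative risk of 8 (log RR = log(8)).
mu <- panel$exposure * exp(-6 + log(8) * panel$smi)
panel$y <- rnbinom(nrow(panel), size = 2, mu = mu)

# NB GLM: the offset puts the model on the rate scale, so the smi
# coefficient is a log relative risk per unit of exposure.
fit <- glm.nb(y ~ smi + factor(year) + offset(log(exposure)), data = panel)

rr <- exp(coef(fit)[["smi"]])
print(rr)
```

Because the offset is log(exposure), exponentiating the smi coefficient gives the ratio of per-exposure event rates between the two groups, which is how \(\alpha_v\) is read as a relative risk.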
This R port is a frequentist MLE approximation (via
MASS::glm.nb, falling back to a hand-rolled NB MLE
on stats::optim if MASS is unavailable). For paper-grade
Bayesian credible intervals, fit in brms / rstanarm
using the design matrix returned with return_design = TRUE.
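The fallback path can be pictured as a direct maximisation of the NB log-likelihood with stats::optim. The sketch below is an assumption about what such a routine might look like, not the package's implementation: the function name nb_negloglik, the log-parameterised dispersion, the starting values, and the simulated data are all hypothetical.

```r
# Negative log-likelihood of an NB2 model with log link and an offset.
# par = c(regression coefficients, log(dispersion)).
nb_negloglik <- function(par, X, y, offset) {
  k    <- ncol(X)
  beta <- par[seq_len(k)]
  size <- exp(par[k + 1])            # log scale keeps the dispersion positive
  mu   <- exp(drop(X %*% beta) + offset)
  nll  <- -sum(dnbinom(y, size = size, mu = mu, log = TRUE))
  if (is.finite(nll)) nll else 1e10  # guard against overflow during the search
}

set.seed(7)

# Simulated two-group data with a true relative risk of 8.
n        <- 400
smi      <- rep(c(0, 1), each = n / 2)
exposure <- ifelse(smi == 1, 200, 3800)
y        <- rnbinom(n, size = 2, mu = exposure * exp(-6 + log(8) * smi))

X   <- cbind(intercept = 1, smi = smi)
off <- log(exposure)

# Start at the pooled log rate, zero group effect, and dispersion 1.
start <- c(log(sum(y) / sum(exposure)), 0, 0)

opt <- optim(start, nb_negloglik, X = X, y = y, offset = off,
             method = "BFGS", control = list(maxit = 500))

rr_hat    <- exp(opt$par[2])  # estimated relative risk
theta_hat <- exp(opt$par[3])  # estimated NB dispersion
print(c(rr = rr_hat, theta = theta_hat))
```

Where MASS is installed, comparing these estimates against glm.nb on the same data is a quick sanity check on a hand-rolled likelihood.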
The function emits a warning() on every call: the SMI flag on force
events is a proxy biased TOWARD THE NULL (officers miss more SMI
than they over-attribute), so the estimated \(\alpha_v\) should be read as a
conservative lower bound on the true disparity.