Replication of Laniyonu & Goff (2021) — Police force vs SMI disparity — morie_laniyonu_smi_force

R port of morie.laniyonu.smi_force_disparity. Estimates a hierarchical negative-binomial model with a synthetic-area-exposure (SAE) offset for persons with serious mental illness (PwSMI), with year fixed effects and an area random intercept.

The function composes a synthetic-area-exposure (SAE) step (a base-R logistic regression on survey microdata, predicted at ACS tract marginals) with a negative-binomial GLM that has year fixed effects and an area random intercept approximated by ridge-penalised area dummies.

Usage

morie_laniyonu_smi_force_disparity(
  df,
  survey_df,
  survey_trait_col = "smi",
  survey_covariate_cols,
  area_covariate_cols = NULL,
  force_count_col = "force_events",
  non_smi_count_col = NULL,
  geog_col = "tract_id",
  year_col = "year",
  population_col = "pop_18plus",
  baseline_year = NULL,
  include_year_fe = TRUE,
  include_area_re = TRUE,
  max_iter = 500L,
  tol = 1e-06,
  return_design = FALSE
)

Arguments

df: Force-event panel, one row per (area, year).
survey_df: Survey microdata for fitting P(SMI | covariates).
survey_trait_col: Binary column in survey_df.
survey_covariate_cols: Covariates available in BOTH survey_df and df.
area_covariate_cols: Optional rename map for df.
force_count_col: Count of force events against PwSMI per (area, year).
non_smi_count_col: Count of force events against non-SMI per (area, year). If NULL, df must contain total_force_events and the non-SMI count is computed as total minus PwSMI.
geog_col, year_col, population_col: Column names.
baseline_year: Year used as the reference level and dropped from the year fixed effects (default: the minimum year).
include_year_fe, include_area_re: Toggle the year FE / area RE blocks.
max_iter, tol: Optimiser controls.
return_design: Attach X, y, offset to exposure_summary for hand-off to brms / rstanarm.

Value

A list of class morie_laniyonu_smi_result.

Details

The trick: there is no administrative census of who has SMI at the tract level, so the denominator is built in three steps:

1. Fitting P(SMI | age, sex, race, income, ...) on a national survey using only covariates also tabulated at the tract level by the ACS.
2. Applying those coefficients to ACS tract marginals to get a per-tract predicted P(SMI).
3. Multiplying by adult population for a synthetic exposure denominator $n_{vti}$.
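
The three steps above can be sketched in base R on simulated data. This is an illustration under stated assumptions, not the package's internal code: the covariates (female, low_income), the simulated survey, and the tract table are all hypothetical, and the non-SMI exposure is assumed to be the complement of the PwSMI exposure. Predicting at tract-level shares rather than at individual records is itself an approximation.

```r
# Sketch of the synthetic-area-exposure (SAE) step on simulated data.
# All data and column names below are hypothetical.
set.seed(1)

# Simulated survey microdata: binary SMI flag plus two binary covariates.
n_survey <- 5000
survey_df <- data.frame(
  female     = rbinom(n_survey, 1, 0.5),
  low_income = rbinom(n_survey, 1, 0.3)
)
lin_pred <- -3.2 + 0.3 * survey_df$female + 0.8 * survey_df$low_income
survey_df$smi <- rbinom(n_survey, 1, plogis(lin_pred))

# Step 1: logistic model for P(SMI | covariates) fitted on the survey.
sae_fit <- glm(smi ~ female + low_income, data = survey_df,
               family = binomial())

# Simulated tract table: covariate marginals (shares) and adult population.
tracts <- data.frame(
  tract_id   = sprintf("T%02d", 1:5),
  female     = c(0.48, 0.52, 0.50, 0.55, 0.47),
  low_income = c(0.10, 0.45, 0.25, 0.60, 0.15),
  pop_18plus = c(3200, 4100, 2800, 5000, 3600)
)

# Step 2: apply the survey coefficients to the tract marginals.
tracts$p_smi <- as.numeric(
  predict(sae_fit, newdata = tracts, type = "response")
)

# Step 3: multiply by adult population to get synthetic exposures.
# Assumption: the non-SMI exposure is the remaining adult population.
tracts$n_smi     <- tracts$p_smi * tracts$pop_18plus
tracts$n_non_smi <- (1 - tracts$p_smi) * tracts$pop_18plus

print(tracts)
```

The log of each exposure column is what would enter the count model as an offset.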

The count model is $$y_{vti} \sim \mathrm{NegBin}(n_{vti} \exp(\mu + \alpha_v + \delta_t + \beta_i), \phi)$$ with $v$ = PwSMI vs non-SMI, $t$ = year, $i$ = area, and $\phi$ the negative-binomial dispersion parameter. The headline coefficient $\alpha_v$ is the log relative risk of police use of force against PwSMI vs non-SMI, so $\exp(\alpha_v)$ is the relative risk itself.
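
As a rough illustration of the count model, the sketch below simulates a stacked (area, year, group) panel and fits a negative-binomial GLM with a log-exposure offset using MASS::glm.nb. The data, the column names, and the true relative risk of 8 are hypothetical, and the area effect $\beta_i$ is left out to keep the example short, so this is a simplified stand-in rather than the function's actual fit.

```r
# Simplified sketch of the count model on simulated data (no area effect).
library(MASS)  # provides glm.nb
set.seed(42)

areas <- sprintf("A%02d", 1:30)
years <- 2016:2019

# One row per (area, year, group); smi = 1 marks the PwSMI row.
panel <- expand.grid(area = areas, year = years, smi = c(0, 1),
                     stringsAsFactors = FALSE)

# Hypothetical synthetic exposures: 5% of 4000 adults have SMI in every tract.
pop   <- 4000
p_smi <- 0.05
panel$exposure <- ifelse(panel$smi == 1, p_smi * pop, (1 - p_smi) * pop)

# Simulate counts with a known log relative risk.
true_alpha <- log(8)
mu0        <- log(0.002)
year_eff   <- c(0, 0.1, -0.05, 0.15)[match(panel$year, years)]
mean_count <- panel$exposure * exp(mu0 + true_alpha * panel$smi + year_eff)
panel$y    <- rnbinom(nrow(panel), size = 5, mu = mean_count)

# Negative-binomial GLM: group effect, year fixed effects, exposure offset.
fit <- glm.nb(y ~ smi + factor(year) + offset(log(exposure)), data = panel)

alpha_hat <- unname(coef(fit)["smi"])  # estimated log relative risk
rr_hat    <- exp(alpha_hat)            # estimated relative risk
cat("alpha_hat =", round(alpha_hat, 3), " RR =", round(rr_hat, 2), "\n")
```

Because the exposure enters as an offset, the smi coefficient compares rates per synthetic person at risk, not raw counts.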

Headline estimates reported in the paper: relative risk for PwSMI of 11.6x (tract level) and 10.2x (precinct level).

This R port is a frequentist MLE approximation (via MASS::glm.nb, falling back to a hand-rolled NB MLE on stats::optim if MASS is unavailable). For paper-grade Bayesian credible intervals, fit in brms / rstanarm using the design matrix returned with return_design=TRUE.
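
A hand-rolled NB MLE of the kind mentioned above could look roughly like the following. This is a minimal sketch on simulated data, not the package's fallback code: the function name nb_negloglik, the log-theta parameterisation, and the choice of BFGS are assumptions made for illustration.

```r
# Sketch: negative-binomial MLE with a log-exposure offset via stats::optim.
set.seed(7)

# Simulated data: x = 1 marks a PwSMI row; exposure is hypothetical.
n        <- 400
x        <- rbinom(n, 1, 0.5)
exposure <- runif(n, 100, 1000)
y        <- rnbinom(n, size = 3, mu = exposure * exp(-5 + 1.5 * x))

X   <- cbind(intercept = 1, smi = x)
off <- log(exposure)

# Negative log-likelihood. The last parameter is log(theta), so the
# dispersion theta stays positive without box constraints.
nb_negloglik <- function(par, X, y, off) {
  k     <- ncol(X)
  beta  <- par[seq_len(k)]
  theta <- exp(par[k + 1])
  mu    <- exp(drop(X %*% beta) + off)
  -sum(dnbinom(y, size = theta, mu = mu, log = TRUE))
}

# Start the intercept at the pooled log rate; everything else at zero.
start <- c(log(sum(y) / sum(exposure)), 0, 0)

opt <- optim(start, nb_negloglik, X = X, y = y, off = off,
             method = "BFGS", hessian = TRUE,
             control = list(maxit = 500))

# Point estimates and Wald standard errors from the numerical Hessian.
est <- opt$par
se  <- sqrt(diag(solve(opt$hessian)))
names(est) <- names(se) <- c(colnames(X), "log_theta")
print(round(cbind(estimate = est, std_error = se), 3))
```

Optimising log(theta) instead of theta is a common way to keep the dispersion positive while still using an unconstrained optimiser.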

The function emits a warning() on every call: the SMI flag on force events is a proxy biased TOWARD THE NULL (officers miss more SMI than they over-attribute), so the estimated $\alpha_v$ is a conservative lower bound on the true disparity.
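
The direction of this bias can be checked with a small worked example. The numbers below are hypothetical, and the sketch makes a simplifying assumption: officers never over-attribute SMI, and every missed PwSMI event is recorded as a non-SMI event. Exposures are held fixed.

```r
# Worked example: under-ascertainment of SMI pulls the observed RR toward 1.
# All quantities are hypothetical values for a single area-year.
n_smi    <- 200     # synthetic PwSMI exposure
n_non    <- 3800    # synthetic non-SMI exposure
true_rr  <- 10      # true relative risk
rate_non <- 0.002   # true force rate per non-SMI adult

events_non <- n_non * rate_non            # true non-SMI events
events_smi <- n_smi * rate_non * true_rr  # true PwSMI events

# sensitivity = share of true PwSMI events that officers flag as SMI.
# Unflagged PwSMI events are counted in the non-SMI numerator.
observed_rr <- function(sensitivity) {
  flagged_smi <- sensitivity * events_smi
  flagged_non <- events_non + (1 - sensitivity) * events_smi
  (flagged_smi / n_smi) / (flagged_non / n_non)
}

sens   <- c(1, 0.8, 0.6, 0.4)
rr_obs <- sapply(sens, observed_rr)
print(data.frame(sensitivity = sens, observed_rr = round(rr_obs, 2)))
```

With perfect flagging the observed ratio equals the true one. As sensitivity falls, the PwSMI numerator shrinks while the non-SMI numerator grows, so the observed ratio can only move downward under these assumptions.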

References

Laniyonu, A., & Goff, P. A. (2021). Measuring disparities in police use of force and injury among persons with serious mental illness. BMC Psychiatry, 21(1), 500.