Synthetic predictive-policing dataset with a known disparity
Source: R/fairness_simulation.R
morie_fairness_simulate_biased_crime_data.Rd

Generates per-record data with area, group, true_outcome (a group-independent Bernoulli draw at base_rate), detected (group-dependent), and risk_score (0–500, shifted up by bias * 100 points for non-reference groups). The bias input is the ground truth that the audits should recover.
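The generating process described above can be sketched in base R. This is a rough illustration only, not the package's implementation: the function name `sketch_biased_data`, the detection model, and the Normal(200, 40) baseline for risk_score are all assumptions made for the example. Only the group-independent true_outcome, the 0–500 range, and the bias * 100 shift come from the description.

```r
# Hypothetical re-implementation for illustration; not the package's code.
sketch_biased_data <- function(n = 2000L, groups = c("A", "B"),
                               group_props = NULL, n_areas = 20L,
                               base_rate = 0.3, bias = 0.5, seed = 0L) {
  set.seed(seed)

  # Assumption: NULL proportions fall back to equal sampling weights.
  if (is.null(group_props)) {
    group_props <- rep(1 / length(groups), length(groups))
  }
  group <- sample(groups, n, replace = TRUE, prob = group_props)
  area <- sample.int(n_areas, n, replace = TRUE)

  # The first label is the reference group.
  non_ref <- group != groups[1]

  # Ground truth does not depend on group.
  true_outcome <- rbinom(n, 1L, base_rate)

  # Assumed detection model: the detection probability is scaled by
  # (1 + bias) for non-reference groups and clipped to [0, 1].
  p_detect <- pmin(pmax(base_rate * (1 + bias * non_ref), 0), 1)
  detected <- rbinom(n, 1L, p_detect)

  # Assumed baseline score: Normal(200, 40), shifted up by bias * 100
  # for non-reference groups, then clipped to the 0-500 range.
  raw <- rnorm(n, mean = 200, sd = 40) + bias * 100 * non_ref
  risk_score <- pmin(pmax(raw, 0), 500)

  data.frame(area, group, true_outcome, detected, risk_score)
}

# A simple audit: the gap in mean risk_score between groups, divided
# by 100, should land close to the injected bias.
sim <- sketch_biased_data(n = 5000L, bias = 0.5, seed = 1L)
means <- tapply(sim$risk_score, sim$group, mean)
recovered_bias <- unname(means["B"] - means["A"]) / 100
```

Under these assumptions the score gap recovers the injected bias up to sampling noise, while the true_outcome rate stays roughly equal across groups, which is the pattern an audit is meant to detect.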
Usage
morie_fairness_simulate_biased_crime_data(
n = 2000L,
groups = c("A", "B"),
group_props = NULL,
n_areas = 20L,
base_rate = 0.3,
bias = 0.5,
seed = 0L
)

Arguments
- n
Number of records.
- groups
Character vector of group labels (the first entry is treated as the reference group).
- group_props
Optional sampling proportions.
- n_areas
Number of areas (>= number of groups).
- base_rate
Reference-group favourable-outcome rate, between 0 and 1.
- bias
Injected disparity, between -1 and 1.
- seed
Reproducibility seed.
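
The constraints listed for these arguments can be checked before simulating. The helper below is a hypothetical sketch (`check_sim_args` is not part of the package, which may validate its inputs differently or not at all); the requirement that group_props has one entry per group is an assumption, while the other checks restate the ranges given above. Named conditions in `stopifnot()` require R 4.0 or later.

```r
# Hypothetical helper; the package may validate its inputs differently.
check_sim_args <- function(groups, n_areas, base_rate, bias,
                           group_props = NULL) {
  stopifnot(
    "n_areas must be >= number of groups" = n_areas >= length(groups),
    "base_rate must lie between 0 and 1" = base_rate >= 0 && base_rate <= 1,
    "bias must lie between -1 and 1" = bias >= -1 && bias <= 1
  )
  if (!is.null(group_props)) {
    # Assumption: one non-negative sampling weight per group label.
    stopifnot(
      "group_props needs one entry per group" =
        length(group_props) == length(groups),
      "group_props must be non-negative" = all(group_props >= 0)
    )
  }
  invisible(TRUE)
}

# The documented defaults pass the checks.
ok <- check_sim_args(groups = c("A", "B"), n_areas = 20L,
                     base_rate = 0.3, bias = 0.5)

# Too few areas for the number of groups raises an error.
bad <- tryCatch(
  check_sim_args(groups = c("A", "B"), n_areas = 1L,
                 base_rate = 0.3, bias = 0.5),
  error = function(e) conditionMessage(e)
)
```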