Generates synthetic per-record data with five columns: area, group, true_outcome (a Bernoulli draw at base_rate, independent of group), detected (group-dependent), and risk_score (on a 0–500 scale, shifted by bias * 100 points for non-reference groups). Because bias is injected by construction, it is the ground truth that the audits should recover.

Usage

morie_fairness_simulate_biased_crime_data(
  n = 2000L,
  groups = c("A", "B"),
  group_props = NULL,
  n_areas = 20L,
  base_rate = 0.3,
  bias = 0.5,
  seed = 0L
)

Arguments

n

Number of records.

groups

Character vector of group labels (the first entry is treated as the reference group).

group_props

Optional sampling proportions for the entries of groups.

n_areas

Number of areas; must be at least the number of groups.

base_rate

Favourable-outcome rate for the reference group, between 0 and 1.

bias

Injected disparity, between -1 and 1.

seed

Random seed for reproducibility.

Value

A data.frame with columns area, group, true_outcome, detected, risk_score.
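
Examples

The description says that risk_score is shifted by bias * 100 points for non-reference groups, so an audit can in principle recover bias from the gap in group means. The sketch below illustrates that arithmetic with a hypothetical stand-in, sketch_biased_scores(), which is not the package's implementation. It assumes baseline scores are uniform on 100–400 (so shifted scores stay within 0–500) and that groups are sampled with equal probability; neither assumption is confirmed by the documentation above.

```r
# Hypothetical stand-in for illustration only; NOT the package's implementation.
sketch_biased_scores <- function(n = 2000L, groups = c("A", "B"),
                                 bias = 0.5, seed = 0L) {
  set.seed(seed)
  # Assumption: groups are drawn with equal probability.
  group <- sample(groups, n, replace = TRUE)
  # Assumption: baseline scores are uniform on 100-400, so a shift of
  # at most +/-100 points keeps every score inside the 0-500 range.
  base_score <- runif(n, min = 100, max = 400)
  # The first entry of `groups` is the reference group and gets no shift.
  shift <- ifelse(group == groups[1], 0, bias * 100)
  data.frame(group = group, risk_score = base_score + shift)
}

df <- sketch_biased_scores(n = 20000L, bias = 0.5, seed = 1L)

# Recover the injected bias from the difference in group means.
means <- tapply(df$risk_score, df$group, mean)
recovered_bias <- as.numeric(means["B"] - means["A"]) / 100
print(round(recovered_bias, 2))
```

With the package loaded, the same mean-difference check could be applied to the risk_score column returned by morie_fairness_simulate_biased_crime_data(); whether it recovers bias this directly depends on how the package generates baseline scores.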