Group-disparity metrics for auditing classification and risk systems

R counterpart to the Python morie.fairness.metrics module. Every function here is an audit measure: given the decisions a system made (and, where available, the realised ground truth) plus a protected attribute such as race, it quantifies whether outcomes differ across groups. None of these functions makes predictions; they only measure disparity in predictions that already exist.

Value

Each callable in this module returns a named list with the metric value, a per-group breakdown, advisory warnings, and a plain-language interpretation.

Details

Functions:

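morie_fairness_disparate_impact(): ratio of favourable-outcome rates between the unprivileged and privileged groups, read against the EEOC four-fifths (0.8) rule.
morie_fairness_demographic_parity(): gap in favourable-outcome rates between groups.
morie_fairness_equalized_odds(): gaps in true-positive rate (TPR) and false-positive rate (FPR) between groups (needs ground truth).
morie_fairness_average_odds_difference(): mean of the TPR gap and the FPR gap (needs ground truth).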
morie_fairness_gini(): concentration of a score distribution.
morie_fairness_bias_amplification(): composite Delta_parity * G.

The list returned by each function (see Value) mirrors the payload of the Python RichResult.
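
To make the ground-truth metrics concrete, the following base-R sketch computes per-group TPR and FPR and derives the two gaps and their mean. It is not the package's implementation: the helper group_rates() and the vectors y_true, y_pred and grp are hypothetical names chosen for illustration, and the sign convention (unprivileged minus privileged, as in AI Fairness 360) is an assumption about how this module reports gaps.

```r
# Illustrative sketch only; group_rates() is a hypothetical helper,
# not a function exported by this package.
group_rates <- function(y_true, y_pred, grp) {
  sapply(split(seq_along(grp), grp), function(idx) {
    c(tpr = mean(y_pred[idx][y_true[idx] == 1]),  # true-positive rate
      fpr = mean(y_pred[idx][y_true[idx] == 0]))  # false-positive rate
  })
}

y_true <- c(1, 1, 0, 0, 1, 1, 0, 0)   # realised outcomes
y_pred <- c(1, 1, 1, 0, 1, 0, 0, 0)   # decisions under audit
grp    <- c(rep("A", 4), rep("B", 4)) # protected attribute

# 2 x 2 matrix: rows "tpr"/"fpr", columns "A"/"B"
rates <- group_rates(y_true, y_pred, grp)

# Assumed convention: unprivileged ("B") minus privileged ("A").
tpr_gap <- unname(rates["tpr", "B"] - rates["tpr", "A"])
fpr_gap <- unname(rates["fpr", "B"] - rates["fpr", "A"])
avg_odds_diff <- (tpr_gap + fpr_gap) / 2
```

In this toy data group A has TPR 1 and FPR 0.5, while group B has TPR 0.5 and FPR 0, so both gaps and their mean are -0.5.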

Prior art reimplemented independently (no code copied): IBM AI Fairness 360 metric definitions; the COMPAS audit in pbiecek's XAI Stories; the SciencesPo Predictive-policing-Chicago project (Lacherade, Szabo, Krikava & Aeby, 2021); and Barman & Barman, arXiv:2603.18987 (the Bias Amplification Score).

Examples

pred <- c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0)
race <- c(rep("A", 5), rep("B", 5))
morie_fairness_disparate_impact(pred, race, privileged = "A")$value
#> [1] 0.6
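
The value above can be checked by hand in base R. This is a sketch of the arithmetic rather than the package's code; it assumes the ratio is taken as the unprivileged rate over the privileged rate, which is consistent with the 0.6 shown.

```r
# Hand check of the disparate-impact example (illustrative only).
pred <- c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0)
race <- c(rep("A", 5), rep("B", 5))

rates <- tapply(pred, race, mean)       # favourable rate per group: A = 1, B = 0.6
di <- unname(rates["B"] / rates["A"])   # unprivileged / privileged
di
#> [1] 0.6
di < 0.8                                # falls below the four-fifths threshold
#> [1] TRUE
parity_gap <- abs(unname(rates["A"] - rates["B"]))  # favourable-rate gap, unsigned
parity_gap
#> [1] 0.4
```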