Returns the shipped drid manifest as a data frame, with one row per director's-report id (drid) that morie has verified. Each row carries the parsed case number, the detected language, and the canonical drid (the English drid for that case, or the first drid if no English version exists). This is the index morie_fetch_siu() uses internally; exposing it lets users inspect and subset the manifest offline (see Details).

Usage

morie_siu_index(lang = c("all", "en", "fr", "valid"), canonical_only = FALSE)

Arguments

lang

Filter rows by detected language. "all" (default) returns every entry; "en" returns only the English drids; "fr" returns only the French drids; "valid" returns every drid whose case_number was successfully parsed, which drops blank and draft drids.

canonical_only

If TRUE, returns one row per case_number (the canonical drid for that case, English preferred). Useful when you want a unique-cases index.
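The "English preferred, else first drid" rule can be sketched on a toy data frame. This is a minimal illustration under stated assumptions, not morie's implementation: the rows are made up, and pick_canonical() is a hypothetical helper introduced only for this example.

```r
# Toy index with the same key columns as the documented return value.
# The rows are invented for illustration; they are not real SIU data.
idx <- data.frame(
  drid        = c(10L, 11L, 12L, 13L, 14L),
  case_number = c("A-1", "A-1", "B-2", "C-3", "C-3"),
  `_language` = c("fr", "en", "fr", "fr", "fr"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: the English drid for a case if one exists,
# otherwise the first drid listed for that case.
pick_canonical <- function(drid, language) {
  en <- drid[which(language == "en")]  # which() ignores NA languages
  if (length(en) > 0) en[1] else drid[1]
}

# Group row positions by case, then pick one canonical drid per group.
groups <- split(seq_len(nrow(idx)), idx$case_number)
canon_by_case <- vapply(
  groups,
  function(i) pick_canonical(idx$drid[i], idx$`_language`[i]),
  integer(1)
)

# Attach the canonical drid to every row, then keep one row per case.
idx$canonical_drid <- unname(canon_by_case[idx$case_number])
canon <- idx[idx$drid == idx$canonical_drid, ]
canon$drid
```

In this toy data, case A-1 resolves to its English drid, while B-2 and C-3 fall back to their first drid because no English row exists for them.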

Value

A data frame with columns drid, http_code, body_bytes, attempts, case_number, source, retrieved_at_utc, _language, canonical_drid.

Details

With the index exposed, users can:

  • see exactly which drids ship as known-valid (no need to fetch to find out);

  • subset to English-only or French-only case lists without running the full harvester;

  • map between drid (URL fragment) and case_number (SIU's own identifier) offline.
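The offline drid-to-case_number mapping can be sketched with named vectors. The data frame below is a small stand-in for the result of morie_siu_index(), built from three rows of the example output on this page; the names case_by_drid and drid_by_case are illustrative only.

```r
# Stand-in for the index: three (drid, case_number) pairs taken from the
# example output. A real call would be idx <- morie_siu_index(lang = "en").
idx <- data.frame(
  drid        = c(46L, 48L, 66L),
  case_number = c("17-OVI-201", "17-TVI-108", "17-TCI-209"),
  stringsAsFactors = FALSE
)

# Named vectors act as simple lookup tables in both directions.
# Names are always character, so integer drids become string keys.
case_by_drid <- setNames(idx$case_number, idx$drid)
drid_by_case <- setNames(idx$drid, idx$case_number)

case_by_drid[["48"]]          # case number for drid 48
drid_by_case[["17-TCI-209"]]  # drid for case 17-TCI-209
```

When the index contains several drids per case (for example an English and a French report), a lookup by case_number in a named vector returns only the first match, so filtering with lang = "en" or canonical_only = TRUE first is probably the safer route.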

The manifest is refreshed by maintainers via morie_siu_refresh_manifest(); it ships gzipped under inst/extdata/ at ~50 KB.

Examples

idx <- morie_siu_index(lang = "en")
head(idx)
#>   drid http_code body_bytes attempts case_number    source     retrieved_at_utc
#> 1   46       200      85500        1  17-OVI-201 siu.on.ca 2026-05-20T21:32:15Z
#> 2   48       200     108612        1  17-TVI-108 siu.on.ca 2026-05-20T21:32:15Z
#> 3   66       200      85075        1  17-TCI-209 siu.on.ca 2026-05-20T21:32:15Z
#> 4   68       200      71364        1  17-OCI-220 siu.on.ca 2026-05-20T21:32:15Z
#> 5   70       200      66595        1  17-TCI-228 siu.on.ca 2026-05-20T21:32:15Z
#> 6   72       200      80566        1  17-OCI-243 siu.on.ca 2026-05-20T21:32:15Z
#>   _language canonical_drid
#> 1        en             46
#> 2        en             48
#> 3        en             66
#> 4        en             68
#> 5        en             70
#> 6        en             72
# How many drids are English vs French? (table() omits NA by default)
table(morie_siu_index()$`_language`)
#> 
#>   en   fr 
#> 2531 2212 
# Unique-case index (English-preferred)
canon <- morie_siu_index(canonical_only = TRUE)
nrow(canon)
#> [1] 2218