Rebuild the Ontario SIU DRID manifest by probing the live site
Source: R/siu.R
morie_siu_refresh_manifest.Rd
Sweeps director's-report ids 1..max_drid and writes a small
CSV recording which ids return a healthy report page, the parsed
case number, and the response body size. The harvester
(morie_fetch_siu) then uses this manifest to short-circuit
the roughly 30-50 percent of ids that have no report, which saves bandwidth
and lowers the risk of triggering the WAF on every run.
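The short-circuit described above can be sketched in base R. This is only an illustration: the column names (drid, ok, case_number, body_bytes) and the toy values are assumptions, since the manifest's actual layout is not documented on this page, and the filtering shown is not necessarily how morie_fetch_siu does it.

```r
# Toy manifest with HYPOTHETICAL column names and placeholder values;
# the real manifest written by morie_siu_refresh_manifest() may differ.
manifest <- data.frame(
  drid        = 1:6,
  ok          = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),
  case_number = c("CASE-A", NA, "CASE-B", "CASE-C", NA, "CASE-D"),
  body_bytes  = c(100L, 0L, 120L, 90L, 0L, 110L)
)

# Keep only ids whose probe returned a healthy report page.
ids_to_fetch <- manifest$drid[manifest$ok]

# Ids that a harvester could skip without issuing a request.
ids_skipped <- setdiff(manifest$drid, ids_to_fetch)

# Share of ids skipped in this toy example.
skip_share <- length(ids_skipped) / nrow(manifest)
```

Filtering on a precomputed flag means the skipped ids cost no request at all, which is where the saving comes from.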
Usage
morie_siu_refresh_manifest(
out_path = NULL,
max_drid = NULL,
min_drid = 1L,
concurrency = 4L,
rate_rps = 4,
progress = TRUE
)
Arguments
- out_path
Path to write the gzipped CSV. Default is the in-place manifest location (only useful for maintainers building from a source checkout).
- max_drid
Highest drid to probe. Default NULL auto-discovers from the SIU index endpoint and adds a margin.
- min_drid
Lowest drid to probe (default 1L).
- concurrency
Maximum simultaneous transfers (default 4).
- rate_rps
Maximum request starts per second (default 4).
- progress
Logical; print a per-batch progress line.
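Because rate_rps caps request starts per second, it also puts a rough floor on how long a sweep takes. The helper below is a back-of-the-envelope sketch, not part of the package: estimate_sweep_seconds is a hypothetical name, the max_drid of 4000 is made up for illustration, and the estimate ignores response latency, retries, and however the throttle is actually implemented.

```r
# HYPOTHETICAL helper: rough time for a sweep, assuming the request-start
# rate (rate_rps) is the only limiting factor.
estimate_sweep_seconds <- function(min_drid, max_drid, rate_rps) {
  stopifnot(max_drid >= min_drid, rate_rps > 0)
  n_ids <- max_drid - min_drid + 1  # ids probed, inclusive range
  n_ids / rate_rps
}

# Illustrative numbers only: 4000 ids at the default 4 starts per second.
secs <- estimate_sweep_seconds(min_drid = 1L, max_drid = 4000L, rate_rps = 4)
mins <- secs / 60
```

Under these assumptions, raising concurrency alone would not shorten the sweep once the request-start rate is the binding limit.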
Value
Invisibly, a data frame of the full sweep (every probed drid,
including misses), matching what was written to out_path.
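As a sketch of how the returned data frame relates to the file at out_path, the base R snippet below writes a small data frame to a gzipped CSV and reads it back. The columns drid and ok are assumed for illustration, and the package may write the file differently.

```r
# Toy sweep result with HYPOTHETICAL columns; misses are kept as rows.
sweep <- data.frame(drid = 1:3, ok = c(TRUE, FALSE, TRUE))

# Write a gzipped CSV through a gzfile() connection.
path <- tempfile(fileext = ".csv.gz")
con <- gzfile(path, "wt")
write.csv(sweep, con, row.names = FALSE)
close(con)

# Read it back; read.csv() opens and closes the connection itself.
back <- read.csv(gzfile(path))

# Misses survive the round trip alongside the hits.
n_misses <- sum(!back$ok)
```

Keeping the misses in the file is what lets a later run tell "probed, no report" apart from "never probed".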