Build the cross-portal dataset catalog
Source: R/dataset_portal_catalog.R
morie_dataset_portal_catalog.Rd
Aggregates every per-portal registry (Chicago, NYC NYPD, NYC
OpenData, TPS ArcGIS Hub, TPS PSDP, Ontario CKAN, Vancouver,
VPD GeoDASH, Statistics Canada CCJS, Montreal, Toronto, Calgary,
Edmonton, Ottawa) into a single tidy data.frame for cross-portal
discovery and tooling. Caches the result in a session-local
environment so repeated calls are O(1); call
morie_dataset_portal_catalog_clear_cache() to force a rebuild
after editing a registry in an interactive session.
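The caching behaviour described above can be pictured with a minimal sketch of a session-local environment cache. All names here (.catalog_cache, build_catalog_slow, get_catalog, clear_catalog_cache) are hypothetical stand-ins, and the toy data frame is invented for illustration; this is not the package's actual implementation.

```r
# Hypothetical sketch of a session-local cache; not the package's internals.
.catalog_cache <- new.env(parent = emptyenv())

# Counts how many times the expensive build actually runs.
build_count <- 0L

# Stand-in for aggregating the per-portal registries (toy data only).
build_catalog_slow <- function() {
  data.frame(
    dataset_key = c("a_one", "b_two"),
    source = c("portal_a", "portal_b"),
    stringsAsFactors = FALSE
  )
}

# Build on the first call, then serve the stored copy on later calls.
get_catalog <- function() {
  if (!exists("catalog", envir = .catalog_cache, inherits = FALSE)) {
    build_count <<- build_count + 1L
    assign("catalog", build_catalog_slow(), envir = .catalog_cache)
  }
  get("catalog", envir = .catalog_cache, inherits = FALSE)
}

# Empty the cache so the next call rebuilds from the registries.
clear_catalog_cache <- function() {
  rm(list = ls(.catalog_cache, all.names = TRUE), envir = .catalog_cache)
  invisible(NULL)
}

first <- get_catalog()
second <- get_catalog()   # served from the cache, no rebuild
clear_catalog_cache()
third <- get_catalog()    # rebuilt after the cache was cleared
```

Because the cache lives in an environment rather than on disk, it is discarded when the R session ends, which is consistent with the "session-local" wording above.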
Arguments
- portal
Optional character filter restricting output to a single portal.
NULL (default) returns every dataset across all registries. Portals whose registry lives in code ("nyc_nypd", "tps_psdp", "ontario_ckan", "statcan_ccjs", etc.) work without rmoriedata. Bulk portals (NYC OpenData, Chicago, Toronto Hub, etc.) are served by the rmoriedata companion ("nyc_opendata", "chicago", "tps_arcgis_hub", "vancouver_opendata", etc.): they use the companion when it is installed, and otherwise contribute zero rows with a one-time warning per portal.
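A caller that depends on a bulk portal may want to check for the companion before relying on the result. The following is a minimal sketch of such a guard, assuming the function is already attached in the session; the variable names and the message text are illustrative only.

```r
# Is the rmoriedata companion installed? (requireNamespace is base R.)
companion_installed <- requireNamespace("rmoriedata", quietly = TRUE)

# Only call the catalog function if it is available in this session.
catalog_available <- exists("morie_dataset_portal_catalog", mode = "function")

if (catalog_available) {
  chicago <- morie_dataset_portal_catalog(portal = "chicago")
  if (nrow(chicago) == 0L && !companion_installed) {
    # Illustrative message; the function itself warns once per portal.
    message("No Chicago rows: the rmoriedata companion is not installed.")
  }
}
```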
Value
A data.frame with columns dataset_key, source,
id, api_modes, loader, dict_url, n_rows_bundled.
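Downstream tooling can guard against schema drift by checking for the documented column names. The sketch below uses a hypothetical helper, check_catalog_columns, and a zero-row placeholder data frame; the column types are not documented here, so character columns are an assumption made purely for the demonstration.

```r
# Column names as listed in the Value section.
expected_cols <- c("dataset_key", "source", "id", "api_modes",
                   "loader", "dict_url", "n_rows_bundled")

# Hypothetical helper: error if any documented column is absent.
check_catalog_columns <- function(df) {
  stopifnot(is.data.frame(df))
  missing_cols <- setdiff(expected_cols, names(df))
  if (length(missing_cols) > 0L) {
    stop("catalog is missing columns: ",
         paste(missing_cols, collapse = ", "))
  }
  invisible(df)
}

# Zero-row placeholder with the documented names (types are assumed).
empty_catalog <- as.data.frame(
  setNames(
    replicate(length(expected_cols), character(0), simplify = FALSE),
    expected_cols
  ),
  stringsAsFactors = FALSE
)

check_catalog_columns(empty_catalog)
```

In practice the same check could be applied to the value returned by morie_dataset_portal_catalog() before passing it to other tools.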
Examples
# Per-portal slice: registry lives in code, fastest path.
nypd <- morie_dataset_portal_catalog(portal = "nyc_nypd")
nrow(nypd)
#> [1] 8
head(nypd$dataset_key)
#> [1] "nypd_arrests_historic" "nypd_arrests_ytd"
#> [3] "nypd_complaint_historic" "nypd_complaint_ytd"
#> [5] "nypd_hate_crimes" "nypd_uof_incidents"
# Full catalog: bulk portals (NYC OpenData, Chicago, Toronto Hub,
# etc.) prefer the rmoriedata companion when installed, otherwise
# contribute zero rows with a one-time warning per portal.
cat_df <- morie_dataset_portal_catalog()
table(cat_df$source)
#>
#>   calgary_opendata            chicago  edmonton_opendata  montreal_opendata
#>                933               1864               2027                401
#>           nyc_nypd       nyc_opendata       ontario_ckan    ottawa_opendata
#>                  8               2861                 38                287
#>       statcan_ccjs   toronto_opendata     tps_arcgis_hub           tps_psdp
#>                 10                540                 71                 11
#> vancouver_opendata        vpd_geodash
#>                190                  1
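To rank portals by size, the per-source counts can be sorted. This sketch re-enters the counts printed above as a named vector so it runs without the package; with a live catalog, the same steps would apply to table(cat_df$source).

```r
# Per-source counts copied from the printed table above.
source_counts <- c(
  calgary_opendata = 933L, chicago = 1864L, edmonton_opendata = 2027L,
  montreal_opendata = 401L, nyc_nypd = 8L, nyc_opendata = 2861L,
  ontario_ckan = 38L, ottawa_opendata = 287L, statcan_ccjs = 10L,
  toronto_opendata = 540L, tps_arcgis_hub = 71L, tps_psdp = 11L,
  vancouver_opendata = 190L, vpd_geodash = 1L
)

# Largest portals first.
ranked <- sort(source_counts, decreasing = TRUE)
head(names(ranked), 3)

# Total number of datasets across all portals.
total_datasets <- sum(source_counts)
```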