
Runs arbitrary SQL against BigQuery via bigrquery, downloads the result set (all rows unless max_rows is set), and returns it as a base R data.frame. Authentication uses Application Default Credentials (ADC), the same flow the rest of the HADES-LLM stack uses. To authenticate interactively instead, run bigrquery::bq_auth() first.

Usage

morie_ingest_bigquery_query(
  sql,
  billing_project = NULL,
  page_size = 10000L,
  max_rows = Inf,
  quiet = TRUE
)

Arguments

sql

A SQL string to execute.

billing_project

Project to bill the query to. If NULL, falls back to the GCP_PROJECT environment variable, then to the project discovered through ADC.

page_size

Rows per download page (forwarded to bq_table_download).

max_rows

Optional cap on rows downloaded (defaults to Inf, i.e. all rows).

quiet

Suppress bigrquery progress output.

Value

A base R data.frame.
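
As a rough illustration of how the arguments above might map onto bigrquery, the sketch below runs a query and downloads the result. The function name sketch_bigquery_query is hypothetical, and the mapping of max_rows to n_max is an assumption; only the forwarding of page_size to bq_table_download is stated on this page. It is not the package's actual implementation, and billing project fallback is left out.

```r
# Hypothetical sketch, not the real morie_ingest_bigquery_query() source.
sketch_bigquery_query <- function(sql,
                                  billing_project,
                                  page_size = 10000L,
                                  max_rows = Inf,
                                  quiet = TRUE) {
  if (!requireNamespace("bigrquery", quietly = TRUE)) {
    stop("The 'bigrquery' package is required.", call. = FALSE)
  }

  # Run the query; the result is a reference to a temporary results table.
  result_table <- bigrquery::bq_project_query(
    billing_project,
    sql,
    quiet = quiet
  )

  # Download in pages. Assumption: max_rows corresponds to n_max.
  rows <- bigrquery::bq_table_download(
    result_table,
    n_max = max_rows,
    page_size = page_size,
    quiet = quiet
  )

  # bigrquery returns a tibble; convert it to a base R data.frame.
  as.data.frame(rows)
}
```

Calling this sketch needs working credentials and a billing project, so it is meant to be read rather than run unattended.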

Details

The billing project is resolved, in order, from the billing_project argument, the GCP_PROJECT environment variable, and ADC discovery. If none of these yields a project, the call fails with an explanatory error before contacting BigQuery.
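
The resolution order described here can be sketched in a few lines of base R. The helper name resolve_billing_project and its adc_lookup argument are hypothetical, introduced only for illustration; how the package actually discovers a project through ADC is not documented on this page, so that step is represented by a caller-supplied function.

```r
# Hypothetical sketch of the documented fallback order; not the package's code.
resolve_billing_project <- function(billing_project = NULL,
                                    adc_lookup = function() NULL) {
  # Candidates are tried in the documented order.
  candidates <- list(
    function() billing_project,                        # 1. explicit argument
    function() Sys.getenv("GCP_PROJECT", unset = ""),  # 2. environment variable
    adc_lookup                                         # 3. ADC discovery (placeholder)
  )

  for (candidate in candidates) {
    value <- candidate()
    is_usable <- is.character(value) && length(value) == 1L &&
      !is.na(value) && nzchar(value)
    if (is_usable) {
      return(value)
    }
  }

  # Fail before any request is sent to BigQuery.
  stop(
    "No billing project found: pass billing_project, set GCP_PROJECT, ",
    "or configure Application Default Credentials.",
    call. = FALSE
  )
}
```

With GCP_PROJECT set to "my-billing-project", resolve_billing_project() returns that value, while resolve_billing_project("other-project") returns "other-project" because the explicit argument takes precedence.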

Examples

if (FALSE) { # \dontrun{
# Requires the 'bigrquery' package, ADC, and a billing project.
Sys.setenv(GCP_PROJECT = "my-billing-project")
df <- morie_ingest_bigquery_query(
  "SELECT year, COUNT(*) AS n
     FROM `bigquery-public-data.chicago_crime.crime`
    GROUP BY year
    ORDER BY year"
)
head(df)
} # }