Generate synthetic epidemiology-style tabular data
Source: R/synthetic.R
Generates non-identifying synthetic data suitable for development, testing,
and demos. The generator uses a canonical variable set and allows output
column renaming through name_map so it can be adapted to multiple studies.
Synthetic data should not be used for final inferential reporting.
Usage
morie_generate_synthetic_data(
  n = 5000L,
  seed = 42L,
  special_code_rate = 0.02,
  profile = c("generic", "morie_legacy"),
  name_map = NULL
)
Arguments
- n
Number of rows.
- seed
Random seed for reproducibility.
- special_code_rate
Proportion of values replaced with survey-style special missing codes
(97/98/99/997/998/999) in discrete fields.
- profile
Convenience profile for output naming; ignored when name_map is supplied.
- name_map
Optional named character vector mapping canonical keys to output column names. Use
morie_default_synthetic_name_map() as a template.
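As a rough illustration of the shape name_map takes, the base R sketch below builds a named character vector (names are canonical keys, values are output column names) and applies it to a toy data frame. The keys age and sex, the output names, and the toy data are hypothetical; the real canonical keys are whatever morie_default_synthetic_name_map() returns, and the generator's internal renaming may work differently.

```r
# Hypothetical canonical keys and output names, for illustration only.
# The actual keys should be taken from morie_default_synthetic_name_map().
name_map <- c(age = "AGE_YRS", sex = "SEX_CD")

# Toy data frame whose columns use the (assumed) canonical keys.
df_toy <- data.frame(age = c(34L, 51L), sex = c(1L, 2L))

# Look up each canonical key in the map and use the mapped value as the new name.
names(df_toy) <- unname(name_map[names(df_toy)])

names(df_toy)
```

In practice one would likely start from the template vector, overwrite only the entries that differ for a given study, and pass the result as name_map.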
Examples
df <- morie_generate_synthetic_data(n = 200, seed = 1)
nrow(df)
#> [1] 200
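Because special_code_rate injects survey-style special missing codes into discrete fields, downstream code usually needs to recode them before summarising. The following is a minimal base R sketch on a hypothetical vector, not on the generator's actual output; it assumes all six codes listed under special_code_rate should be treated as missing, which would be wrong for a field where one of those values is a legitimate observation.

```r
# Hypothetical discrete column containing some special missing codes.
x <- c(1L, 2L, 97L, 3L, 999L, 2L, 98L)

# Codes listed under special_code_rate.
special_codes <- c(97L, 98L, 99L, 997L, 998L, 999L)

# Replace any special code with NA, leaving other values untouched.
x_clean <- replace(x, x %in% special_codes, NA_integer_)

# Share of values that were special codes in this toy vector.
mean(is.na(x_clean))
```

The same replace() call can be applied column by column to the discrete fields of a generated data frame.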