Stores the information necessary to simulate and visualize datasets based
on an underlying distribution Z.
Arguments
- generator
Function which generates data from the underlying base distribution. It is assumed to take the number of simulated observations n_obs as its first argument, as all random generation functions in the stats and extraDistr packages do. Furthermore, it is expected to return a two-dimensional array (matrix or data.frame). Alternatively, an R object derived from the simdata::simdesign class. See details.
- transform_initial
Function which specifies the transformation of the underlying dataset Z to the final dataset X. See details.
- n_var_final
Integer, number of columns in the final datamatrix X. Can be inferred when check_and_infer is TRUE.
- types_final
Optional vector of length equal to n_var_final (set by the user or inferred), and hence to the number of columns of the final dataset X. Allowed entries are "logical", "factor" and "numeric". Stores the type of the columns of X. If not specified by the user, it is inferred if check_and_infer is set to TRUE.
- names_final
NULL or character vector with variable names for the final dataset X. Its length needs to equal the number of columns of X. Overrides other naming options. See details.
- prefix_final
NULL or prefix attached to variables in the final dataset X. Overridden by the names_final argument. Set to NULL if no prefixes should be added. See details.
- process_final
List of lists specifying post-processing functions applied to the final datamatrix X before returning it. See do_processing.
- name
Character, optional name of the simulation design.
- check_and_infer
If TRUE, then the simulation design is tested by simulating 5 observations using simulate_data. If everything works without error, the variables n_var_final and types_final will be inferred from the results if not already set correctly by the user.
- ...
Further arguments are stored directly in the list object to be passed to simulate_data.
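As an illustration of how these arguments fit together, here is a minimal sketch. It assumes the simdata package is installed; the generator, the transformation and the variable names "exposure" and "flag" are made up for this example and are not part of the package.

```r
library(simdata)

# Base distribution Z: two independent standard normal columns.
# The generator takes the number of observations as its first argument
# and returns a matrix, as the argument description requires.
generator <- function(n_obs) {
  matrix(rnorm(n_obs * 2), nrow = n_obs, ncol = 2)
}

# Hypothetical transformation from Z to X: exponentiate the first column
# and dichotomise the second one at zero.
transform <- function(z) cbind(exp(z[, 1]), z[, 2] > 0)

design <- simdesign(
  generator = generator,
  transform_initial = transform,
  names_final = c("exposure", "flag"),
  name = "Two-variable sketch"
)

# Draw 20 observations from the design.
x <- simulate_data(design, 20, seed = 1)
head(x)
```

Because check_and_infer is left at its default here, n_var_final and types_final do not need to be supplied by hand.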
Value
List object with class attribute "simdesign" (S3 class) containing the following entries (if no further information given, entries are directly saved from user input):
- generator
- name
- transform_initial
- n_var_final
- types_final
- names_final
- process_final
- entries for further information as passed by the user
Details
The simdesign class should be used in the following workflow:
1. Specify a design template which will be used in subsequent data generating / visualization steps.
2. Sample / visualize a datamatrix following the template (possibly multiple times) using simulate_data.
3. Use the sampled datamatrix for a simulation study.
For more details on generators and transformations, please see the
documentation of simulate_data.
For details on post-processing, please see the documentation of
do_processing.
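The workflow described above might look roughly as follows. This is a toy sketch assuming simdata is installed; the number of replicates, the sample size and the estimated quantity (a column mean) are arbitrary choices made for illustration.

```r
library(simdata)

# Step 1: specify the design template once.
design <- simdesign(
  generator = function(n_obs) matrix(rnorm(n_obs), ncol = 1),
  name = "Standard normal sketch"
)

# Step 2: sample from the template several times, one seed per replicate.
n_rep <- 50
samples <- lapply(seq_len(n_rep), function(i) {
  simulate_data(design, 100, seed = i)
})

# Step 3: use the sampled data in a (toy) simulation study,
# here estimating the mean of the single variable in each replicate.
estimates <- vapply(samples, function(x) mean(x[, 1]), numeric(1))
summary(estimates)
```

Keeping the design separate from the sampling step means the same template can be reused across replicates and sample sizes without redefining the generator.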
Naming of variables
If check_and_infer is set to TRUE, the following procedure determines
the names of the variables:
1. use names_final if specified and of correct length
2. otherwise, use the names of transform_initial if present and of correct length
3. otherwise, use prefix_final to prefix the variable number if not NULL
4. otherwise, use names from the dataset as generated by the generator function
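The order of precedence in this procedure can be sketched in base R. The helper resolve_names_sketch and all of its arguments are hypothetical and only mirror the steps as described here; this is not the package's actual implementation, which may handle edge cases differently.

```r
# Hypothetical helper mirroring the naming precedence described above.
# n_var:           number of columns of the final dataset X
# names_final:     user-supplied names (highest priority)
# transform_names: names attached to transform_initial, if any
# prefix_final:    prefix combined with the variable number
# generator_names: column names as produced by the generator (fallback)
resolve_names_sketch <- function(n_var,
                                 names_final = NULL,
                                 transform_names = NULL,
                                 prefix_final = NULL,
                                 generator_names = NULL) {
  if (!is.null(names_final) && length(names_final) == n_var) {
    return(names_final)
  }
  if (!is.null(transform_names) && length(transform_names) == n_var) {
    return(transform_names)
  }
  if (!is.null(prefix_final)) {
    return(paste0(prefix_final, seq_len(n_var)))
  }
  generator_names
}

# names_final has the correct length and wins over the prefix.
res_names <- resolve_names_sketch(3, names_final = c("a", "b", "c"),
                                  prefix_final = "v")

# names_final has the wrong length, so the prefix is used instead.
res_prefix <- resolve_names_sketch(3, names_final = c("a", "b"),
                                   prefix_final = "v")

# Nothing else is given, so the generator's names are kept.
res_generator <- resolve_names_sketch(3, generator_names = c("z1", "z2", "z3"))
```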
Simulation Templates
This class is intended to be used as a template for simulation designs
which are based on specific underlying distributions. All such a template
needs to do is construct the generator function and pass it to this
function along with the other arguments. See simdesign_mvtnorm for an
example.
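A template in this sense could be sketched as a small constructor that builds a generator and hands it to simdesign. The function simdesign_exp_sketch and its arguments rate and n_var are hypothetical and serve only as an illustration; the sketch assumes simdata is installed and is not how simdesign_mvtnorm itself is implemented.

```r
library(simdata)

# Hypothetical template for designs based on independent exponential
# variables. It only constructs the generator and delegates the rest
# to simdesign().
simdesign_exp_sketch <- function(rate = 1, n_var = 2,
                                 name = "Exponential sketch", ...) {
  generator <- function(n_obs) {
    matrix(rexp(n_obs * n_var, rate = rate), nrow = n_obs, ncol = n_var)
  }
  simdesign(generator = generator, name = name, ...)
}

design_exp <- simdesign_exp_sketch(rate = 2, n_var = 3)
x_exp <- simulate_data(design_exp, 10, seed = 42)
dim(x_exp)
```

The remaining simdesign arguments, such as transform_initial or names_final, can still be supplied through the dots of the constructor.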
Examples
generator <- function(n) mvtnorm::rmvnorm(n, mean = 0)
sim_design <- simdesign(generator)
simulate_data(sim_design, 10, seed = 19)
#> v1
#> [1,] -1.1894537
#> [2,] 0.3885812
#> [3,] -0.3443333
#> [4,] -0.5478961
#> [5,] 0.9806622
#> [6,] -0.2366460
#> [7,] 0.8097397
#> [8,] -0.7447795
#> [9,] -0.2597870
#> [10,] -0.1830838