Stores the information necessary to simulate and visualize datasets based
on an underlying distribution Z.
Arguments
- generator
Function which generates data from the underlying base distribution. It is assumed that it takes the number of simulated observations n_obs as its first argument, as all random generation functions in the stats and extraDistr packages do. Furthermore, it is expected to return a two-dimensional array as output (matrix or data.frame). Alternatively, an R object derived from the simdata::simdesign class. See details.
- transform_initial
Function which specifies the transformation of the underlying dataset Z to the final dataset X. See details.
- n_var_final
Integer, number of columns in final datamatrix X. Can be inferred when check_and_infer is TRUE.
- types_final
Optional vector of length equal to n_var_final (set by the user or inferred), and hence to the number of columns of the final dataset X. Allowed entries are "logical", "factor" and "numeric". Stores the type of the columns of X. If not specified by the user, it is inferred if check_and_infer is set to TRUE.
- names_final
NULL or character vector with variable names for final dataset X. Length needs to equal the number of columns of X. Overrides other naming options. See details.
- prefix_final
NULL or prefix attached to variables in final dataset X. Overridden by the names_final argument. Set to NULL if no prefixes should be added. See details.
- process_final
List of lists specifying post-processing functions applied to final datamatrix X before returning it. See do_processing.
- name
Character, optional name of the simulation design.
- check_and_infer
If TRUE, then the simulation design is tested by simulating 5 observations using simulate_data. If everything works without error, the variables n_var_final and types_final will be inferred from the results if not already set correctly by the user.
- ...
Further arguments are directly stored in the list object to be passed to
simulate_data.
Value
List object with class attribute "simdesign" (S3 class) containing the following entries (if no further information given, entries are directly saved from user input):
- generator
- name
- transform_initial
- n_var_final
- types_final
- names_final
- process_final
- entries for further information as passed by the user
Details
The simdesign class should be used in the following workflow:
- Specify a design template which will be used in subsequent data generating / visualization steps.
- Sample / visualize datamatrix following template (possibly multiple times) using simulate_data.
- Use sampled datamatrix for simulation study.
For more details on generators and transformations, please see the
documentation of simulate_data.
For details on post-processing, please see the documentation of
do_processing.
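As a rough illustration of the workflow above, the following sketch specifies a small design and samples from it. It assumes the simdata package is installed and that transform_initial accepts a plain function of the generated matrix; the generator, the transformation and the variable names are made up for illustration only.

```r
library(simdata)

# Illustrative generator: two independent standard normal columns.
# Takes the number of observations as first argument and returns a matrix.
generator <- function(n_obs) matrix(stats::rnorm(n_obs * 2), ncol = 2)

# Step 1: specify the design template.
# The transformation keeps column 1 and exponentiates column 2 (assumed example).
design <- simdesign(
  generator = generator,
  transform_initial = function(z) cbind(z[, 1], exp(z[, 2])),
  names_final = c("x_normal", "x_lognormal"),
  name = "Toy design"
)

# Step 2: sample a datamatrix from the template (can be repeated as needed).
x <- simulate_data(design, 5, seed = 1)

# Step 3: use x in the simulation study, e.g. inspect its shape and names.
dim(x)
colnames(x)
```

The same design object can be reused for repeated calls to simulate_data, which is the point of separating the template from the sampling step.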
Naming of variables
If check_and_infer is set to TRUE, the following procedure determines
the names of the variables:
- use names_final if specified and of correct length
- otherwise, use the names of transform_initial if present and of correct length
- otherwise, use prefix_final to prefix the variable number if not NULL
- otherwise, use names from dataset as generated by the generator function
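The precedence described above can be sketched in plain R. The helper resolve_names and its arguments transform_names and generator_names are hypothetical and not part of simdata; the sketch only mirrors the documented order of checks, not the package's actual implementation.

```r
# Hypothetical helper mirroring the documented naming precedence.
# n_var: number of columns of the final dataset X.
resolve_names <- function(n_var,
                          names_final = NULL,
                          transform_names = NULL,
                          prefix_final = NULL,
                          generator_names = NULL) {
  # 1. names_final wins if specified and of correct length
  if (!is.null(names_final) && length(names_final) == n_var) {
    return(names_final)
  }
  # 2. otherwise, names attached to the transformation, if of correct length
  if (!is.null(transform_names) && length(transform_names) == n_var) {
    return(transform_names)
  }
  # 3. otherwise, prefix the variable number
  if (!is.null(prefix_final)) {
    return(paste0(prefix_final, seq_len(n_var)))
  }
  # 4. otherwise, fall back to the names produced by the generator
  generator_names
}

resolve_names(3, prefix_final = "v")
resolve_names(2, names_final = c("a", "b"), prefix_final = "v")
```

Under these assumptions, the first call yields the prefixed names v1, v2, v3, while the second returns the explicit names because names_final takes priority over the prefix.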
Simulation Templates
This class is intended to be used as a template for simulation designs
which are based on specific underlying distributions. All such a template
needs to do is define the generator function and its construction, and
pass it to this function along with the other arguments. See
simdesign_mvtnorm for an example.
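A template in this sense might look like the following sketch. The constructor simdesign_unif and its parameters are hypothetical and only illustrate the pattern (build a generator, then hand it to simdesign); this is not how simdesign_mvtnorm is necessarily implemented. It assumes the simdata package is installed.

```r
library(simdata)

# Hypothetical template for designs based on independent uniform variables.
# n_var, min and max are illustrative parameters of this sketch.
simdesign_unif <- function(n_var = 2, min = 0, max = 1, ...) {
  # Construct the generator: number of observations is the first argument,
  # and a matrix with n_var columns is returned.
  generator <- function(n_obs) {
    matrix(stats::runif(n_obs * n_var, min = min, max = max),
           nrow = n_obs, ncol = n_var)
  }
  # Pass the generator to simdesign along with any other arguments.
  simdesign(generator = generator, name = "Uniform design", ...)
}

design <- simdesign_unif(n_var = 3)
x <- simulate_data(design, 10, seed = 1)
dim(x)
```

Further arguments such as transform_initial or names_final can be forwarded through ... so that the template stays focused on the base distribution.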
Examples
generator <- function(n) mvtnorm::rmvnorm(n, mean = 0)
sim_design <- simdesign(generator)
simulate_data(sim_design, 10, seed = 19)
#> v1
#> [1,] -1.1894537
#> [2,] 0.3885812
#> [3,] -0.3443333
#> [4,] -0.5478961
#> [5,] 0.9806622
#> [6,] -0.2366460
#> [7,] 0.8097397
#> [8,] -0.7447795
#> [9,] -0.2597870
#> [10,] -0.1830838