Truncate columns of a data matrix at data-matrix-specific thresholds
Source: R/post_processing.R
process_truncate_by_iqr.Rd
Truncation based on the interquartile range to be applied to a dataset.
Arguments
- x
Matrix or data.frame.
- truncate_multipliers
Vector of truncation parameters. Either a single value, which is replicated as necessary, or a vector of length ncol(x). If any vector entry is NA, the corresponding column will not be truncated. If named, the names must correspond to column names in x, and only the specified columns will be processed. See Details.
- only_numeric
If TRUE and x is a data.frame, only columns of type numeric will be processed. Otherwise all columns will be processed (e.g. also when x is a matrix).
Details
Truncation is processed as follows:
- Compute the 1st and 3rd quartiles, q1 and q3, of each variable in x.
- Multiply these quantities by the values in truncate_multipliers to obtain the lower and upper bounds L and U. If a multiplier is NA, the corresponding variable is not truncated.
- Set any value smaller than L to L and any value larger than U to U.
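The steps above can be sketched for a single column as follows. The text does not spell out how the multiplier enters the bounds, so this sketch assumes the common IQR fence form, L = q1 - m * (q3 - q1) and U = q3 + m * (q3 - q1). Both that formula and the helper name truncate_column_by_iqr are assumptions made for illustration and may differ from what the function actually does.

```r
# Hypothetical helper, not part of the documented function's API.
# ASSUMPTION: bounds follow the usual IQR fence form
#   L = q1 - m * (q3 - q1),  U = q3 + m * (q3 - q1)
truncate_column_by_iqr <- function(v, m) {
  # An NA multiplier means: leave the column untouched
  if (is.na(m)) return(v)

  # 1st and 3rd quartile (default quantile type of R)
  q <- stats::quantile(v, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]

  lower <- q[1] - m * iqr
  upper <- q[2] + m * iqr

  # Set values below lower to lower and values above upper to upper
  pmin(pmax(v, lower), upper)
}

v <- c(1, 2, 3, 4, 100)
truncate_column_by_iqr(v, 1.5)  # q1 = 2, q3 = 4, so 100 is pulled down to 7
truncate_column_by_iqr(v, NA)   # returned unchanged
```

Under this assumption, a larger multiplier widens the interval [L, U] and truncates fewer values.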
Truncation multipliers can be specified in three ways (note that whenever only_numeric is TRUE, only numeric columns are affected):
- A single numeric value: all columns are processed in the same way.
- A numeric vector without names: the vector is assumed to be replicable to the number of columns in x, and each column is processed using the corresponding value.
- A numeric vector with names: its length may differ from the number of columns in x, and only the columns whose names occur in the vector are processed.
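One way to picture the three forms is to expand each into one multiplier per column, with NA marking columns that are skipped. The helper resolve_multipliers below is hypothetical and only illustrates that mapping. It is not the package's internal code, and it ignores only_numeric.

```r
# Hypothetical helper illustrating the three specification forms.
# The name and the returned shape are assumptions for illustration only.
resolve_multipliers <- function(x, truncate_multipliers) {
  # Start with NA everywhere: NA means "do not truncate this column"
  out <- rep(NA_real_, ncol(x))
  names(out) <- colnames(x)

  nms <- names(truncate_multipliers)
  if (is.null(nms)) {
    # Unnamed: recycle a single value or a shorter vector across all columns
    out[] <- rep_len(truncate_multipliers, ncol(x))
  } else {
    # Named: names must match column names; all other columns stay NA
    stopifnot(all(nms %in% colnames(x)))
    out[nms] <- truncate_multipliers
  }
  out
}

x <- data.frame(v1 = 1:5, v2 = 6:10, v3 = 11:15)
resolve_multipliers(x, 1.5)            # same multiplier for v1, v2, v3
resolve_multipliers(x, c(1.5, NA, 3))  # v2 is skipped
resolve_multipliers(x, c(v3 = 2))      # only v3 is processed
```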