Truncate columns of a data matrix at data-matrix-specific thresholds
Source: R/post_processing.R
process_truncate_by_iqr.Rd
Truncation based on the interquartile range to be applied to a dataset.
Arguments
- x
Matrix or data.frame.
- truncate_multipliers
Vector of truncation parameters. Either a single value, which is replicated as necessary, or a vector of length ncol(x). If any vector entry is NA, the corresponding column will not be truncated. If named, then the names must correspond to column names in x, and only the specified columns will be processed. See details.
- only_numeric
If TRUE and if x is a data.frame, then only columns of type numeric will be processed. Otherwise all columns will be processed (e.g. also in the case that x is a matrix).
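As a rough illustration of the only_numeric rule, the following sketch selects the columns that would be processed. It is not the package's code: the variable names are made up for the example, and using is.numeric (which is also TRUE for integer columns) is an assumption about how "type numeric" is checked.

```r
# Illustrative data; column "b" is not numeric.
df <- data.frame(a = c(1.5, 2.5), b = c("x", "y"), c = c(10, 20))
only_numeric <- TRUE

# Assumption: a column counts as "numeric" if is.numeric() returns TRUE.
process_cols <- if (is.data.frame(df) && only_numeric) {
  names(df)[vapply(df, is.numeric, logical(1))]
} else {
  colnames(df)  # matrices, or only_numeric = FALSE: all columns
}
process_cols
```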
Details
Truncation is processed as follows:
- Compute the 1st and 3rd quartile q1 / q3 of variables in x.
- Multiply these quantities by values in truncate_multipliers to obtain L and U. If a value is NA, the corresponding variable is not truncated.
- Set any value smaller / larger than L / U to L / U.
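The steps above can be sketched for a single variable as follows. This is a minimal sketch, not the package's implementation: truncate_by_iqr_sketch is a hypothetical helper, and it assumes the conventional IQR rule L = q1 - m * (q3 - q1) and U = q3 + m * (q3 - q1), which the description does not spell out explicitly.

```r
# Hypothetical helper, not part of the package.
# Assumption: L = q1 - m * IQR and U = q3 + m * IQR.
truncate_by_iqr_sketch <- function(v, m) {
  if (is.na(m)) return(v)  # NA multiplier: variable is not truncated
  q <- stats::quantile(v, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - m * iqr
  upper <- q[2] + m * iqr
  pmin(pmax(v, lower), upper)  # set values below L to L and above U to U
}

v <- c(1, 2, 3, 4, 100)
truncate_by_iqr_sketch(v, 1.5)  # q1 = 2, q3 = 4, so L = -1 and U = 7
truncate_by_iqr_sketch(v, NA)   # returned unchanged
```

Under these assumptions the outlier 100 is pulled down to the upper bound 7, while the remaining values are left as they are.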
Truncation multipliers can be specified in three ways (note that whenever only_numeric is set to TRUE, then only numeric columns are affected):
- A single numeric: all columns will be processed in the same way.
- A numeric vector without names: it is assumed that the length can be replicated to the number of columns in x, and each column is processed by the corresponding value in the vector.
- A numeric vector with names: the length can differ from the number of columns in x, and only the columns whose names occur in the vector are processed.
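One way to picture the three specification styles is to expand each into one multiplier per column, with NA marking columns that are skipped. The helper below is hypothetical and only illustrates that idea; in particular, recycling with rep(..., length.out = ...) is an assumption about how "replicated" is meant.

```r
# Hypothetical helper, not part of the package.
resolve_multipliers_sketch <- function(column_names, truncate_multipliers) {
  out <- rep(NA_real_, length(column_names))
  names(out) <- column_names
  if (!is.null(names(truncate_multipliers))) {
    # Named vector: only the named columns receive a multiplier.
    stopifnot(all(names(truncate_multipliers) %in% column_names))
    out[names(truncate_multipliers)] <- truncate_multipliers
  } else {
    # Single value or unnamed vector: recycle to the number of columns.
    out[] <- rep(truncate_multipliers, length.out = length(column_names))
  }
  out
}

cols <- c("a", "b", "c")
resolve_multipliers_sketch(cols, 1.5)            # same multiplier for every column
resolve_multipliers_sketch(cols, c(1.5, NA, 3))  # "b" is not truncated
resolve_multipliers_sketch(cols, c(c = 2))       # only "c" is processed
```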