Data wrangling

add operator

Description

add operator adds two sets of data points.

Usage
Input projection .
y-axis is the initial value before addition
x-axis is the value to be added to the initial value
Output relations .
sum numeric, sum value returned per cell
Details

The operator takes the value defined by the y-axis and adds the value defined by the x-axis. The computation is done per cell. There is one value calculated and returned for each data point.

See Also

substract, sum, product, ratio

asinh operator

Description

asinh operator performs an inverse hyperbolic sine on values

Usage
Input projection .
y-axis values required to be transformed by the asinh operator
row channels, and scale values (optional)
col event, for example
Input parameters .
scale numeric, the scaling factor to use before the asinh transformation, a NULL value indicates different scaling values per channel, default is 5
Output relations .
asinh numeric, output transformation per row and col.
Details

Values are scaled first and then asinh transformation is performed. One data point per cell is required as input.

asinh(value/scale)

A scale of 5 is recommended for cytof measurement and 150 for flowcyto measurements.

If a NULL scaling value is used then the scaling factor is give by the input cross-tab projection, the second factor defined in the row dimension indicates the scaling values to use per channel.

References

See the base::asinh function of the R package for more information.

Base change operator

Description

basechange operator performs a basechange calculation of a base value compared to a given data point.

Usage
Input projection .
y-axis,layer1 data point to be compared with base
y-axis,layer2 base value
Input parameters .
percentage logical, indicates if the basechange is retuend as a percentage
Output relations .
basechange numeric, basechange value, per cell
Details

basechange operator performs a basechange calculation of a base value compared to a given data point.

calcnormfactors operator

Description

calcnormfactors operator calculates a normalization factor per column (i.e sample).

Usage
Input projection .
row represents the genes
col represents the samples
y-axis is the input data for which to compute the normalization factor
Output relations .
normfactor numeric, per column (i.e. per sample)
Details

The operator is the calcnormfactor in the edgeR bioconductor.

References

see the edgeR R package for the documentation.

See Also
Examples

Downsample operator

Description

The Downsample allows one to downsample the data.

Usage
Input projection .
column numeric, input data, per cell
colour factor, groups
Output relations .
label character, pass or fail
Parameters .
seed random seed
Details

A random subset of each group is assigned a pass value: the size of the subset is the same as the smallest group. This method is typically used to handle unbalanced sample sizes between group.

impute operator

Description

impute operator replaces NA with zero.

Usage
Input projection .
y-axis is the input data for the imputation, per cell
Output relations .
impute numeric, a copy of the input data with NA replaces with zeros
Details

The operator takes all the values of a cell and replaces it with the mean and if all values are NA then it is replaced with zero. The imputation is done per cell. There is one value calculated and returned for each of the input cell.

log operator

Description

log operator performs a log of values

Usage
Input projection .
y-axis values to be logged
Input parameters .
base numeric, log base, default is 10
Output relations .
log numeric, log per data point
Details

A log of values is performed.

References

See the base::log function of the R package for the documentation.

normalyzer_operator

normalization methods from the NormalyzerDE package

orderBy operator

Description

Order data according to row- or column-wise metrics.

Usage
Input projection .
row rows to order
column columns to order
y-axis numeric, input data, per cell
Input parameters .
decreasing boolean, whether the order is ascending (default) or descending
function character, function to apply to each row/column
Output relations .
rorder numeric, row order (rank)
corder numeric, column order (rank)

product operator

Description

product operator computes the product (i.e. multiplication) of a set of points.

Usage
Input projection .
y-axis is the input data for the multiplication per cell
Input parameters .
remove NA boolean, default is FALSE
Output relations .
product numeric, product of the input values
Details

The operator takes all the values of a cell and calculates their product. The computation is done per cell. There is one value calculated and returned for each of the input cell.

References
Examples

Random label operator

Description

The Random label operator returns a random label per column.

Usage
Input projection .
column input data, per cell
Output relations .
label numeric, random number between 0 and 100
Parameters .
seed random seed
Details

The Random label operator can be used to randomly sample data. A number is sampled from a uniform distribution U(0, 100) and assigned to each column.

ratio operator

Description

ratio operator computes the ratio of two data points.

Usage
Input projection .
y-axis is the dividend of ratio
x-axis is the divisor of ratio
Output relations .
ratio numeric, ratio value returned per cell
Details

The operator takes the value defined by the y-axis and divides it by the value defined by the x-axis. The computation is done per cell. There is one value calculated and returned for each data point.

References
See Also

sum, product

Examples

Replace operator

Description

The replace operator performs the replacement of the first or all matches of a string.

Usage
Input projection .
row character, strings to be replaced
Properties .
type character, replace the first or all matches (default: all)
what character, pattern to be matched (regular expression)
by character, a replacement for matched pattern
Output relations .
string character, string with replaced matches
Details

The operator is based on the R functions gsub and sub.

Round operator

Description

Round operator implements different number rounding methods.

Usage
Input projection .
y-axis numeric, input data, per cell
Output relations .
rounded numeric, rounded input data

Separate operator

Description

Separate a character column into multiple columns with a regular expression or numeric locations.

Usage
Input projection .
row single character column
Output relations .
col1 numeric or character
col2 numeric or character
etc.. multiple columns
Details

Given a regular expression , Separate turns a single character column into multiple columns.

References

This operator is a wrapper of the strsplit R function.

See Also

replace_operator

subtract operator

Description

subtract operator computes the subtract of two data points.

Usage
Input projection .
y-axis is the initial value before subtraction
x-axis is the value to be subtracted from the initial value
Output relations .
difference numeric, difference value returned per cell
Details

The operator takes the value defined by the y-axis and subtracts it by the value defined by the x-axis. The computation is done per cell. There is one value calculated and returned for each data point.

See Also

sum, product, ratio

sum operator

Description

sum operator computes the sum of a set of data points.

Usage
Input projection .
y-axis is the input data for the sum, per cell
Output relations .
sum numeric, sum of the input values
Details

The operator takes all the values of a cell and calculates their sum. The computation is done per cell. There is one value calculated and returned for each of the input cell.

References
Examples

vsn operator

Description

vsn operator performs a normalization factor per column (i.e. sample).

Usage
Input projection .
row represents the genes
col represents the samples
y-axis is the input data for which to normalization
Output relations .
norm numeric, the normalized values of the measurements (i.e. of each sample)
Details

The operator is the justvsn function from the VSN bioconductor package.

References

see the vsn R package for the documentation.

See Also
Examples