Data wrangling
add operator
Description
add
operator adds two sets of data points.
Usage
Input projection | . |
---|---|
y-axis |
is the initial value before addition |
x-axis |
is the value to be added to the initial value |
Output relations | . |
---|---|
sum |
numeric, sum value returned per cell |
Details
The operator takes the value defined by the y-axis and adds the value defined by the x-axis. The computation is done per cell. There is one value calculated and returned for each data point.
GitHub link
asinh operator
Description
asinh
operator performs an inverse hyperbolic sine on values
Usage
Input projection | . |
---|---|
y-axis |
values required to be transformed by the asinh operator |
row |
channels, and scale values (optional) |
col |
event, for example |
Input parameters | . |
---|---|
scale |
numeric, the scaling factor to use before the asinh transformation, a NULL value indicates different scaling values per channel, default is 5 |
Output relations | . |
---|---|
asinh |
numeric, output transformation per row and col . |
Details
Values are scaled first and then asinh transformation is performed. One data point per cell is required as input.
A scale of 5 is recommended for cytof measurement and 150 for flowcyto measurements.
If a NULL scaling value is used then the scaling factor is give by the input cross-tab projection, the second factor defined in the row
dimension indicates the scaling values to use per channel.
References
See the base::asinh
function of the R package for more information.
GitHub link
Base change operator
Description
basechange
operator performs a basechange calculation of a base value compared to a given data point.
Usage
Input projection | . |
---|---|
y-axis ,layer1 |
data point to be compared with base |
y-axis ,layer2 |
base value |
Input parameters | . |
---|---|
percentage |
logical, indicates if the basechange is retuend as a percentage |
Output relations | . |
---|---|
basechange |
numeric, basechange value, per cell |
Details
basechange
operator performs a basechange calculation of a base value compared to a given data point.
GitHub link
calcnormfactors operator
Description
calcnormfactors
operator calculates a normalization factor per column (i.e sample).
Usage
Input projection | . |
---|---|
row |
represents the genes |
col |
represents the samples |
y-axis |
is the input data for which to compute the normalization factor |
Output relations | . |
---|---|
normfactor |
numeric, per column (i.e. per sample) |
Details
The operator is the calcnormfactor in the edgeR bioconductor.
References
see the edgeR
R package for the documentation.
See Also
Examples
GitHub link
Downsample operator
Description
The Downsample
allows one to downsample the data.
Usage
Input projection | . |
---|---|
column |
numeric, input data, per cell |
colour |
factor, groups |
Output relations | . |
---|---|
label |
character, pass or fail |
Parameters | . |
---|---|
seed |
random seed |
Details
A random subset of each group is assigned a pass
value: the size of the subset is the same as the smallest group. This method is typically used to handle unbalanced sample sizes between group.
GitHub link
impute operator
Description
impute
operator replaces NA with zero.
Usage
Input projection | . |
---|---|
y-axis |
is the input data for the imputation, per cell |
Output relations | . |
---|---|
impute |
numeric, a copy of the input data with NA replaces with zeros |
Details
The operator takes all the values of a cell and replaces it with the mean and if all values are NA then it is replaced with zero. The imputation is done per cell. There is one value calculated and returned for each of the input cell.
GitHub link
log operator
Description
log
operator performs a log of values
Usage
Input projection | . |
---|---|
y-axis |
values to be logged |
Input parameters | . |
---|---|
base |
numeric, log base, default is 10 |
Output relations | . |
---|---|
log |
numeric, log per data point |
Details
A log of values is performed.
References
See the base::log
function of the R package for the documentation.
GitHub link
normalyzer_operator
normalization methods from the NormalyzerDE package
GitHub link
orderBy operator
Description
Order data according to row- or column-wise metrics.
Usage
Input projection | . |
---|---|
row |
rows to order |
column |
columns to order |
y-axis |
numeric, input data, per cell |
Input parameters | . |
---|---|
decreasing |
boolean, whether the order is ascending (default) or descending |
function |
character, function to apply to each row/column |
Output relations | . |
---|---|
rorder |
numeric, row order (rank) |
corder |
numeric, column order (rank) |
GitHub link
product operator
Description
product
operator computes the product (i.e. multiplication) of a set of points.
Usage
Input projection | . |
---|---|
y-axis |
is the input data for the multiplication per cell |
Input parameters | . |
---|---|
remove NA |
boolean, default is FALSE |
Output relations | . |
---|---|
product |
numeric, product of the input values |
Details
The operator takes all the values of a cell and calculates their product. The computation is done per cell. There is one value calculated and returned for each of the input cell.
References
See Also
Examples
GitHub link
Random label operator
Description
The Random label
operator returns a random label per column.
Usage
Input projection | . |
---|---|
column |
input data, per cell |
Output relations | . |
---|---|
label |
numeric, random number between 0 and 100 |
Parameters | . |
---|---|
seed |
random seed |
Details
The Random label
operator can be used to randomly sample data. A number is sampled from a uniform distribution U(0, 100) and assigned to each column.
See Also
GitHub link
ratio operator
Description
ratio
operator computes the ratio of two data points.
Usage
Input projection | . |
---|---|
y-axis |
is the dividend of ratio |
x-axis |
is the divisor of ratio |
Output relations | . |
---|---|
ratio |
numeric, ratio value returned per cell |
Details
The operator takes the value defined by the y-axis and divides it by the value defined by the x-axis. The computation is done per cell. There is one value calculated and returned for each data point.
References
Examples
GitHub link
Replace operator
Description
The replace
operator performs the replacement of the first or all matches of a string.
Usage
Input projection | . |
---|---|
row |
character, strings to be replaced |
Properties | . |
---|---|
type |
character, replace the first or all matches (default: all) |
what |
character, pattern to be matched (regular expression) |
by |
character, a replacement for matched pattern |
Output relations | . |
---|---|
string |
character, string with replaced matches |
Details
The operator is based on the R functions gsub
and sub
.
See Also
GitHub link
Round operator
Description
Round
operator implements different number rounding methods.
Usage
Input projection | . |
---|---|
y-axis |
numeric, input data, per cell |
Output relations | . |
---|---|
rounded |
numeric, rounded input data |
GitHub link
Separate operator
Description
Separate
a character column into multiple columns with a regular expression or numeric locations.
Usage
Input projection | . |
---|---|
row |
single character column |
Output relations | . |
---|---|
col1 |
numeric or character |
col2 |
numeric or character |
etc.. |
multiple columns |
Details
Given a regular expression , Separate
turns a single character column into multiple columns.
References
This operator is a wrapper of the strsplit R function.
See Also
GitHub link
subtract operator
Description
subtract
operator computes the subtract of two data points.
Usage
Input projection | . |
---|---|
y-axis |
is the initial value before subtraction |
x-axis |
is the value to be subtracted from the initial value |
Output relations | . |
---|---|
difference |
numeric, difference value returned per cell |
Details
The operator takes the value defined by the y-axis and subtracts it by the value defined by the x-axis. The computation is done per cell. There is one value calculated and returned for each data point.
GitHub link
sum operator
Description
sum
operator computes the sum of a set of data points.
Usage
Input projection | . |
---|---|
y-axis |
is the input data for the sum, per cell |
Output relations | . |
---|---|
sum |
numeric, sum of the input values |
Details
The operator takes all the values of a cell and calculates their sum. The computation is done per cell. There is one value calculated and returned for each of the input cell.
References
See Also
Examples
GitHub link
vsn operator
Description
vsn
operator performs a normalization factor per column (i.e. sample).
Usage
Input projection | . |
---|---|
row |
represents the genes |
col |
represents the samples |
y-axis |
is the input data for which to normalization |
Output relations | . |
---|---|
norm |
numeric, the normalized values of the measurements (i.e. of each sample) |
Details
The operator is the justvsn
function from the VSN bioconductor package.
References
see the vsn
R package for the documentation.