Dimensionality reduction and clustering

Approximated t-SNE

Description
https://github.com/tercen/atsne_docker_operator.git
Build
VERSION=1.1.11
docker build -t tercen/atsne:$VERSION .
docker push tercen/atsne:$VERSION
# see operator.json -- "container": "tercen/atsne:1.1.8"
git add -A && git commit -m "$VERSION" && git tag -a $VERSION -m "++" && git push && git push --tags
# renv cache ~/.local/share/renv
docker run -it --rm --entrypoint "/bin/bash" tercen/atsne:1.1.11
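
The build notes point at the "container" entry in operator.json, and the tag quoted there (1.1.8) differs from the VERSION being built (1.1.11). A minimal sketch of a pre-push check follows. It assumes operator.json is a JSON document with a top-level "container" key, as the comment suggests; the helper name container_tag_matches is hypothetical and not part of the repository.

```python
import json


def container_tag_matches(operator_json_text: str, image: str, version: str) -> bool:
    """Return True when the "container" field equals image:version."""
    config = json.loads(operator_json_text)
    return config.get("container") == f"{image}:{version}"


# Hypothetical operator.json content, reduced to the single field
# mentioned in the build notes.
example = '{"container": "tercen/atsne:1.1.8"}'

# The tag in operator.json lags behind the version being built.
print(container_tag_matches(example, "tercen/atsne", "1.1.11"))  # False
print(container_tag_matches(example, "tercen/atsne", "1.1.8"))   # True
```

Running such a check before tagging would flag a stale container reference, assuming the two versions are meant to stay in sync.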

Clustering metrics operator

Description

clustering_metrics operator returns clustering metrics.

Usage
Input projection .
row represents the variables
col represents the observations
label represents the clusters
y-axis is the value of the measurement
Output relations .
metrics character, name of the clustering metric
value numeric, value of the clustering metric
Details
References

This operator is based on the clusterCrit R package.

Hierarchical clustering tree operator

Description

clustering_tree operator returns a hierarchical clustering tree to be projected in Tercen.

Usage
Input projection .
row factor, variables to cluster
col factor, variables to cluster (dist_to variable from a dist operator)
y-axis numeric, pairwise distance (dist variable from a dist operator)
Output relations .
presence numeric, to be projected on y-axis
tree_dim1 factor, to be projected on rows
tree_dim2 factor, to be projected on columns
tip_labels factor, leaf labels, to be projected on rows

clusterx operator

Description

clusterx operator performs a fast clustering by automatic search and find of density peaks

Usage
Input projection .
row represents the variables (e.g. channels, markers)
col represents the observations (e.g. cells, samples, individuals)
x-axis first axis
y-axis second axis
Input parameters .
dimReduction type of reduction to perform, pca, tsne, NULL, default is NULL
outDim number of dimensions to return, default 2
Output relations .
cluster character, returns a cluster id per value, per cell
Details

clusterx operator performs a fast clustering by automatic search and find of density peaks.

See Also

clusterx

Examples

clusterx operator

Description

clusterx operator performs a fast clustering by automatic search and find of density peaks

Usage
Input projection .
row represents the variables (e.g. channels, markers)
col represents the observations (e.g. cells, samples, individuals)
y-axis is the measurement value
Input parameters .
dimReduction type of reduction to perform, pca, tsne, NULL, default is NULL
outDim number of dimensions to return, default 2
Output relations .
cluster character, returns a cluster id per column (e.g. per cell)
Details

clusterx operator performs a fast clustering by automatic search and find of density peaks.

See Also

clusterx

Examples

Fast t-SNE Docker Operator

Build the image
VERSION=0.0.1
docker build -t tercen/fast_tSNE_docker_operator:$VERSION .
docker push tercen/fast_tSNE_docker_operator:$VERSION
git add -A && git commit -m "$VERSION" && git tag $VERSION && git push && git push --tags

Fast t-SNE operator

Description

The Fast t-SNE operator performs the Fast Fourier Transform Interpolation-based t-SNE dimensionality reduction method.

Usage
Input projection .
row represents the variables (e.g. genes, channels, markers)
col represents the observations (e.g. cells, samples, individuals)
y-axis measurement value
Output relations .
tsne1, tsne2 first two components containing the new projected values
Details

The operator performs t-SNE analysis. It reduces the number of variables (indicated by rows) to a lower number of dimensions (default 2).

See Also

tsne, pca

hclust operator

Description

hclust operator performs a hierarchical clustering.

Usage
Input projection .
row represents the row data
col represents the col data
y-axis is the value of measurement
Input parameters .
scale boolean, whether the data are scaled to unit variance before the analysis takes place
center boolean, whether the data are shifted to be zero-centered before the analysis takes place
fill numeric, fill-in value for structurally missing data points
Output relations .
rorder numeric, order of rows after clustering
corder numeric, order of cols after clustering
Details

The operator wraps the hclust function of the R stats package.
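
The scale, center and fill parameters describe a preprocessing step applied before clustering. The sketch below illustrates one plausible reading of that step in Python. It is not the operator's code: the function name preprocess_row is hypothetical, and it is an assumption that the transformation is applied per vector of values and that scaling uses the sample standard deviation.

```python
from statistics import mean, stdev


def preprocess_row(values, center=True, scale=True, fill=0.0):
    """Fill missing entries, then optionally center and scale one vector."""
    # Missing entries are represented as None and replaced by `fill`.
    filled = [fill if v is None else float(v) for v in values]
    if center:
        m = mean(filled)
        filled = [v - m for v in filled]
    if scale:
        # Sample standard deviation (n - 1 denominator); an assumption here.
        s = stdev(filled)
        if s > 0:
            filled = [v / s for v in filled]
    return filled


row = [1.0, None, 3.0, 5.0]
prepared = preprocess_row(row, fill=3.0)
print(prepared)
```

After centering and scaling, the vector has mean 0 and standard deviation 1, so variables measured on different scales contribute comparably to the distances used for clustering.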

See Also
Examples

MDS operator

Description

MDS operator performs a Multidimensional Scaling analysis.

Usage
Input projection .
y-axis numeric, distance measure
col character, dist_to variable obtained from a pairwise_distance operator
row character, variables
Output relations .
mds_1 numeric, first dimension
mds_2 numeric, second dimension
Details

The operator takes as input a pairwise distance matrix as obtained with the pairwise_distance_operator.
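
The input projection delivers the distances in long form (one row label, one column label and one distance per record), while a scaling routine works on a square matrix. The following Python sketch shows how such records could be pivoted into a symmetric matrix. The function name to_distance_matrix and the example labels are hypothetical, and symmetry of the distances is an assumption.

```python
def to_distance_matrix(records):
    """Pivot a list of (row_label, col_label, distance) into a square matrix."""
    labels = sorted({r for r, _, _ in records} | {c for _, c, _ in records})
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    # The diagonal stays at 0.0: the distance of an item to itself.
    matrix = [[0.0] * n for _ in range(n)]
    for r, c, d in records:
        i, j = index[r], index[c]
        matrix[i][j] = d
        matrix[j][i] = d  # assumes the distance is symmetric
    return labels, matrix


records = [("a", "b", 1.0), ("a", "c", 2.0), ("b", "c", 1.5)]
labels, matrix = to_distance_matrix(records)
for label, row in zip(labels, matrix):
    print(label, row)
```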

References

This operator is a wrapper of the cmdscale R function.

flowsom operator

Description

flowsom operator performs SOM (self-organizing map) clustering using the FlowSOM R package.

Usage
Input projection .
row represents the variables (e.g. channels, markers)
col represents the observations (e.g. cells)
y-axis is the value of measurement signal of the channel/marker
Input parameters .
xdim Width of the grid
ydim Height of the grid
rlen Number of times to loop over the training data for each MST
mst Number of times to build an MST
alpha_start Start learning rate
alpha_end End learning rate
dstf Distance function (1=manhattan, 2=euclidean, 3=chebyshev, 4=cosine)
Output relations .
mapping_node_num numeric, per column (e.g. per cell)
mapping_node_label character, per column (e.g. per cell)
Details

The operator wraps the SOM function of the FlowSOM R package.
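
The dstf parameter selects a distance function by integer code. As an illustration of what the four codes stand for, the Python sketch below gives the textbook definitions of each distance. It is not the FlowSOM implementation, the lookup table DSTF is a hypothetical name, and defining cosine distance as one minus the cosine similarity is an assumption.

```python
import math


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumed definition: 1 - cosine similarity.
    return 1.0 - dot / (norm_a * norm_b)


# Codes as listed for the dstf parameter.
DSTF = {1: manhattan, 2: euclidean, 3: chebyshev, 4: cosine}

p, q = [0.0, 0.0], [3.0, 4.0]
for code in (1, 2, 3):
    print(code, DSTF[code](p, q))
```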

References

See the FlowSOM::SOM function of the R package for the documentation.

See Also

clusterx

Examples

pca operator

Description

pca operator performs principal component analysis.

Usage
Input projection .
row represents the variables (e.g. genes, channels, markers)
col represents the observations (e.g. cells, samples, individuals)
y-axis measurement value
Input parameters .
scale logical, indicating whether the variables should be scaled to have unit variance before the analysis takes place
center logical, indicating whether the variables should be shifted to be zero centered before the analysis takes place
na.action A function which indicates what should happen when the data contain NAs
tol numeric, indicating the magnitude below which components should be omitted. Components are omitted if their standard deviations are less than or equal to tol times the standard deviation of the first component
maxComp numeric, maximum number of components to return, default 5
Output relations .
pca1, pca2, pca3, pca4, pca5 first five components containing the new projected values
Details

The operator performs principal component analysis. It reduces the number of variables (indicated by rows) to a lower number of components (default 5).
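
The tol and maxComp parameters both limit how many components are returned. The Python sketch below illustrates the rule stated for tol (a component is dropped when its standard deviation is at most tol times that of the first component) followed by the maxComp cap. The function name select_components is hypothetical, and applying tol before maxComp is an assumption.

```python
def select_components(sdev, tol=None, max_comp=5):
    """Return the standard deviations of the components that are kept.

    sdev is expected in decreasing order, first component first.
    """
    kept = list(sdev)
    if tol is not None and kept:
        threshold = tol * kept[0]
        # Omit components whose sd is less than or equal to the threshold.
        kept = [s for s in kept if s > threshold]
    return kept[:max_comp]


sdev = [4.0, 2.0, 1.0, 0.5]
print(select_components(sdev, tol=0.25))   # [4.0, 2.0]
print(select_components(sdev, max_comp=3)) # [4.0, 2.0, 1.0]
```

In the first call the threshold is 0.25 * 4.0 = 1.0, so the third component (exactly 1.0) is dropped along with the fourth.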

Reference
See Also

tsne

Examples

rphenograph operator

Description

rphenograph operator performs phenotype clustering using the Rphenograph R package.

Usage
Input projection .
row represents the variables (e.g. channels, markers)
col represents the observations (e.g. cells)
y-axis is the value of measurement signal of the channel/marker
Input parameters .
xdim Width of the grid
ydim Height of the grid
rlen Number of times to loop over the training data for each MST
mst Number of times to build an MST
alpha_start Start learning rate
alpha_end End learning rate
dstf Distance function (1=manhattan, 2=euclidean, 3=chebyshev, 4=cosine)
Output relations .
mapping_node_num numeric, per column (e.g. per cell)
mapping_node_label character, per column (e.g. per cell)
Details

The operator wraps the Rphenograph function of the Rphenograph R package.

References

See the Rphenograph::Rphenograph function of the R package for the documentation.

See Also
Examples

somflow operator

Description

somflow operator performs the SOM (self organizing maps) in the FlowSOM R package.

Usage
Input projection .
row represents the variables (e.g. channels, markers)
col represents the observations (e.g. cells)
y-axis is the value of measurement signal of the channel/marker
Input parameters .
xdim Width of the grid
ydim Height of the grid
rlen Number of times to loop over the training data for each MST
mst Number of times to build an MST
alpha_start Start learning rate
alpha_end End learning rate
dstf Distance function (1=manhattan, 2=euclidean, 3=chebyshev, 4=cosine)
Output relations .
mapping_node_label character, per column (e.g. per cell)
Details

The operator wraps the SOM function of the FlowSOM R package.

References

See the FlowSOM::SOM function of the R package for the documentation.

See Also

clusterx

Examples

tsne operator

Description

tsne operator performs tSNE analysis.

Usage
Input projection .
row represents the variables (e.g. genes, channels, markers)
col represents the observations (e.g. cells, samples, individuals)
y-axis measurement value
Input parameters .
dims numeric, output dimensionality, default 2
initial_dims numeric, the number of dimensions that should be retained in the initial PCA step, default 50
perplexity numeric, perplexity parameter, default is 30
theta numeric, speed/accuracy trade-off (increase for less accuracy), set to 0.0 for exact TSNE, default 0.05
pca logical, whether an initial PCA step should be performed, default TRUE
max_iter numeric, number of iterations, default 1000
pca_center logical, whether the data should be centered before PCA is applied
pca_scale logical, whether the data should be scaled before PCA is applied
stop_lying_iter numeric, Iteration after which the perplexities are no longer exaggerated
mom_switch_iter numeric, Iteration after which the final momentum is used
Output relations .
tsne1, tsne2 first two components containing the new projected values
Details

The operator performs tSNE analysis. It reduces the number of variables (indicated by rows) to a lower number of dimensions (default 2).
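
The perplexity parameter is often read as an effective number of neighbours per observation. In the standard t-SNE formulation, the perplexity of a probability distribution is 2 raised to its Shannon entropy in bits. The Python sketch below illustrates that definition only; it is not taken from the operator's code, and the function name is chosen for illustration.

```python
import math


def perplexity(probabilities):
    """Perplexity of a discrete distribution: 2 ** (Shannon entropy in bits)."""
    entropy = -sum(p * math.log2(p) for p in probabilities if p > 0)
    return 2.0 ** entropy


# A uniform distribution over k neighbours has perplexity k.
uniform = [0.25, 0.25, 0.25, 0.25]
print(perplexity(uniform))

# A distribution concentrated on few neighbours has a lower perplexity.
skewed = [0.7, 0.1, 0.1, 0.1]
print(perplexity(skewed))
```

A perplexity of 30 therefore corresponds roughly to each observation weighting about 30 neighbours when its similarities are computed.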

Reference
See Also

pca

Examples