Dimensionality reduction and clustering

Approximated t-SNE

docker build -t tercen/atsne:$VERSION .
docker push tercen/atsne:$VERSION
# see operator.json -- "container": "tercen/atsne:1.1.8"
git add -A && git commit -m "$VERSION" && git tag -a $VERSION -m "++" && git push && git push --tags
# renv cache ~/.local/share/renv
docker run -it --rm --entrypoint "/bin/bash" tercen/atsne:1.1.11

Clustering metrics operator


clustering_metrics operator returns clustering metrics.

Input projection .
row represents the variables
col represents the observations
label represents the clusters
y-axis is the value of the measurement
Output relations .
metrics character, name of the clustering metric
value numeric, value of the clustering metric

This operator is based on the clusterCrit R package.
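As an illustration of what one clustering metric looks like, here is a minimal Python sketch of the Calinski-Harabasz index, one of the internal criteria that clusterCrit provides. It is not the operator's code: the function names and the toy data are hypothetical, and the result is shaped like the metrics/value output relation described above.

```python
from collections import defaultdict

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def calinski_harabasz(observations, labels):
    """Between-cluster over within-cluster dispersion, each divided by its degrees of freedom."""
    groups = defaultdict(list)
    for obs, lab in zip(observations, labels):
        groups[lab].append(obs)
    n, k = len(observations), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more observations than clusters")
    overall = centroid(observations)
    between = sum(len(g) * sq_dist(centroid(g), overall) for g in groups.values())
    within = sum(sq_dist(p, centroid(g)) for g in groups.values() for p in g)
    if within == 0:
        return float("inf")  # all points coincide with their cluster centroid
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical toy data: four observations (cols), two variables (rows), two clusters (label).
observations = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
labels = ["c1", "c1", "c2", "c2"]
result = [{"metrics": "calinski_harabasz", "value": calinski_harabasz(observations, labels)}]
print(result)
```

Higher values indicate compact, well-separated clusters.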

Hierarchical clustering tree operator


clustering_tree operator returns a hierarchical clustering tree to be projected in Tercen.

Input projection .
row factor, variables to cluster
col factor, variables to cluster (dist_to variable from a dist operator)
y-axis numeric, pairwise distance (dist variable from a dist operator)
Output relations .
presence numeric, to be projected on y-axis
tree_dim1 factor, to be projected on rows
tree_dim2 factor, to be projected on columns
tip_labels factor, leaf labels, to be projected on rows
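The input layout above (one record per pair of items, with its distance) can be turned into a tree by agglomerative clustering. The following Python sketch shows the idea under stated assumptions: it uses single linkage, which may differ from the linkage the operator uses, it expects a distance for every pair of items, and the function name and records are hypothetical.

```python
from itertools import combinations

def single_linkage(long_table):
    """Agglomerate items from (row, dist_to, dist) records; returns the list of merges."""
    dist = {}
    for row, dist_to, d in long_table:
        if row != dist_to:
            dist[frozenset((row, dist_to))] = d
    # Every item starts as its own cluster, stored as a sorted tuple of item names.
    clusters = sorted((name,) for name in {n for pair in dist for n in pair})
    merges = []
    while len(clusters) > 1:
        # Single linkage: the distance between two clusters is the smallest
        # distance between any item of one and any item of the other.
        height, a, b = min(
            (min(dist[frozenset((x, y))] for x in a for y in b), a, b)
            for a, b in combinations(clusters, 2)
        )
        merged = tuple(sorted(a + b))
        clusters = sorted([c for c in clusters if c not in (a, b)] + [merged])
        merges.append((a, b, height))
    return merges

# Hypothetical pairwise distances between three variables.
records = [("A", "B", 1.0), ("A", "C", 4.0), ("B", "C", 3.0)]
merges = single_linkage(records)
for left, right, height in merges:
    print(left, "+", right, "at height", height)
```

Each merge corresponds to one internal node of the tree, and the merge height gives its position along the distance axis.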

clusterx operator


clusterx operator performs fast clustering by automatically searching for and finding density peaks.

Input projection .
row represents the variables (e.g. channels, markers)
col represents the observations (e.g. cells, samples, individuals)
x-axis first axis
y-axis second axis
Input parameters .
dimReduction type of reduction to perform, pca, tsne, NULL, default is NULL
outDim number of dimensions to return, default 2
Output relations .
cluster character, returns a cluster id per value, per cell
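To make the idea of density-peak clustering concrete, here is a small Python sketch. It is a simplification and not the operator's implementation: the cutoff distance dc and the number of clusters are passed in by hand (the operator is described as finding the peaks automatically), and all names and data are hypothetical.

```python
import math

def density_peaks(points, dc, n_clusters):
    """Assign a cluster id to each point using local density and distance to denser points."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff dc.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < dc) for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank_of = {i: r for r, i in enumerate(order)}
    delta = [0.0] * n   # distance to the nearest point that ranks as denser
    parent = [-1] * n   # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(d[i])  # densest point: use its largest distance
            continue
        j = min(order[:rank], key=lambda h: d[i][h])
        delta[i], parent[i] = d[i][j], j
    # Peaks combine high density with a large distance to any denser point.
    peaks = sorted(range(n), key=lambda i: (-rho[i] * delta[i], rank_of[i]))[:n_clusters]
    cluster = [None] * n
    for cid, i in enumerate(peaks):
        cluster[i] = "c%d" % (cid + 1)
    # Remaining points inherit the cluster of their nearest denser neighbour.
    for i in order:
        if cluster[i] is None:
            cluster[i] = cluster[parent[i]]
    return cluster

# Hypothetical data: two well-separated groups of four observations each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
cluster_ids = density_peaks(points, dc=1.5, n_clusters=2)
print(cluster_ids)
```

The result has one cluster id per observation, matching the shape of the cluster output relation.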



clusterx operator


clusterx operator performs fast clustering by automatically searching for and finding density peaks.

Input projection .
row represents the variables (e.g. channels, markers)
col represents the observations (e.g. cells, samples, individuals)
y-axis is the measurement value
Input parameters .
dimReduction type of reduction to perform, pca, tsne, NULL, default is NULL
outDim number of dimensions to return, default 2
Output relations .
cluster character, returns a cluster id per column (e.g. per cell)



Fast t-SNE Docker Operator

Build the image
docker build -t tercen/fast_tSNE_docker_operator:$VERSION .
docker push tercen/fast_tSNE_docker_operator:$VERSION
git add -A && git commit -m "$VERSION" && git tag $VERSION && git push && git push --tags

Fast t-SNE operator


The Fast t-SNE operator performs the Fast Fourier Transform Interpolation-based t-SNE dimensionality reduction method.

Input projection .
row represents the variables (e.g. genes, channels, markers)
col represents the observations (e.g. cells, samples, individuals)
y-axis measurement value
Output relations .
tsne1, tsne2 first two components containing the new projected values

The operator performs t-SNE analysis. It reduces the number of variables (i.e. indicated by rows) to a lower number (default 2).

See Also

tsne, pca

hclust operator


hclust operator performs a hierarchical clustering.

Input projection .
row represents the row data
col represents the col data
y-axis is the value of measurement
Input parameters .
scale boolean, whether the data are scaled to unit variance before the analysis takes place
center boolean, whether the data are shifted to be zero-centered before the analysis takes place
fill numeric, fill-in value for structurally missing data points
Output relations .
rorder numeric, order of rows after clustering
corder numeric, order of cols after clustering

The operator wraps the hclust function of the R stats package.
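The effect of the fill, center and scale parameters can be illustrated with a short Python sketch. It is only an approximation of the preprocessing under stated assumptions: missing values are represented as None, the transformation is applied to one vector at a time, and the function name is hypothetical. Whether the operator applies it per row or per column is not specified here.

```python
import math

def preprocess(values, center=True, scale=True, fill=0.0):
    """Fill missing entries, then optionally center and scale one vector of measurements."""
    # Structurally missing data points (None) are replaced by the fill value.
    out = [fill if v is None else v for v in values]
    n = len(out)
    if center:
        mean = sum(out) / n
        out = [v - mean for v in out]
    if scale and n > 1:
        # Divide by the sample standard deviation so the result has unit variance.
        mean = sum(out) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in out) / (n - 1))
        if sd > 0:
            out = [v / sd for v in out]
    return out

# Hypothetical vector with one missing value, filled with 2.0.
prepared = preprocess([1.0, None, 3.0], fill=2.0)
print(prepared)
```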

MDS operator


MDS operator performs a Multidimensional Scaling analysis.

Input projection .
y-axis numeric, distance measure
col character, dist_to variable obtained from a pairwise_distance operator
row character, variables
Output relations .
mds_1 numeric, first dimension
mds_2 numeric, second dimension

The operator takes as input a pairwise distance matrix as obtained with the pairwise_distance operator.


This operator is a wrapper of the cmdscale R function.
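The cmdscale function performs classical multidimensional scaling. As a rough illustration of that technique, the Python sketch below double-centers the squared distances and extracts the leading eigenvectors by power iteration. This is a teaching sketch, not what cmdscale does internally; the function names and the example distances are hypothetical, and the distance matrix is assumed to be symmetric.

```python
import math

def _orthonormal(v, basis):
    """Remove the components of v along the basis vectors, then normalise; None if v vanishes."""
    for b in basis:
        proj = sum(x * y for x, y in zip(v, b))
        v = [x - proj * y for x, y in zip(v, b)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 1e-12 else None

def classical_mds(d, k=2, iters=500):
    """Return k coordinates per item from a symmetric matrix of pairwise distances."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J (row means equal column means by symmetry).
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    vectors, coords = [], [[] for _ in range(n)]
    for c in range(k):
        v = _orthonormal([math.sin(i + 1 + c) for i in range(n)], vectors)
        if v is None:
            break
        # Power iteration, kept orthogonal to the eigenvectors already found.
        for _ in range(iters):
            w = _orthonormal([sum(b[i][j] * v[j] for j in range(n)) for i in range(n)], vectors)
            if w is None:
                break
            v = w
        eigenvalue = sum(v[i] * b[i][j] * v[j] for i in range(n) for j in range(n))
        vectors.append(v)
        for i in range(n):
            coords[i].append(math.sqrt(max(eigenvalue, 0.0)) * v[i])
    return coords

# Hypothetical distances between three items forming a 3-4-5 right triangle.
distances = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
coords = classical_mds(distances)
for mds_1, mds_2 in coords:
    print(round(mds_1, 3), round(mds_2, 3))
```

The recovered coordinates are only defined up to rotation and reflection, but the distances between them reproduce the input distances.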

flowsom operator


flowsom operator builds a SOM (self-organizing map) using the FlowSOM R package.

Input projection .
row represents the variables (e.g. channels, markers)
col represents the observations (e.g. cells)
y-axis is the value of measurement signal of the channel/marker
Input parameters .
xdim Width of the grid
ydim Height of the grid
rlen Number of times to loop over the training data for each MST
mst Number of times to build an MST
alpha_start Start learning rate
alpha_end End learning rate
dstf Distance function (1=manhattan, 2=euclidean, 3=chebyshev, 4=cosine)
Output relations .
mapping_node_num numeric, per column (e.g. per cell)
mapping_node_label character, per column (e.g. per cell)

The operator wraps the SOM function of the FlowSOM R package.


See the documentation of the FlowSOM::SOM function for details.
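To show how parameters such as xdim, ydim, rlen, alpha_start and alpha_end interact, here is a simplified self-organizing map in Python. It is a sketch under stated assumptions, not the FlowSOM algorithm: only Euclidean distance is used, the neighbourhood radius shrinks linearly, and the function names, seed and data are hypothetical.

```python
import math
import random

def train_som(data, xdim=2, ydim=2, rlen=10, alpha_start=0.05, alpha_end=0.01, seed=1):
    """Train a small SOM and return one code vector per grid node."""
    rng = random.Random(seed)
    grid = [(x, y) for y in range(ydim) for x in range(xdim)]
    # Initialise every node with a randomly chosen observation.
    codes = [list(rng.choice(data)) for _ in grid]
    total = rlen * len(data)
    radius_start = max(xdim, ydim) / 2.0
    step = 0
    for _ in range(rlen):  # rlen passes over the training data
        for obs in rng.sample(data, len(data)):
            frac = step / total
            alpha = alpha_start + (alpha_end - alpha_start) * frac  # learning rate decays linearly
            radius = radius_start * (1.0 - frac)                    # neighbourhood shrinks linearly
            bmu = min(range(len(codes)), key=lambda k: math.dist(obs, codes[k]))
            for k, node in enumerate(grid):
                # Move the best matching unit and its grid neighbours towards the observation.
                if math.dist(node, grid[bmu]) <= radius:
                    codes[k] = [c + alpha * (o - c) for c, o in zip(codes[k], obs)]
            step += 1
    return codes

def map_to_nodes(data, codes):
    """Return the 1-based number of the closest node for each observation."""
    return [min(range(len(codes)), key=lambda k: math.dist(obs, codes[k])) + 1 for obs in data]

# Hypothetical data: six observations (e.g. cells) measured on two markers.
data = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
codes = train_som(data, xdim=2, ydim=1, rlen=20)
node_numbers = map_to_nodes(data, codes)
print(node_numbers)
```

The node number per observation plays the role of the mapping_node_num output described above.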



pca operator


pca operator performs principal component analysis.

Input projection .
row represents the variables (e.g. genes, channels, markers)
col represents the observations (e.g. cells, samples, individuals)
y-axis measurement value
Input parameters .
scale logical, indicating whether the variables should be scaled to have unit variance before the analysis takes place
center logical, indicating whether the variables should be shifted to be zero centered before the analysis takes place
na.action A function which indicates what should happen when the data contain NAs
tol numeric, indicating the magnitude below which components should be omitted. Components are omitted if their standard deviations are less than or equal to tol times the standard deviation of the first component
maxComp numeric, maximum number of components to return, default 5
Output relations .
pca1, pca2, pca3, pca4, pca5 first five components containing the new projected values

The operator performs principal component analysis. It reduces the number of variables (i.e. indicated by rows) to a lower number (default 5).
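As a rough illustration of how scale, center and maxComp shape the result, the Python sketch below computes component scores with plain power iteration. It is a teaching sketch, not the operator's R code: the function names are hypothetical, and the input is a list of observations, each holding one value per variable (the transpose of the row/col layout above).

```python
import math

def _orthonormal(v, basis):
    """Remove the components of v along the basis vectors, then normalise; None if v vanishes."""
    for b in basis:
        proj = sum(x * y for x, y in zip(v, b))
        v = [x - proj * y for x, y in zip(v, b)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 1e-12 else None

def pca_scores(data, max_comp=2, center=True, scale=False, iters=500):
    """Project observations onto the leading principal components (needs >= 2 observations)."""
    n, p = len(data), len(data[0])
    columns = []
    for j in range(p):
        col = [row[j] for row in data]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        if center:
            col = [v - mean for v in col]
        if scale and sd > 0:
            col = [v / sd for v in col]
        columns.append(col)
    x = [[columns[j][i] for j in range(p)] for i in range(n)]
    # Covariance matrix of the prepared variables (a second-moment matrix if center is False).
    cov = [[sum(r[a] * r[b] for r in x) / (n - 1) for b in range(p)] for a in range(p)]
    components = []
    for c in range(min(max_comp, p)):
        v = _orthonormal([math.sin(j + 1 + c) for j in range(p)], components)
        if v is None:
            break
        # Power iteration, kept orthogonal to the components already found.
        for _ in range(iters):
            w = _orthonormal([sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)], components)
            if w is None:
                break
            v = w
        components.append(v)
    return [[sum(r[j] * comp[j] for j in range(p)) for comp in components] for r in x]

# Hypothetical data: three observations lying exactly on the line y = 2x.
scores = pca_scores([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]], max_comp=2)
for pca1, pca2 in scores:
    print(round(pca1, 3), round(pca2, 3))
```

Because the toy data lie on a line, all the variance ends up in the first component and the second component is zero. The sign of each component is arbitrary.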



rphenograph operator


rphenograph operator performs a phenotype clustering using the Rphenograph R package.

Input projection .
row represents the variables (e.g. channels, markers)
col represents the observations (e.g. cells)
y-axis is the value of measurement signal of the channel/marker
Input parameters .
xdim Width of the grid
ydim Height of the grid
rlen Number of times to loop over the training data for each MST
mst Number of times to build an MST
alpha_start Start learning rate
alpha_end End learning rate
dstf Distance function (1=manhattan, 2=euclidean, 3=chebyshev, 4=cosine)
Output relations .
mapping_node_num numeric, per column (e.g. per cell)
mapping_node_label character, per column (e.g. per cell)

The operator wraps the Rphenograph function of the Rphenograph R package.


See the documentation of the Rphenograph::Rphenograph function for details.

somflow operator


somflow operator builds a SOM (self-organizing map) using the FlowSOM R package.

Input projection .
row represents the variables (e.g. channels, markers)
col represents the observations (e.g. cells)
y-axis is the value of measurement signal of the channel/marker
Input parameters .
xdim Width of the grid
ydim Height of the grid
rlen Number of times to loop over the training data for each MST
mst Number of times to build an MST
alpha_start Start learning rate
alpha_end End learning rate
dstf Distance function (1=manhattan, 2=euclidean, 3=chebyshev, 4=cosine)
Output relations .
mapping_node_label character, per column (e.g. per cell)

The operator wraps the SOM function of the FlowSOM R package.


See the documentation of the FlowSOM::SOM function for details.



tsne operator


tsne operator performs t-SNE analysis.

Input projection .
row represents the variables (e.g. genes, channels, markers)
col represents the observations (e.g. cells, samples, individuals)
y-axis measurement value
Input parameters .
dims numeric, output dimensionality, default 2
initial_dims numeric, the number of dimensions that should be retained in the initial PCA step, default 50
perplexity numeric, perplexity parameter, default is 30
theta numeric, speed/accuracy trade-off (increase for less accuracy), set to 0.0 for exact t-SNE, default 0.05
pca logical, whether an initial PCA step should be performed, default TRUE
max_iter numeric, number of iterations, default 1000
pca_center logical, whether the data should be centered before PCA is applied
pca_scale logical, whether the data should be scaled before PCA is applied
stop_lying_iter numeric, Iteration after which the perplexities are no longer exaggerated
mom_switch_iter numeric, Iteration after which the final momentum is used
Output relations .
tsne1, tsne2 first two components containing the new projected values

The operator performs t-SNE analysis. It reduces the number of variables (i.e. indicated by rows) to a lower number (default 2).
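The perplexity parameter is often the least intuitive one. In the usual formulation of t-SNE, each observation gets a Gaussian kernel whose width is tuned so that the resulting neighbour distribution has the requested perplexity. The Python sketch below shows that tuning step for a single observation. It is an illustration only, not the operator's code; the function name, the tolerance and the example distances are hypothetical.

```python
import math

def conditional_probabilities(sq_dists, perplexity, tol=1e-5, max_steps=100):
    """Neighbour probabilities for one point, given squared distances to all other points."""
    target = math.log(perplexity)       # target Shannon entropy, in nats
    beta, lo, hi = 1.0, 0.0, math.inf   # beta = 1 / (2 * sigma^2)
    nearest = min(sq_dists)             # subtracted below to avoid numerical underflow
    probs = []
    for _ in range(max_steps):
        weights = [math.exp(-beta * (d - nearest)) for d in sq_dists]
        total = sum(weights)
        probs = [w / total for w in weights]
        entropy = -sum(p * math.log(p) for p in probs if p > 0)
        if abs(entropy - target) < tol:
            break
        if entropy > target:
            # Distribution too flat: narrow the kernel by increasing beta.
            lo = beta
            beta = beta * 2 if hi == math.inf else (beta + hi) / 2
        else:
            # Distribution too peaked: widen the kernel by decreasing beta.
            hi = beta
            beta = (beta + lo) / 2
    return probs

# Hypothetical squared distances from one observation to its four neighbours.
probs = conditional_probabilities([1.0, 4.0, 9.0, 16.0], perplexity=2.0)
effective_neighbours = math.exp(-sum(p * math.log(p) for p in probs if p > 0))
print([round(p, 3) for p in probs], round(effective_neighbours, 3))
```

Perplexity can be read as the effective number of neighbours each observation takes into account, so it must be smaller than the number of other observations.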

