5 Operator Design Principles

A reliable operator starts with careful design. This chapter covers the design principles and planning considerations that let your operator integrate cleanly with Tercen’s data projection system and deliver useful computations to users.

What You’ll Learn
  • Tercen’s data model and projection system
  • Common operator patterns and their use cases
  • Input projection design strategies
  • Output relation planning
  • Design validation checklist

5.1 Understanding Tercen’s Data Model

Tercen operates on a fundamental principle:

Core Principle

Every operator receives data from a Tercen workflow through the crosstab projection as input, and returns tables (with relations to input data) as output.

This design ensures that operators can be chained together in complex analytical workflows while maintaining data lineage and relationships.
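
The principle above can be sketched in plain Python, without the Tercen client. The record layout is an assumption made for illustration: `.ri` and `.ci` stand for the row and column index of a crosstab cell and `.y` for the value projected onto the y-axis. The function names `mean_per_cell` and `join_to_input` are hypothetical, not part of any Tercen API. The sketch shows an output table that keeps the index columns, so a later step can relate results back to the input.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format crosstab input: one record per measurement.
# ".ri" / ".ci" are the row / column index of the cell, ".y" is the value.
rows = [
    {".ri": 0, ".ci": 0, ".y": 1.0},
    {".ri": 0, ".ci": 0, ".y": 3.0},
    {".ri": 0, ".ci": 1, ".y": 10.0},
    {".ri": 1, ".ci": 0, ".y": 4.0},
    {".ri": 1, ".ci": 0, ".y": 6.0},
]

def mean_per_cell(records):
    """Return one output record per (.ri, .ci) cell, keeping the index columns."""
    cells = defaultdict(list)
    for r in records:
        cells[(r[".ri"], r[".ci"])].append(r[".y"])
    return [
        {".ri": ri, ".ci": ci, "mean": mean(values)}
        for (ri, ci), values in sorted(cells.items())
    ]

def join_to_input(records, output):
    """Attach each output value to the input records of the same cell."""
    lookup = {(o[".ri"], o[".ci"]): o["mean"] for o in output}
    return [dict(r, mean=lookup[(r[".ri"], r[".ci"])]) for r in records]

result = mean_per_cell(rows)         # one record per non-empty cell
joined = join_to_input(rows, result) # input records with the cell mean attached
```

Because the output carries `.ri` and `.ci`, a downstream step can join on those keys instead of relying on record order.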

5.2 Development Workflow Overview

Creating a Tercen operator follows a structured, iterative workflow designed to ensure reliability, maintainability, and user-friendliness. The development process consists of eight key phases, with continuous iteration between implementation, testing, and maintenance:

classDiagram
        class Design {
            Define input-output
            Choose projection
            Plan computations
        }
        class RepositorySetup {
            Initialise GitHub repo
        }
        class DevelopmentEnvironment {
            Load repo
            Prepare Tercen step
        }
        class Implementation {
            Connect to Tercen data
            Write computational functions
        }
        class Testing {
            Create unit tests
            Validate with sample data
        }
        class Documentation {
            Write usage instructions
            Populate operator metadata and specs
        }
        class Deployment {
            Control dependencies
            Release to library
        }
        class Maintenance {
            Get feedback
            Fix bugs
            Add features
        }

        Design --> RepositorySetup
        RepositorySetup --> DevelopmentEnvironment
        DevelopmentEnvironment --> Implementation
        Implementation --> Testing
        Testing --> Documentation
        Documentation --> Deployment
        Deployment --> Maintenance
        Maintenance --> Implementation

5.3 Input Projection Design

The input projection defines what data your operator will receive. This projection is configured in Tercen’s data step and determines the structure of your input table.

Common projection patterns:

| Projection Type | Components | Use Case | Example |
|---|---|---|---|
| Cell-wise Operations | y-axis, row, col | Compute a value per cell | Mean, median, custom statistics, normalization |
| Row-wise Operations | y-axis, row | Compute a value per observation | Clustering, dimension reduction, outlier detection |
| Column-wise Operations | y-axis, col | Compute a value per variable | Feature importance, column statistics, data loading |
| Global Operations | y-axis | Compute across all data | Global statistics, model fitting, data export |
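
One way to read these patterns is as the granularity at which results are produced. The sketch below is plain Python and makes several assumptions for illustration: records carry `.ri`, `.ci` and `.y` fields, and the `GROUP_KEYS` mapping and `summarise` function are hypothetical names, not Tercen API. It shows how the choice of grouping keys changes the number of output records.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical grouping keys for each projection pattern.
GROUP_KEYS = {
    "cell": (".ri", ".ci"),  # one result per row/column combination
    "row": (".ri",),         # one result per row
    "column": (".ci",),      # one result per column
    "global": (),            # a single result for all data
}

def summarise(records, pattern, fn=mean):
    """Apply fn to the .y values of each group defined by the pattern's keys."""
    keys = GROUP_KEYS[pattern]
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[k] for k in keys)].append(r[".y"])
    return [
        {**dict(zip(keys, group)), "value": fn(values)}
        for group, values in sorted(groups.items())
    ]

# A 2 x 3 crosstab with one measurement per cell.
rows = [
    {".ri": ri, ".ci": ci, ".y": float(ri * 10 + ci)}
    for ri in range(2)
    for ci in range(3)
]

counts = {pattern: len(summarise(rows, pattern)) for pattern in GROUP_KEYS}
```

For this 2 x 3 input, `counts` holds 6 results for the cell pattern, 2 for rows, 3 for columns and 1 for the global pattern.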

5.4 Output Relation Strategy

The output relation defines how your computed results relate back to the input data. Three common relation levels are per cell, per column, and per row:

Per cell: Results are computed for each unique combination of row and column factors.

Example: Computing mean values per experimental condition.

Input: Multiple measurements (projected onto the crosstab y axis) per condition (projected onto the rows and columns)
Output: One mean value per condition

Per column: Results are computed across all rows for each column.

Example: Clustering samples based on feature profiles.

Input: Feature matrix (genes × samples)
Output: Cluster assignments per sample
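
A minimal sketch of a per-column output follows, in plain Python. It assumes a complete matrix with one value per cell, records with `.ri`, `.ci` and `.y` fields, and a deliberately tiny two-cluster k-means seeded from the first and last columns. The helper names are hypothetical, and a real operator would likely call an established clustering library instead.

```python
from collections import defaultdict

def column_profiles(records):
    """Collect each column's values ordered by row index: {.ci: [y per .ri]}."""
    cols = defaultdict(dict)
    for r in records:
        cols[r[".ci"]][r[".ri"]] = r[".y"]
    return {
        ci: [by_row[ri] for ri in sorted(by_row)]
        for ci, by_row in sorted(cols.items())
    }

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(profiles, iterations=10):
    """Toy 2-cluster k-means; returns one output record per column."""
    ids = list(profiles)
    centroids = [profiles[ids[0]], profiles[ids[-1]]]  # deterministic seeds
    labels = {}
    for _ in range(iterations):
        # Assign every column to its nearest centroid.
        labels = {
            ci: min((0, 1), key=lambda k: sq_dist(profiles[ci], centroids[k]))
            for ci in ids
        }
        # Move each centroid to the mean profile of its members.
        for k in (0, 1):
            members = [profiles[ci] for ci in ids if labels[ci] == k]
            if members:
                centroids[k] = [sum(v) / len(members) for v in zip(*members)]
    return [{".ci": ci, "cluster": labels[ci]} for ci in ids]

# 2 genes (rows) x 4 samples (columns): samples 0-1 are low, 2-3 are high.
matrix = {0: [1.0, 1.0], 1: [1.2, 0.8], 2: [9.0, 10.0], 3: [10.0, 9.0]}
rows = [
    {".ri": ri, ".ci": ci, ".y": y}
    for ci, profile in matrix.items()
    for ri, y in enumerate(profile)
]

clusters = two_means(column_profiles(rows))
```

The output has one record per `.ci`, so the cluster label relates to the column (sample) and not to individual cells.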

Per row: Results are computed across all columns for each row.

Example: Gene-wise statistics across samples.

Input: Expression matrix (genes × samples)
Output: Statistics per gene
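
The per-row case can be sketched the same way. The plain-Python example below assumes records with `.ri`, `.ci` and `.y` fields, where rows are genes and columns are samples. `stats_per_row` is a hypothetical helper name, and the choice of mean and sample standard deviation is only an illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

def stats_per_row(records):
    """Return one record per row (.ri) with summary statistics across columns."""
    by_row = defaultdict(list)
    for r in records:
        by_row[r[".ri"]].append(r[".y"])
    return [
        {
            ".ri": ri,
            "mean": mean(values),
            # The sample standard deviation needs at least two values.
            "sd": stdev(values) if len(values) > 1 else None,
        }
        for ri, values in sorted(by_row.items())
    ]

# 2 genes (rows) x 3 samples (columns).
expression = {0: [2.0, 4.0, 6.0], 1: [5.0, 5.0, 5.0]}
rows = [
    {".ri": ri, ".ci": ci, ".y": y}
    for ri, values in expression.items()
    for ci, y in enumerate(values)
]

gene_stats = stats_per_row(rows)
```

Each output record keeps only `.ri`, which marks the result as belonging to the whole row.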

5.5 Design Checklist

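Before writing any code, make sure you can clearly state what your operator takes as input, which projection it uses, what it computes, and how its output relates back to the input data.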

Design Best Practices
  • Start simple and add complexity gradually
  • Consider how your operator will compose with others in workflows
  • Design for reusability across different data types and use cases
  • Document your design decisions for future reference

5.6 Next Steps

Once you have a clear design for your operator, the next step is setting up your development repository. Continue to the next chapter to learn about repository setup and project structure.