classDiagram
    class Design {
        Define input-output
        Choose projection
        Plan computations
    }
    class RepositorySetup {
        Initialise GitHub repo
    }
    class DevelopmentEnvironment {
        Load repo
        Prepare Tercen step
    }
    class Implementation {
        Connect to Tercen data
        Write computational functions
    }
    class Testing {
        Create unit tests
        Validate with sample data
    }
    class Documentation {
        Write usage instructions
        Populate operator metadata and specs
    }
    class Deployment {
        Control dependencies
        Release to library
    }
    class Maintenance {
        Get feedback
        Fix bugs
        Add features
    }
    Design --> RepositorySetup
    RepositorySetup --> DevelopmentEnvironment
    DevelopmentEnvironment --> Implementation
    Implementation --> Testing
    Testing --> Documentation
    Documentation --> Deployment
    Deployment --> Maintenance
    Maintenance --> Implementation
5 Operator Design Principles
The foundation of every successful operator lies in careful design. This chapter covers the design principles and planning considerations that help your operator integrate cleanly with Tercen’s data projection system and deliver useful computations to users. It covers:
- Tercen’s data model and projection system
- Common operator patterns and their use cases
- Input projection design strategies
- Output relation planning
- Design validation checklist
5.1 Understanding Tercen’s Data Model
Tercen operates on a fundamental principle:
Every operator receives data from a Tercen workflow through the crosstab projection as input, and returns tables (with relations to input data) as output.
This design ensures that operators can be chained together in complex analytical workflows while maintaining data lineage and relationships.
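As a rough illustration of this contract, the plain-Python sketch below treats the projected input as a list of records and returns an output table that carries the same row and column indices. The column names `.ri`, `.ci`, and `.y`, and the functions `count_operator` and `join_to_input`, are assumptions made for illustration only; this is not Tercen's client API.

```python
from collections import Counter

# Hypothetical projected input: one record per measurement.
# ".ri" and ".ci" are assumed row/column indices, ".y" the projected value.
projected = [
    {".ri": 0, ".ci": 0, ".y": 1.0},
    {".ri": 0, ".ci": 0, ".y": 3.0},
    {".ri": 0, ".ci": 1, ".y": 5.0},
]

def count_operator(records):
    """Toy operator: count the measurements in each cell."""
    counts = Counter((r[".ri"], r[".ci"]) for r in records)
    # Keeping the index columns in the output relates each result to its input cell.
    return [{".ri": ri, ".ci": ci, "n": n} for (ri, ci), n in sorted(counts.items())]

def join_to_input(records, output):
    """Attach each output value to the input records it was computed from."""
    lookup = {(o[".ri"], o[".ci"]): o["n"] for o in output}
    return [{**r, "n": lookup[(r[".ri"], r[".ci"])]} for r in records]

output = count_operator(projected)
joined = join_to_input(projected, output)
```

Because the output keeps the index columns, it can be joined back to the input, which is one way to picture how lineage survives when operators are chained.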
5.2 Development Workflow Overview
Creating a Tercen operator follows a structured, iterative workflow designed to keep operators reliable, maintainable, and user-friendly. The process has eight phases: design, repository setup, development environment, implementation, testing, documentation, deployment, and maintenance. Maintenance feeds back into implementation, so implementation, testing, and maintenance iterate continuously.
5.3 Input Projection Design
The input projection defines what data your operator will receive. This projection is configured in Tercen’s data step and determines the structure of your input table.
Common projection patterns:
Projection Type | Components | Use Case | Examples |
---|---|---|---|
Cell-wise Operations | y-axis, row, col | Compute a value per cell | Mean, median, custom statistics, normalization |
Row-wise Operations | y-axis, row | Compute a value per observation | Clustering, dimension reduction, outlier detection |
Column-wise Operations | y-axis, col | Compute a value per variable | Feature importance, column statistics, data loading |
Global Operations | y-axis | Compute across all data | Global statistics, model fitting, data export |
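One way to read these patterns is as a choice of grouping key: the components present in the projection determine which index columns define a single result. The sketch below illustrates that reading in plain Python. The column names `.ri`, `.ci`, and `.y`, the `GROUP_KEYS` mapping, and the `summarise` helper are hypothetical and are not part of any Tercen API.

```python
from collections import defaultdict
from statistics import mean

# Assumed mapping from projection pattern to the index columns that define one result.
GROUP_KEYS = {
    "cell": (".ri", ".ci"),
    "row": (".ri",),
    "column": (".ci",),
    "global": (),
}

def summarise(records, pattern, fn=mean):
    """Apply fn to the .y values of each group defined by the pattern."""
    keys = GROUP_KEYS[pattern]
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[k] for k in keys)].append(r[".y"])
    return {key: fn(values) for key, values in groups.items()}

# Hypothetical 2 x 2 crosstab with one measurement per cell.
records = [
    {".ri": 0, ".ci": 0, ".y": 1.0},
    {".ri": 0, ".ci": 1, ".y": 3.0},
    {".ri": 1, ".ci": 0, ".y": 5.0},
    {".ri": 1, ".ci": 1, ".y": 7.0},
]

per_row = summarise(records, "row")        # one value per row index
per_column = summarise(records, "column")  # one value per column index
overall = summarise(records, "global")     # a single value under the empty key
```

The same records yield four, two, two, or one result depending only on the grouping key, which is why the projection is worth settling before any computation is written.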
5.4 Output Relation Strategy
The output relation defines how your computed results relate back to the input data. Three relation patterns are common, described in turn below: per cell, per column, and per row.
Results are computed for each unique combination of row and column factors.
Example: Computing mean values per experimental condition.
Input: Multiple measurements (projected onto the crosstab y axis) per condition (projected onto the rows and columns)
Output: One mean value per condition
Results are computed across all rows for each column.
Example: Clustering samples based on feature profiles.
Input: Feature matrix (genes × samples)
Output: Cluster assignments per sample
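A per-column result can be sketched by first gathering each sample's values into a profile vector and then emitting one record per column index. In the example below, the column names `.ri`, `.ci`, and `.y` and both helper functions are hypothetical. The nearest-centroid step with hand-picked centroids is only a stand-in for a real clustering algorithm.

```python
from collections import defaultdict
from math import dist

# Hypothetical long-format input: ".ri" indexes genes, ".ci" indexes samples.
records = [
    {".ri": 0, ".ci": 0, ".y": 1.0}, {".ri": 1, ".ci": 0, ".y": 1.2},
    {".ri": 0, ".ci": 1, ".y": 0.9}, {".ri": 1, ".ci": 1, ".y": 1.1},
    {".ri": 0, ".ci": 2, ".y": 8.0}, {".ri": 1, ".ci": 2, ".y": 7.5},
]

def column_profiles(records):
    """Collect each sample's values, ordered by gene index, into one vector.

    Assumes every sample has exactly one value for every gene.
    """
    by_col = defaultdict(dict)
    for r in records:
        by_col[r[".ci"]][r[".ri"]] = r[".y"]
    return {ci: [vals[ri] for ri in sorted(vals)] for ci, vals in by_col.items()}

def assign_to_nearest(profiles, centroids):
    """Stand-in for clustering: index of the nearest fixed centroid (Euclidean)."""
    return [
        {".ci": ci,
         "cluster": min(range(len(centroids)), key=lambda k: dist(vec, centroids[k]))}
        for ci, vec in sorted(profiles.items())
    ]

# Centroids are fixed by hand here; a real operator would estimate them from the data.
clusters = assign_to_nearest(column_profiles(records), centroids=[[1.0, 1.0], [8.0, 8.0]])
```

The output has one record per `.ci` value and no `.ri` column, which is what makes it a per-column relation.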
Results are computed across all columns for each row.
Example: Gene-wise statistics across samples.
Input: Expression matrix (genes × samples)
Output: Statistics per gene
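For the per-row case, a minimal sketch might group values by row index and return several statistics per gene in a single output table. The column names `.ri`, `.ci`, and `.y` and the function `gene_statistics` are assumptions for illustration, and the population standard deviation is used only to keep the example short.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical expression data: ".ri" indexes genes, ".ci" indexes samples.
records = [
    {".ri": 0, ".ci": 0, ".y": 2.0}, {".ri": 0, ".ci": 1, ".y": 4.0},
    {".ri": 1, ".ci": 0, ".y": 5.0}, {".ri": 1, ".ci": 1, ".y": 5.0},
]

def gene_statistics(records):
    """Return one record per gene with its mean, population SD, and sample count."""
    by_row = defaultdict(list)
    for r in records:
        by_row[r[".ri"]].append(r[".y"])
    return [
        {".ri": ri, "mean": mean(values), "sd": pstdev(values), "n": len(values)}
        for ri, values in sorted(by_row.items())
    ]

stats = gene_statistics(records)
```

Each output record keeps only `.ri`, so the results relate to rows of the crosstab rather than to individual cells.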
5.5 Design Checklist
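Before writing any code, make sure you can state clearly what your operator takes as input, which projection it expects, what it computes, and how its output relates back to the input. Keep these general guidelines in mind as well: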
- Start simple and add complexity gradually
- Consider how your operator will compose with others in workflows
- Design for reusability across different data types and use cases
- Document your design decisions for future reference
5.6 Next Steps
Once you have a clear design for your operator, the next step is setting up your development repository. Continue to the next chapter to learn about repository setup and project structure.