15 Operator Improvements and Best Practices
This chapter covers advanced techniques for improving operator quality, reliability, and maintainability. You’ll learn essential practices for logging, error handling, testing, and optimization that ensure your operators work robustly in production environments.
By the end of this chapter, you will be able to:
- Implement comprehensive logging and debugging strategies
- Build robust error handling and input validation
- Create comprehensive test suites for operators
- Apply performance optimization techniques
- Follow best practices for production-ready operators
Before proceeding, ensure you’ve completed:
- Basic Implementation chapter for core operator concepts
- Advanced Features chapter for complex functionality
- Data Input and Output Patterns for data handling
15.1 Logging and Debugging
Effective logging is essential for monitoring operator behavior and diagnosing issues in production environments.
15.1.1 Basic Logging
Implement logging for production operators:
R:
ctx$log("Your message.")

Python:
ctx.log("Your message.")
- Log key milestones: Start/end of major operations
- Include data metrics: Row counts, processing times, memory usage
- Log parameter values: Help reproduce issues with specific inputs
- Use structured formats: Enable easier log parsing and analysis
- Avoid logging sensitive data: Protect user privacy and security
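Taken together, these guidelines suggest keeping log lines consistent and machine-parseable. The sketch below shows one way to do this in Python; the log_event helper and its key=value metric format are illustrative conventions, not part of the Tercen API (only ctx.log is).

from datetime import datetime

def log_event(ctx, level, message, **metrics):
    # Compose a structured line: timestamp, level, message, then key=value metrics.
    parts = [f"[{level}] {datetime.now().isoformat()} - {message}"]
    parts += [f"{key}={value}" for key, value in metrics.items()]
    ctx.log(" | ".join(parts))

# Example: log_event(tercen_ctx, "INFO", "Analysis started", rows=20_000, elapsed_s=1.4)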
15.2 Error Handling
Implement comprehensive error handling that provides helpful feedback to users:
R:

# Comprehensive error handling with user-friendly messages
robust_operator <- function(ctx) {
  tryCatch({
    # Validate inputs first
    validate_inputs(ctx)

    # Main processing with progress logging
    ctx$log(paste("[INFO]", Sys.time(), "- Beginning data analysis"))

    # Check for edge cases
    data <- ctx$select(.ri, .ci, .y)

    if (any(is.infinite(data$.y))) {
      ctx$log(paste("[WARNING]", Sys.time(), "- Infinite values detected, removing them"))
      data <- data[is.finite(data$.y), ]
    }

    if (nrow(data) == 0) {
      stop("No valid data remaining after cleaning")
    }

    # Perform analysis
    result <- perform_analysis(data)

    ctx$log(paste("[INFO]", Sys.time(), "- Analysis completed successfully"))
    return(result)
  }, error = function(e) {
    # Log the technical error
    ctx$log(paste("[ERROR]", Sys.time(), "- Technical error:", e$message))

    # Provide user-friendly error messages
    if (grepl("projection.*required", e$message)) {
      stop("Please ensure you have dragged the required data columns to the appropriate axes.")
    } else if (grepl("data points required", e$message)) {
      stop("This analysis requires at least 3 data points. Please check your data selection.")
    } else if (grepl("values must vary", e$message)) {
      stop("The data values do not vary enough for this analysis. Please check your input data.")
    } else {
      stop(paste("An error occurred during analysis:", e$message))
    }
  })
}
Python:

from datetime import datetime

import polars as pl

def robust_operator(tercen_ctx):
    """Operator with comprehensive error handling"""
    try:
        # Validate inputs first
        validate_inputs(tercen_ctx)

        # Main processing with progress logging
        tercen_ctx.log(f"[INFO] {datetime.now()} - Beginning data analysis")

        # Check for edge cases
        df = tercen_ctx.select(['.ri', '.ci', '.y'], df_lib="polars")

        # Handle infinite values
        infinite_count = df.filter(pl.col('.y').is_infinite()).height
        if infinite_count > 0:
            tercen_ctx.log(f"[WARNING] {datetime.now()} - Infinite values detected ({infinite_count}), removing them")
            df = df.filter(pl.col('.y').is_finite())

        if len(df) == 0:
            raise ValueError("No valid data remaining after cleaning")

        # Perform analysis
        result = perform_analysis(df)

        tercen_ctx.log(f"[INFO] {datetime.now()} - Analysis completed successfully")
        return result
    except ValueError as ve:
        tercen_ctx.log(f"[ERROR] {datetime.now()} - Validation error: {str(ve)}")

        # Provide user-friendly error messages
        if "projection" in str(ve) and "required" in str(ve):
            raise ValueError("Please ensure you have dragged the required data columns to the appropriate axes.")
        elif "data points required" in str(ve):
            raise ValueError("This analysis requires at least 3 data points. Please check your data selection.")
        elif "values must vary" in str(ve):
            raise ValueError("The data values do not vary enough for this analysis. Please check your input data.")
        else:
            raise ValueError(f"An error occurred during analysis: {str(ve)}")
    except Exception as e:
        tercen_ctx.log(f"[ERROR] {datetime.now()} - Unexpected error: {str(e)}")
        raise ValueError(f"An unexpected error occurred. Please check your data and try again. Error: {str(e)}")
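Both versions above call a validate_inputs helper that is not shown. The Python sketch below illustrates what such a helper could look like; the specific checks and thresholds (a y-axis projection, at least 3 finite data points, non-constant values) are assumptions derived from the error messages above, not fixed Tercen requirements.

import polars as pl

def validate_inputs(tercen_ctx):
    # Hypothetical validation helper; messages match the patterns handled by robust_operator above.
    try:
        df = tercen_ctx.select(['.ri', '.ci', '.y'], df_lib="polars")
    except Exception:
        raise ValueError("A y-axis projection is required for this operator.")

    # Count usable (finite) data points.
    finite = df.filter(pl.col('.y').is_finite())
    if finite.height < 3:
        raise ValueError("At least 3 data points required for this analysis.")

    # The analysis assumes the values vary (non-zero spread).
    spread = finite['.y'].std()
    if spread is None or spread == 0:
        raise ValueError("Data values must vary across the selected points.")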
- Validate early: Check inputs before expensive computations
- Fail gracefully: Provide clear, actionable error messages
- Log technical details: Help with debugging while keeping user messages simple
- Handle edge cases: Account for missing data, infinite values, empty datasets
- Test error scenarios: Ensure error handling works as expected
15.3 Testing and Validation
Comprehensive testing ensures your operator works correctly across different data scenarios and edge cases.
Tercen supports two main testing frameworks:
1. Unit Tests: Simple data files with expected input/output and test specifications
2. Integration Tests: Actual Tercen workflows triggered to perform computations
15.3.1 Unit Test Structure
Create a tests directory in your operator repository with the following structure:
tests/
├── input.csv    # Sample input data
├── output.csv   # Expected output data
└── test.json    # Test configuration
For multiple test scenarios, use numbered files:
- test_1.json, test_2.json for different parameter settings
- input_1.csv, input_2.csv for different data scenarios
15.3.2 Creating Comprehensive Test Data
Design test cases that cover various scenarios:
| Test Scenario | Purpose | Example Data |
|---|---|---|
| Normal Case | Standard operation | Regular numeric data with good distribution |
| Edge Cases | Boundary conditions | Minimum data points, extreme values |
| Error Cases | Invalid inputs | Missing data, wrong data types |
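As a starting point, the snippet below generates small CSV inputs for a normal case and an edge case. It is a sketch only: the column names (.ri, .ci, .y) mirror the projections used earlier in this chapter, and the columns your operator’s tests actually need depend on the projections declared in test.json.

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Normal case: well-distributed values across a 5 x 4 grid of row/column indices.
# (Assumes the tests/ directory already exists.)
normal = pd.DataFrame({
    ".ri": np.repeat(np.arange(5), 4),
    ".ci": np.tile(np.arange(4), 5),
    ".y": rng.normal(loc=10.0, scale=2.0, size=20),
})
normal.to_csv("tests/input_1.csv", index=False)

# Edge case: the minimum number of points, with extreme magnitudes.
edge = pd.DataFrame({
    ".ri": [0, 1, 2],
    ".ci": [0, 0, 0],
    ".y": [1e-9, 1.0, 1e9],
})
edge.to_csv("tests/input_2.csv", index=False)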
15.3.3 Test Configuration File
Create comprehensive test.json files to specify how Tercen should run your tests:
Basic Test Configuration:
{
"kind": "OperatorUnitTest",
"name": "regression_test_basic",
"namespace": "test",
"inputDataUri": "input.csv",
"outputDataUri": ["output.csv"],
"columns": [],
"rows": [],
"colors": [],
"labels": [],
"yAxis": ".y",
"xAxis": ".x",
"properties": {
"intercept.omit": false,
"confidence.level": 0.95
}
}
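Before relying on Tercen to run the tests, it can help to sanity-check the expected output locally. The sketch below assumes a perform_analysis function like the hypothetical one used earlier in this chapter; it is a convenience check, not a substitute for Tercen’s own test runner.

import pandas as pd
from pandas.testing import assert_frame_equal

def check_expected_output(input_path, output_path):
    # Run the operator's core computation on the test input and compare against the expected output.
    input_df = pd.read_csv(input_path)
    expected = pd.read_csv(output_path)

    result = perform_analysis(input_df)  # hypothetical core function from the examples above

    # Tolerate tiny floating-point differences; columns and dtypes must still match.
    assert_frame_equal(result.reset_index(drop=True),
                       expected.reset_index(drop=True),
                       check_exact=False, rtol=1e-6)
    print(f"{input_path}: matches {output_path}")

check_expected_output("tests/input.csv", "tests/output.csv")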
This completes our comprehensive guide to Tercen operator development. You now have all the tools and knowledge needed to create robust, efficient, and user-friendly operators that extend Tercen’s analytical capabilities!