Calejo Control Adapter - Safety Framework
Overview
The Calejo Control Adapter implements a multi-layer safety framework designed to prevent equipment damage and operational hazards, and to keep pump stations operating reliably under all conditions, including system failures, communication loss, and cyber attacks.
Safety Philosophy: "Safety First" - All setpoints must pass through safety enforcement before reaching SCADA systems.
Multi-Layer Safety Architecture
Three-Layer Safety Model
┌─────────────────────────────────────────────────────────┐
│ Layer 3: Optimization Constraints (Calejo Optimize) │
│ - Economic optimization bounds: 25-45 Hz │
│ - Energy efficiency constraints │
│ - Production optimization limits │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ Layer 2: Station Safety Limits (Control Adapter) │
│ - Database-enforced limits: 20-50 Hz │
│ - Rate of change limiting │
│ - Emergency stop integration │
│ - Failsafe mechanisms │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ Layer 1: Physical Hard Limits (PLC/VFD) │
│ - Hardware-enforced limits: 15-55 Hz │
│ - Physical safety mechanisms │
│ - Equipment protection │
└─────────────────────────────────────────────────────────┘
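The nesting of the three ranges can be illustrated with a minimal sketch. The names `LAYERS`, `clamp`, and `pass_through_layers` are hypothetical and only the Hz ranges come from the diagram. The point is that each lower layer still bounds the setpoint if a layer above it is bypassed or fails.

```python
from typing import List, Tuple

# (layer name, min Hz, max Hz) taken from the diagram; the structure is illustrative.
LAYERS: List[Tuple[str, float, float]] = [
    ("Layer 3: optimization constraints", 25.0, 45.0),
    ("Layer 2: station safety limits", 20.0, 50.0),
    ("Layer 1: physical hard limits", 15.0, 55.0),
]


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed range [low, high]."""
    return max(low, min(high, value))


def pass_through_layers(setpoint_hz: float, skip_layers: int = 0) -> float:
    """Apply the layers top to bottom; skip_layers simulates failed upper layers."""
    for _name, low, high in LAYERS[skip_layers:]:
        setpoint_hz = clamp(setpoint_hz, low, high)
    return setpoint_hz


print(pass_through_layers(60.0))                 # 45.0 (all layers active)
print(pass_through_layers(60.0, skip_layers=1))  # 50.0 (Layer 3 bypassed)
print(pass_through_layers(60.0, skip_layers=2))  # 55.0 (only Layer 1 left)
```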
Safety Components
1. Safety Limit Enforcer (src/core/safety.py)
Purpose
The Safety Limit Enforcer is the LAST line of defense before setpoints are exposed to SCADA systems. ALL setpoints MUST pass through this enforcer.
Key Features
- Multi-Layer Limit Enforcement:
  - Hard operational limits (speed, level, power, flow)
  - Rate of change limiting
  - Emergency stop integration
  - Failsafe mode activation
- Safety Limit Types:
@dataclass
class SafetyLimits:
    hard_min_speed_hz: float             # Minimum speed limit (Hz)
    hard_max_speed_hz: float             # Maximum speed limit (Hz)
    hard_min_level_m: Optional[float]    # Minimum level limit (meters)
    hard_max_level_m: Optional[float]    # Maximum level limit (meters)
    hard_max_power_kw: Optional[float]   # Maximum power limit (kW)
    max_speed_change_hz_per_min: float   # Rate of change limit
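Because these limits drive every later check, it can help to reject inconsistent values before they are used. The following is a minimal sketch of such a consistency check. `validate_limits` is a hypothetical helper, not part of the documented module, and the rules it applies (minimum below maximum, positive rate and power) are assumptions about what a sensible configuration looks like.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SafetyLimits:
    hard_min_speed_hz: float
    hard_max_speed_hz: float
    hard_min_level_m: Optional[float]
    hard_max_level_m: Optional[float]
    hard_max_power_kw: Optional[float]
    max_speed_change_hz_per_min: float


def validate_limits(limits: SafetyLimits) -> List[str]:
    """Hypothetical helper: return a list of problems, empty if consistent."""
    problems: List[str] = []
    if limits.hard_min_speed_hz >= limits.hard_max_speed_hz:
        problems.append("hard_min_speed_hz must be below hard_max_speed_hz")
    if limits.max_speed_change_hz_per_min <= 0:
        problems.append("max_speed_change_hz_per_min must be positive")
    # Level limits are optional; compare them only when both are set.
    if (limits.hard_min_level_m is not None
            and limits.hard_max_level_m is not None
            and limits.hard_min_level_m >= limits.hard_max_level_m):
        problems.append("hard_min_level_m must be below hard_max_level_m")
    if limits.hard_max_power_kw is not None and limits.hard_max_power_kw <= 0:
        problems.append("hard_max_power_kw must be positive")
    return problems


ok = SafetyLimits(20.0, 50.0, None, None, None, 30.0)
swapped = SafetyLimits(50.0, 20.0, 1.0, 4.0, 75.0, 30.0)
print(validate_limits(ok))       # []
print(validate_limits(swapped))  # one problem: speed limits are swapped
```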
Enforcement Process
def enforce_setpoint(station_id: str, pump_id: str, setpoint: float) -> Tuple[float, List[str]]:
    """
    Enforce safety limits on setpoint.
    Returns:
        Tuple of (enforced_setpoint, violations)
        - enforced_setpoint: Safe setpoint (clamped if necessary)
        - violations: List of safety violations (for logging/alerting)
    """
    # The limits, emergency stop state, and previous setpoint are looked up
    # for (station_id, pump_id); those lookups are omitted here.
    violations: List[str] = []
    enforced_setpoint = setpoint  # Unchanged unless a limit applies
    # 1. Check emergency stop first (highest priority)
    if emergency_stop_active:
        return (0.0, ["EMERGENCY_STOP_ACTIVE"])
    # 2. Enforce hard speed limits
    if setpoint < hard_min_speed_hz:
        enforced_setpoint = hard_min_speed_hz
        violations.append("BELOW_MIN_SPEED")
    elif setpoint > hard_max_speed_hz:
        enforced_setpoint = hard_max_speed_hz
        violations.append("ABOVE_MAX_SPEED")
    # 3. Enforce rate of change limits
    rate_violation = check_rate_of_change(previous_setpoint, enforced_setpoint)
    if rate_violation:
        enforced_setpoint = limit_rate_of_change(previous_setpoint, enforced_setpoint)
        violations.append("RATE_OF_CHANGE_VIOLATION")
    # 4. Return safe setpoint
    return (enforced_setpoint, violations)
2. Emergency Stop Manager (src/core/emergency_stop.py)
Purpose
Provides a manual override for emergency situations. An active emergency stop takes priority over all other controls.
Emergency Stop Levels
- Station-Level Emergency Stop:
  - Stops all pumps in a station
  - Activated by station operators
  - Requires manual reset
- Pump-Level Emergency Stop:
  - Stops individual pumps
  - Activated for specific equipment issues
  - Individual reset capability
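The relationship between the two levels can be sketched with a small in-memory model. `InMemoryEmergencyStop` is a hypothetical stand-in, not the real manager. Its method names follow the Emergency Stop API listed later in this section, and the storage scheme (a set of station/pump pairs, with `None` marking a station-level stop) is an assumption made for illustration.

```python
from typing import Optional, Set, Tuple


class InMemoryEmergencyStop:
    """Illustrative stand-in; state is kept in memory only."""

    def __init__(self) -> None:
        # (station_id, None) marks a station-level stop.
        self._active: Set[Tuple[str, Optional[str]]] = set()

    def activate_emergency_stop(self, station_id: str, pump_id: Optional[str] = None) -> None:
        self._active.add((station_id, pump_id))

    def clear_emergency_stop(self, station_id: str, pump_id: Optional[str] = None) -> None:
        # Clearing a pump-level stop never clears a station-level stop.
        self._active.discard((station_id, pump_id))

    def is_emergency_stop_active(self, station_id: str, pump_id: Optional[str] = None) -> bool:
        # A station-level stop covers every pump in that station.
        if (station_id, None) in self._active:
            return True
        return pump_id is not None and (station_id, pump_id) in self._active


stops = InMemoryEmergencyStop()
stops.activate_emergency_stop("station_001")
print(stops.is_emergency_stop_active("station_001", "pump_1"))  # True
stops.clear_emergency_stop("station_001")
stops.activate_emergency_stop("station_001", "pump_2")
print(stops.is_emergency_stop_active("station_001", "pump_1"))  # False
print(stops.is_emergency_stop_active("station_001", "pump_2"))  # True
```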
Emergency Stop Features
- Immediate Action: Setpoints forced to 0 Hz immediately
- Audit Logging: All emergency operations logged
- Manual Reset: Requires explicit operator action to clear
- Status Monitoring: Real-time emergency stop status
- Integration: Seamless integration with safety framework
Emergency Stop API
class EmergencyStopManager:
    def activate_emergency_stop(self, station_id: str, pump_id: Optional[str] = None):
        """Activate emergency stop for station or specific pump."""
    def clear_emergency_stop(self, station_id: str, pump_id: Optional[str] = None):
        """Clear emergency stop condition."""
    def is_emergency_stop_active(self, station_id: str, pump_id: Optional[str] = None) -> bool:
        """Check if emergency stop is active."""
3. Database Watchdog (src/monitoring/watchdog.py)
Purpose
Ensures database connectivity and activates failsafe mode if updates stop, preventing stale or unsafe setpoints.
Watchdog Features
- Periodic Health Checks: Continuous database connectivity monitoring
- Failsafe Activation: Automatic activation on connectivity loss
- Graceful Degradation: Safe fallback to default setpoints
- Alert Generation: Immediate notification on watchdog activation
- Auto-Recovery: Automatic recovery when connectivity restored
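The core of the timeout check can be sketched as follows. `WatchdogTimer` is a hypothetical class that covers only the timing logic. The database client and alert manager are left out, and the injectable `clock` parameter is an assumption added so the behavior can be shown without waiting.

```python
import time
from typing import Callable


class WatchdogTimer:
    """Hypothetical timing core of a watchdog: tracks time since the last update."""

    def __init__(self, timeout_seconds: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # monotonic, so wall-clock adjustments do not matter
        self._last_update = clock()

    def record_update(self) -> None:
        """Call whenever a fresh database update has been seen."""
        self._last_update = self._clock()

    def failsafe_required(self) -> bool:
        """True once no update has been seen for longer than the timeout."""
        return self._clock() - self._last_update > self.timeout_seconds


# Demonstration with a fake clock instead of real waiting.
now = [0.0]
watchdog = WatchdogTimer(timeout_seconds=1200, clock=lambda: now[0])
now[0] = 1200.0
print(watchdog.failsafe_required())  # False (exactly at the timeout)
now[0] = 1201.0
print(watchdog.failsafe_required())  # True
watchdog.record_update()
print(watchdog.failsafe_required())  # False (recovered)
```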
Watchdog Configuration
class DatabaseWatchdog:
    def __init__(self, db_client, alert_manager, timeout_seconds: int):
        """
        Args:
            timeout_seconds: Time without updates before failsafe activation
        """
4. Rate of Change Limiting
Purpose
Prevents sudden speed changes that could damage pumps or cause operational issues.
Implementation
def check_rate_of_change(self, previous_setpoint: float, new_setpoint: float) -> bool:
    """Check if rate of change exceeds limits."""
    # The factor of 60 treats consecutive setpoints as one second apart,
    # so the per-call change is expressed in Hz per minute.
    change_per_minute = abs(new_setpoint - previous_setpoint) * 60
    return change_per_minute > self.max_speed_change_hz_per_min

def limit_rate_of_change(self, previous_setpoint: float, new_setpoint: float) -> float:
    """Limit setpoint change to safe rate."""
    max_change = self.max_speed_change_hz_per_min / 60  # Convert to per-second
    if new_setpoint > previous_setpoint:
        return min(new_setpoint, previous_setpoint + max_change)
    else:
        return max(new_setpoint, previous_setpoint - max_change)
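The conversion above treats consecutive setpoints as one second apart. If the evaluation interval differs, the allowed step has to scale with the elapsed time. The sketch below shows one way to do that. `limit_step` and its `elapsed_seconds` parameter are hypothetical and not part of the documented module. With a limit of 30 Hz/min the allowed step is 0.5 Hz per second, so a ramp from 30 Hz to 45 Hz takes 30 one-second cycles.

```python
def limit_step(previous_hz: float, requested_hz: float,
               max_change_hz_per_min: float, elapsed_seconds: float) -> float:
    """Move toward requested_hz by at most the step allowed for elapsed_seconds."""
    max_step = max_change_hz_per_min * elapsed_seconds / 60.0
    delta = requested_hz - previous_hz
    if abs(delta) <= max_step:
        return requested_hz
    return previous_hz + max_step if delta > 0 else previous_hz - max_step


# Ramp from 30 Hz to 45 Hz at 30 Hz/min, evaluated once per second.
current = 30.0
cycles = 0
while current != 45.0:
    current = limit_step(current, 45.0, 30.0, elapsed_seconds=1.0)
    cycles += 1
print(cycles)  # 30

# A 10-second gap between evaluations allows a 5 Hz step.
print(limit_step(30.0, 45.0, 30.0, elapsed_seconds=10.0))  # 35.0
```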
Safety Configuration
Database Schema for Safety Limits
-- Safety limits table
CREATE TABLE safety_limits (
    station_id VARCHAR(50) NOT NULL,
    pump_id VARCHAR(50) NOT NULL,
    hard_min_speed_hz DECIMAL(5,2) NOT NULL,
    hard_max_speed_hz DECIMAL(5,2) NOT NULL,
    hard_min_level_m DECIMAL(6,2),
    hard_max_level_m DECIMAL(6,2),
    hard_max_power_kw DECIMAL(8,2),
    max_speed_change_hz_per_min DECIMAL(5,2) NOT NULL,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (station_id, pump_id)
);
-- Emergency stop status table
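CREATE TABLE emergency_stop_status (
    station_id VARCHAR(50) NOT NULL,
    pump_id VARCHAR(50),  -- NULL means a station-level stop
    active BOOLEAN NOT NULL DEFAULT FALSE,
    activated_at TIMESTAMP,
    activated_by VARCHAR(100),
    reason TEXT
);
-- A PRIMARY KEY cannot contain an expression, so uniqueness per station/pump
-- is enforced with a unique expression index instead (PostgreSQL syntax).
CREATE UNIQUE INDEX emergency_stop_status_key
    ON emergency_stop_status (station_id, COALESCE(pump_id, 'STATION'));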
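Reading the limits for one pump can be sketched as follows. The example uses an in-memory SQLite database purely to stay self-contained; the production database is not named in this document, and `load_speed_limits` is a hypothetical helper. Only the columns needed for speed enforcement are created here.

```python
import sqlite3
from typing import Optional, Tuple

conn = sqlite3.connect(":memory:")
# Reduced copy of the safety_limits table (speed-related columns only).
conn.execute("""
    CREATE TABLE safety_limits (
        station_id VARCHAR(50) NOT NULL,
        pump_id VARCHAR(50) NOT NULL,
        hard_min_speed_hz DECIMAL(5,2) NOT NULL,
        hard_max_speed_hz DECIMAL(5,2) NOT NULL,
        max_speed_change_hz_per_min DECIMAL(5,2) NOT NULL,
        PRIMARY KEY (station_id, pump_id)
    )
""")
conn.execute(
    "INSERT INTO safety_limits VALUES (?, ?, ?, ?, ?)",
    ("station_001", "pump_1", 20.0, 50.0, 30.0),
)


def load_speed_limits(station_id: str, pump_id: str) -> Optional[Tuple[float, ...]]:
    """Return (min_hz, max_hz, max_change_hz_per_min), or None if no row exists."""
    row = conn.execute(
        "SELECT hard_min_speed_hz, hard_max_speed_hz, max_speed_change_hz_per_min "
        "FROM safety_limits WHERE station_id = ? AND pump_id = ?",
        (station_id, pump_id),  # parameters are bound, never string-formatted
    ).fetchone()
    return None if row is None else tuple(float(value) for value in row)


print(load_speed_limits("station_001", "pump_1"))   # (20.0, 50.0, 30.0)
print(load_speed_limits("station_001", "unknown"))  # None
```

A missing row is returned as `None` rather than a default, so the caller has to decide explicitly what to do when a pump has no configured limits.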
Configuration Parameters
Safety Limits Configuration
safety_limits:
  default_hard_min_speed_hz: 20.0
  default_hard_max_speed_hz: 50.0
  default_max_speed_change_hz_per_min: 30.0
  # Per-station overrides
  station_overrides:
    station_001:
      hard_min_speed_hz: 25.0
      hard_max_speed_hz: 48.0
    station_002:
      hard_min_speed_hz: 22.0
      hard_max_speed_hz: 52.0
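How defaults and per-station overrides combine can be sketched as below. The configuration is written as a plain dictionary to avoid a YAML dependency, `resolve_limits` is a hypothetical helper, and the sketch assumes that `station_overrides` is nested under `safety_limits` and that an override replaces the default with the same name minus the `default_` prefix.

```python
from typing import Any, Dict

config: Dict[str, Any] = {
    "safety_limits": {
        "default_hard_min_speed_hz": 20.0,
        "default_hard_max_speed_hz": 50.0,
        "default_max_speed_change_hz_per_min": 30.0,
        "station_overrides": {
            "station_001": {"hard_min_speed_hz": 25.0, "hard_max_speed_hz": 48.0},
            "station_002": {"hard_min_speed_hz": 22.0, "hard_max_speed_hz": 52.0},
        },
    }
}


def resolve_limits(cfg: Dict[str, Any], station_id: str) -> Dict[str, float]:
    """Start from the default_* values, then apply the station's overrides."""
    section = cfg["safety_limits"]
    prefix = "default_"
    resolved = {
        key[len(prefix):]: value
        for key, value in section.items()
        if key.startswith(prefix)
    }
    resolved.update(section.get("station_overrides", {}).get(station_id, {}))
    return resolved


print(resolve_limits(config, "station_001"))
# {'hard_min_speed_hz': 25.0, 'hard_max_speed_hz': 48.0, 'max_speed_change_hz_per_min': 30.0}
print(resolve_limits(config, "station_999"))
# {'hard_min_speed_hz': 20.0, 'hard_max_speed_hz': 50.0, 'max_speed_change_hz_per_min': 30.0}
```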
Watchdog Configuration
watchdog:
  timeout_seconds: 1200  # 20 minutes
  check_interval_seconds: 60
  failsafe_setpoints:
    default_speed_hz: 30.0
    station_overrides:
      station_001: 35.0
      station_002: 28.0
Safety Procedures
Emergency Stop Procedures
Activation Procedure
- Operator Action:
  - Access emergency stop control via REST API or dashboard
  - Select station and/or specific pump
  - Provide reason for emergency stop
  - Confirm activation
- System Response:
  - Immediate setpoint override to 0 Hz
  - Audit log entry with timestamp and operator
  - Alert notification to configured channels
  - Safety status update in all protocol servers
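The first two system responses can be sketched as follows. `EmergencyStopEvent`, `audit_log`, and `activate` are hypothetical names; the record fields mirror the `activated_at`, `activated_by`, and `reason` columns of the `emergency_stop_status` table. Alert delivery and protocol server updates are left out of the sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class EmergencyStopEvent:
    station_id: str
    pump_id: Optional[str]  # None for a station-level stop
    activated_by: str
    reason: str
    activated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


audit_log: List[EmergencyStopEvent] = []  # stand-in for persistent audit storage


def activate(station_id: str, operator: str, reason: str,
             pump_id: Optional[str] = None) -> float:
    """Record the activation and return the overriding setpoint (0 Hz)."""
    if not reason.strip():
        raise ValueError("a reason is required for an emergency stop")
    audit_log.append(EmergencyStopEvent(station_id, pump_id, operator, reason))
    return 0.0


setpoint_hz = activate("station_001", "operator_a", "Discharge pipe leak")
print(setpoint_hz)                 # 0.0
print(audit_log[-1].activated_by)  # operator_a
```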
Clearance Procedure
- Operator Action:
  - Access emergency stop control
  - Verify safe conditions for restart
  - Clear emergency stop condition
  - Confirm clearance
- System Response:
  - Resume normal setpoint calculation
  - Audit log entry for clearance
  - Alert notification of system restoration
  - Safety status update
Failsafe Mode Activation
Automatic Activation Conditions
- Database Connectivity Loss:
  - Watchdog timeout exceeded
  - No successful database updates
  - Automatic failsafe activation
- Safety Framework Failure:
  - Safety limit enforcer unresponsive
  - Emergency stop manager failure
  - Component health check failures
Failsafe Behavior
- Default Setpoints: Pre-configured safe setpoints
- Limited Functionality: Basic operational mode
- Alert Generation: Immediate notification of failsafe activation
- Auto-Recovery: Automatic return to normal operation when safe
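Selecting the setpoint in failsafe mode can be sketched as below. The values are taken from the watchdog configuration shown earlier, while `select_setpoint` and the two constants are hypothetical names. The result would still need to pass through the safety enforcer like any other setpoint.

```python
from typing import Dict

# Values from the failsafe_setpoints section of the watchdog configuration.
FAILSAFE_DEFAULT_HZ = 30.0
FAILSAFE_OVERRIDES_HZ: Dict[str, float] = {"station_001": 35.0, "station_002": 28.0}


def select_setpoint(station_id: str, optimized_hz: float, failsafe_active: bool) -> float:
    """Use the pre-configured failsafe value while failsafe mode is active."""
    if failsafe_active:
        return FAILSAFE_OVERRIDES_HZ.get(station_id, FAILSAFE_DEFAULT_HZ)
    return optimized_hz


print(select_setpoint("station_001", 42.0, failsafe_active=False))  # 42.0
print(select_setpoint("station_001", 42.0, failsafe_active=True))   # 35.0
print(select_setpoint("station_003", 42.0, failsafe_active=True))   # 30.0
```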
Safety Testing & Validation
Unit Testing
class TestSafetyFramework:
    def test_emergency_stop_override(self):
        """Test that emergency stop overrides all other controls."""
    def test_speed_limit_enforcement(self):
        """Test that speed limits are properly enforced."""
    def test_rate_of_change_limiting(self):
        """Test that rate of change limits are enforced."""
    def test_failsafe_activation(self):
        """Test failsafe mode activation on watchdog timeout."""
Integration Testing
class TestSafetyIntegration:
    def test_end_to_end_safety_workflow(self):
        """Test complete safety workflow from optimization to SCADA."""
    def test_emergency_stop_integration(self):
        """Test emergency stop integration with all components."""
    def test_watchdog_integration(self):
        """Test watchdog integration with alert system."""
Validation Procedures
Safety Validation Checklist
- All setpoints pass through safety enforcer
- Emergency stop overrides all controls
- Rate of change limits are enforced
- Failsafe mode activates on connectivity loss
- Audit logging captures all safety events
- Alert system notifies on safety violations
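Two of the checklist items (emergency stop override and speed limit enforcement) can be exercised with small tests like the following. `enforce` is a simplified stand-in written for this example, not the real enforcer: it covers only the emergency stop and the hard speed limits. The violation codes match those used in the enforcement process above. The test functions run under pytest or when called directly.

```python
from typing import List, Tuple


def enforce(setpoint_hz: float, min_hz: float, max_hz: float,
            emergency_stop: bool) -> Tuple[float, List[str]]:
    """Simplified stand-in: emergency stop first, then hard speed limits."""
    if emergency_stop:
        return 0.0, ["EMERGENCY_STOP_ACTIVE"]
    if setpoint_hz < min_hz:
        return min_hz, ["BELOW_MIN_SPEED"]
    if setpoint_hz > max_hz:
        return max_hz, ["ABOVE_MAX_SPEED"]
    return setpoint_hz, []


def test_emergency_stop_override() -> None:
    # Even an in-range setpoint is forced to 0 Hz while the stop is active.
    assert enforce(40.0, 20.0, 50.0, emergency_stop=True) == (0.0, ["EMERGENCY_STOP_ACTIVE"])


def test_speed_limit_enforcement() -> None:
    assert enforce(10.0, 20.0, 50.0, emergency_stop=False) == (20.0, ["BELOW_MIN_SPEED"])
    assert enforce(60.0, 20.0, 50.0, emergency_stop=False) == (50.0, ["ABOVE_MAX_SPEED"])
    assert enforce(35.0, 20.0, 50.0, emergency_stop=False) == (35.0, [])


if __name__ == "__main__":
    test_emergency_stop_override()
    test_speed_limit_enforcement()
    print("all checks passed")
```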
Performance Validation
- Response Time: Safety enforcement < 10ms per setpoint
- Emergency Stop: Immediate activation (< 100ms)
- Watchdog: Detection of connectivity loss within timeout_seconds plus one check_interval_seconds (1,260 s with the sample watchdog configuration)
- Recovery: Graceful recovery from failure conditions
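The response-time target can be checked with a simple timing harness. This is a sketch under assumptions: `clamp` stands in for the real enforcer, and `mean_latency_ms` is a hypothetical helper that reports only the mean. A real validation would time the actual enforcement path, including any limit lookups, and would also look at worst-case latency.

```python
import time


def clamp(value: float, low: float, high: float) -> float:
    """Stand-in for the enforcement call being measured."""
    return max(low, min(high, value))


def mean_latency_ms(iterations: int = 10_000) -> float:
    """Average wall-clock time per call in milliseconds."""
    start = time.perf_counter()
    for i in range(iterations):
        clamp(float(i % 70), 20.0, 50.0)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations


latency = mean_latency_ms()
print(f"mean enforcement latency: {latency:.6f} ms")
assert latency < 10.0, "enforcement exceeds the 10 ms target"
```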
Safety Compliance & Certification
Regulatory Compliance
IEC 61508 / IEC 61511
- Safety Integrity Level (SIL): Designed for SIL 2 requirements
- Fault Tolerance: Redundant safety mechanisms
- Failure Analysis: Comprehensive failure mode analysis
- Safety Validation: Rigorous testing and validation
Industry Standards
- Water/Wastewater: Compliance with industry safety standards
- Municipal Operations: Alignment with municipal safety requirements
- Equipment Protection: Protection of pump and motor equipment
Safety Certification Process
Documentation Requirements
- Safety Requirements Specification (SRS)
- Safety Manual
- Validation Test Reports
- Safety Case Documentation
Testing & Validation
- Safety Function Testing
- Failure Mode Testing
- Integration Testing
- Operational Testing
Safety Monitoring & Reporting
Real-Time Safety Monitoring
Safety Status Dashboard
- Current safety limits for each pump
- Emergency stop status
- Rate of change monitoring
- Watchdog status
- Safety violation history
Safety Metrics
- Safety enforcement statistics
- Emergency stop activations
- Rate of change violations
- Failsafe mode activations
- Response time metrics
Safety Reporting
Daily Safety Reports
- Safety violations summary
- Emergency stop events
- System health status
- Compliance metrics
Compliance Reports
- Safety framework performance
- Regulatory compliance status
- Certification maintenance
- Audit trail verification
Incident Response & Recovery
Safety Incident Response
Incident Classification
- Critical: Equipment damage risk or safety hazard
- Major: Operational impact or safety violation
- Minor: Safety system warnings or alerts
Response Procedures
- Immediate Action: Activate emergency stop if required
- Investigation: Analyze safety violation details
- Correction: Implement corrective actions
- Documentation: Complete incident report
- Prevention: Update safety procedures if needed
System Recovery
Recovery Procedures
- Verify safety system integrity
- Clear emergency stop conditions
- Resume normal operations
- Monitor system performance
- Validate safety enforcement
This safety framework documentation provides comprehensive guidance on the safety mechanisms, procedures, and compliance requirements for the Calejo Control Adapter. All safety-critical operations must follow these documented procedures.