# Alert System Setup Guide
## Overview
The Calejo Control Adapter includes a multi-channel alert system that notifies operators of safety events, system failures, and operational issues through the following channels:
- **Email Alerts** - For all safety events
- **SMS Alerts** - For critical events only
- **Webhook Integration** - For external monitoring systems
- **SCADA Alarms** - For HMI integration (Phase 4)
## Current Implementation Status
### ✅ Fully Implemented
- **Alert routing and management framework**
- **Email sending logic** (requires SMTP configuration)
- **Webhook integration** (requires endpoint configuration)
- **Alert history and statistics**
- **Multi-channel coordination**
### ⚠️ Requires External Configuration
- **SMS Integration** - Needs a Twilio account with billing set up; the sending code currently only logs SMS alerts
- **Email Integration** - Needs real SMTP server credentials
- **Webhook Integration** - Needs real webhook endpoints
- **SCADA Integration** - Not yet implemented; planned for Phase 4
## Configuration Guide
### 1. Email Alert Setup
#### Prerequisites
- SMTP server access (Gmail, Office 365, or company SMTP)
- Valid email account credentials
#### Configuration Steps
1. **Update environment variables** in `config/.env`:
```bash
# Email Configuration
SMTP_SERVER=smtp.gmail.com
SMTP_PORT=587
SMTP_USERNAME=your-email@gmail.com
# Use an app password for Gmail
SMTP_PASSWORD=your-app-password
SMTP_USE_TLS=true
```
2. **Update settings** in `config/settings.py`:
```python
alert_email_enabled: bool = True
alert_email_from: str = "calejo-control@your-company.com"
alert_email_recipients: List[str] = ["operator1@company.com", "operator2@company.com"]
```
3. **For Gmail users**:
   - Enable 2-factor authentication
   - Generate an "App Password" for the application
   - Use the app password instead of your regular password
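
For orientation, the settings above map onto Python's standard `smtplib` roughly as follows. This is a minimal sketch, not the adapter's actual email code: `build_alert_email` and `send_alert_email` are hypothetical helpers, and it assumes port 587 with STARTTLS, matching `SMTP_USE_TLS=true`.

```python
import smtplib
from email.message import EmailMessage
from typing import List


def build_alert_email(sender: str, recipients: List[str],
                      severity: str, message: str) -> EmailMessage:
    """Build a plain-text alert email (hypothetical helper)."""
    email = EmailMessage()
    email["From"] = sender
    email["To"] = ", ".join(recipients)
    email["Subject"] = f"[{severity}] Calejo Control Adapter alert"
    email.set_content(message)
    return email


def send_alert_email(email: EmailMessage, server: str, port: int,
                     username: str, password: str) -> None:
    """Send the message over SMTP with STARTTLS (hypothetical helper)."""
    with smtplib.SMTP(server, port, timeout=10) as smtp:
        smtp.starttls()  # upgrade to TLS before sending credentials
        smtp.login(username, password)
        smtp.send_message(email)
```

If a script like this fails at `login`, the credentials or app password are the likely cause; a failure at `starttls` or on connect usually points to the port or firewall instead.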
#### Testing Email Configuration
```python
# Test email configuration
import asyncio

from src.monitoring.alerts import AlertManager
from config.settings import Settings


async def main() -> None:
    settings = Settings()
    alert_manager = AlertManager(settings)
    # Send test alert
    await alert_manager.send_alert(
        alert_type="TEST_ALERT",
        severity="INFO",
        message="Test email configuration",
        context={"test": "email"},
    )


# send_alert is awaited, so it must run inside an event loop
asyncio.run(main())
```
### 2. SMS Alert Setup (Twilio)
#### Prerequisites
- Twilio account (https://www.twilio.com)
- Verified phone numbers for recipients
- Billing information (SMS costs money)
#### Configuration Steps
1. **Get Twilio credentials**:
   - Sign up for a Twilio account
   - Get the Account SID and Auth Token from the dashboard
   - Get your Twilio phone number
2. **Update environment variables** in `config/.env`:
```bash
# Twilio Configuration
TWILIO_ACCOUNT_SID=ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_PHONE_NUMBER=+15551234567
```
3. **Update settings** in `config/settings.py`:
```python
alert_sms_enabled: bool = True
alert_sms_recipients: List[str] = ["+393401234567", "+393407654321"]
```
4. **Implement SMS sending** (currently only logs):
   - The current implementation only logs SMS alerts
   - To send real SMS messages, implement the Twilio integration in `src/monitoring/alerts.py`:
```python
# In _send_sms_alert method, replace the TODO section with:
from twilio.rest import Client

client = Client(self.settings.twilio_account_sid, self.settings.twilio_auth_token)
# Send one message per configured recipient
for phone_number in self.settings.alert_sms_recipients:
    message = client.messages.create(
        body=f"[{alert_data['severity']}] {alert_data['message']}",
        from_=self.settings.twilio_phone_number,
        to=phone_number,
    )
```
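
Twilio expects phone numbers in E.164 format, as in the recipient examples above. The following sketch shows one way to catch malformed entries in `alert_sms_recipients` before any message is sent; `invalid_recipients` is a hypothetical helper, not part of the adapter.

```python
import re
from typing import List

# E.164: "+" followed by at most 15 digits, the first of which is non-zero.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def invalid_recipients(numbers: List[str]) -> List[str]:
    """Return the entries that are not valid E.164 numbers (hypothetical helper)."""
    return [n for n in numbers if not E164_PATTERN.fullmatch(n)]
```

An empty result means every configured number is at least syntactically valid; it does not confirm that the number is verified in Twilio.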
### 3. Webhook Integration
#### Prerequisites
- Webhook endpoint URL
- Authentication token (if required)
#### Configuration Steps
1. **Update environment variables** in `config/.env`:
```bash
# Webhook Configuration
ALERT_WEBHOOK_URL=https://your-monitoring-system.com/webhook
ALERT_WEBHOOK_TOKEN=your_bearer_token
```
2. **Update settings** in `config/settings.py`:
```python
alert_webhook_enabled: bool = True
alert_webhook_url: str = "https://your-monitoring-system.com/webhook"
alert_webhook_token: str = "your_bearer_token"
```
#### Webhook Payload Format
When an alert is triggered, the webhook receives a JSON payload:
```json
{
  "alert_type": "SAFETY_VIOLATION",
  "severity": "ERROR",
  "message": "Speed limit exceeded",
  "context": {
    "requested_speed": 55.0,
    "max_speed": 50.0
  },
  "station_id": "STATION_001",
  "pump_id": "PUMP_001",
  "timestamp": 1234567890.0,
  "app_name": "Calejo Control Adapter",
  "app_version": "2.0.0"
}
```
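
To exercise a receiving endpoint independently of the adapter, a request with the same shape can be built using only the standard library. This is a sketch under two assumptions: the token is sent as an HTTP `Bearer` token in the `Authorization` header, and `build_webhook_request` is a hypothetical helper rather than the adapter's own code.

```python
import json
import urllib.request
from typing import Any, Dict


def build_webhook_request(url: str, token: str,
                          payload: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request carrying an alert payload (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the receiver expects a Bearer token.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(request, timeout=10)` sends it; an HTTP error status raises `urllib.error.HTTPError`, which makes authentication problems easy to spot.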
### 4. SCADA Alarm Integration (Phase 4)
**Status**: Planned for Phase 4 implementation
When implemented, SCADA alarms will:
- Trigger alarms in SCADA HMI systems via OPC UA
- Provide visual and audible alerts in control rooms
- Integrate with existing alarm management systems
## Alert Types and Severity
### Alert Types
- `SAFETY_VIOLATION` - Safety limit exceeded
- `FAILSAFE_ACTIVATED` - Failsafe mode activated
- `EMERGENCY_STOP` - Emergency stop activated
- `SYSTEM_ERROR` - System or communication error
- `WATCHDOG_TIMEOUT` - Database update timeout
### Severity Levels
- `INFO` - Informational messages
- `WARNING` - Non-critical warnings
- `ERROR` - Errors requiring attention
- `CRITICAL` - Critical failures requiring immediate action
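
The severity levels above are ordered, which is what makes a rule such as "SMS for critical events only" easy to express. The sketch below illustrates the idea; the `Severity` enum, its numeric values, and `channels_for` are hypothetical, and it assumes email and webhook receive every alert. The adapter's real routing may differ.

```python
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    """Ordered severity levels (numeric values are illustrative)."""
    INFO = 10
    WARNING = 20
    ERROR = 30
    CRITICAL = 40


def channels_for(severity: str) -> List[str]:
    """Pick delivery channels for a severity name (illustrative policy only)."""
    level = Severity[severity]  # raises KeyError for unknown names
    channels = ["email", "webhook"]
    if level >= Severity.CRITICAL:
        channels.append("sms")  # SMS is reserved for critical events
    return channels
```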
## Testing the Alert System
### Unit Tests
Run the alert system unit tests:
```bash
pytest tests/unit/test_alerts.py -v
```
### Manual Testing
```python
# Test all alert channels
import asyncio

from src.monitoring.alerts import AlertManager
from config.settings import Settings

# Test different alert types and severities
test_alerts = [
    ("SAFETY_VIOLATION", "ERROR", "Speed limit exceeded"),
    ("FAILSAFE_ACTIVATED", "CRITICAL", "Failsafe mode activated"),
    ("SYSTEM_ERROR", "WARNING", "Communication timeout"),
]


async def main() -> None:
    settings = Settings()
    alert_manager = AlertManager(settings)
    for alert_type, severity, message in test_alerts:
        result = await alert_manager.send_alert(
            alert_type=alert_type,
            severity=severity,
            message=message,
            context={"test": True},
        )
        print(f"{alert_type}: {result}")


# send_alert is awaited, so it must run inside an event loop
asyncio.run(main())
```
## Troubleshooting
### Common Issues
1. **Email not sending**:
   - Check SMTP server credentials
   - Verify TLS/SSL settings
   - Check firewall rules for outbound SMTP
2. **SMS not working**:
   - Verify Twilio account is active and funded
   - Check phone numbers are verified in Twilio
   - Ensure SMS integration is implemented (currently only logs)
3. **Webhook failures**:
   - Verify webhook URL is accessible
   - Check authentication tokens
   - Monitor webhook server logs
4. **No alerts being sent**:
   - Check alert channels are enabled in settings
   - Verify alert system is initialized in main application
   - Check application logs for alert-related errors
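
When a firewall is suspected, a plain TCP connection test from the adapter host separates network problems from credential problems. A minimal sketch using the standard library follows; `can_connect` is a hypothetical helper.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, `can_connect("smtp.gmail.com", 587)` returning `False` suggests that outbound SMTP is blocked or that name resolution is failing, before any credentials are involved.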
### Logging
Alert system activities are logged with the following events:
- `alert_sent` - Alert successfully sent
- `email_alert_failed` - Email delivery failed
- `sms_alert_failed` - SMS delivery failed
- `webhook_alert_failed` - Webhook delivery failed
- `scada_alert_failed` - SCADA alarm failed
## Security Considerations
- Store SMTP, Twilio, and webhook credentials in environment variables rather than in source code
- Use app passwords instead of regular passwords for email
- Rotate authentication tokens regularly
- Monitor alert system for abuse or excessive alerts
- Implement rate limiting if needed
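
If rate limiting turns out to be needed, a sliding-window counter is one simple option. The sketch below is illustrative only: `AlertRateLimiter` is a hypothetical class that is not part of the adapter, and the limits would need tuning so that critical alerts are never suppressed.

```python
import time
from collections import deque
from typing import Deque, Optional


class AlertRateLimiter:
    """Allow at most max_alerts within a sliding window of window_seconds."""

    def __init__(self, max_alerts: int, window_seconds: float) -> None:
        self.max_alerts = max_alerts
        self.window_seconds = window_seconds
        self._sent: Deque[float] = deque()

    def allow(self, now: Optional[float] = None) -> bool:
        """Record and permit an alert if the window still has capacity."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have fallen out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_alerts:
            return False
        self._sent.append(now)
        return True
```

A caller would check `limiter.allow()` before dispatching to a channel and log the suppressed alert when it returns `False`, so that nothing is dropped silently.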
## Monitoring and Maintenance
### Alert Statistics
Use the built-in statistics to monitor alert patterns:
```python
alert_manager = AlertManager(settings)
stats = alert_manager.get_alert_stats()
print(f"Total alerts: {stats['total_alerts']}")
print(f"Severity counts: {stats['severity_counts']}")
print(f"Type counts: {stats['type_counts']}")
```
### Alert History
Review recent alerts:
```python
recent_alerts = alert_manager.get_alert_history(limit=50)
for alert in recent_alerts:
    print(f"{alert['timestamp']} - {alert['alert_type']}: {alert['message']}")
```
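
If history entries carry the same epoch-seconds `timestamp` as the webhook payload (an assumption; the history format is not documented here), the raw float can be rendered as a readable UTC time. `format_alert` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def format_alert(alert: Dict[str, Any]) -> str:
    """Render one history entry with an ISO 8601 UTC timestamp."""
    # Assumption: 'timestamp' is seconds since the Unix epoch.
    when = datetime.fromtimestamp(alert["timestamp"], tz=timezone.utc)
    return f"{when.isoformat()} - {alert['alert_type']}: {alert['message']}"
```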
## Next Steps
1. **Immediate**: Configure email and webhook for basic alerting
2. **Short-term**: Implement Twilio SMS integration if needed
3. **Long-term**: Implement SCADA OPC UA alarm integration in Phase 4

For questions or issues with alert system setup, refer to the application logs or contact the development team.