Added email monitoring

Jonas Linter
2025-10-15 08:46:25 +02:00
parent bb900ab1ee
commit f22684d592
11 changed files with 2279 additions and 4 deletions

docs/EMAIL_MONITORING.md Normal file

@@ -0,0 +1,423 @@
# Email Monitoring and Alerting
This document describes the email monitoring and alerting system for the AlpineBits Python server.
## Overview
The email monitoring system provides two main features:
1. **Error Alerts**: Automatic email notifications when errors occur in the application
2. **Daily Reports**: Scheduled daily summary emails with statistics and error logs
## Architecture
### Components
- **EmailService** ([email_service.py](../src/alpine_bits_python/email_service.py)): Core SMTP email sending functionality
- **EmailAlertHandler** ([email_monitoring.py](../src/alpine_bits_python/email_monitoring.py)): Custom logging handler that captures errors and sends alerts
- **DailyReportScheduler** ([email_monitoring.py](../src/alpine_bits_python/email_monitoring.py)): Background task that sends daily reports
### How It Works
#### Error Alerts (Hybrid Approach)
The `EmailAlertHandler` uses a **hybrid threshold + time-based** approach:
1. **Immediate Alerts**: If the error threshold is reached (e.g., 5 errors), an alert email is sent immediately
2. **Buffered Alerts**: Otherwise, errors accumulate in a buffer and are sent after the buffer duration (e.g., 15 minutes)
3. **Cooldown Period**: After sending an alert, the system waits for a cooldown period before sending another alert to prevent spam
**Flow Diagram:**
```
Error occurs
    ↓
Add to buffer
    ↓
Buffer >= threshold? ──Yes──> Send immediate alert
    ↓ No                              ↓
Wait for buffer time              Reset buffer
    ↓                                 ↓
Send buffered alert               Enter cooldown
    ↓
Reset buffer
```
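The hybrid rule above can be modelled in a few lines. This is a minimal sketch, not the actual `EmailAlertHandler`: the class name `HybridAlertBuffer`, its methods, and the explicit `now` argument are hypothetical, and it assumes that errors arriving during the cooldown stay buffered until the cooldown has passed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HybridAlertBuffer:
    """Illustrative model of the threshold + time-based rule (hypothetical)."""

    error_threshold: int = 5
    buffer_seconds: float = 15 * 60
    cooldown_seconds: float = 15 * 60
    errors: List[str] = field(default_factory=list)
    first_error_at: Optional[float] = None
    last_sent_at: Optional[float] = None

    def _in_cooldown(self, now: float) -> bool:
        if self.last_sent_at is None:
            return False
        return now - self.last_sent_at < self.cooldown_seconds

    def _flush(self, now: float) -> List[str]:
        # Hand back the buffered errors, reset the buffer, start the cooldown.
        batch, self.errors = self.errors, []
        self.first_error_at = None
        self.last_sent_at = now
        return batch

    def add(self, message: str, now: float) -> Optional[List[str]]:
        """Record an error; return a batch to send if the threshold is reached."""
        if not self.errors:
            self.first_error_at = now
        self.errors.append(message)
        if len(self.errors) >= self.error_threshold and not self._in_cooldown(now):
            return self._flush(now)
        return None

    def tick(self, now: float) -> Optional[List[str]]:
        """Called periodically; return a batch once the buffer window has elapsed."""
        if not self.errors or self._in_cooldown(now):
            return None
        if now - self.first_error_at >= self.buffer_seconds:
            return self._flush(now)
        return None
```

Passing `now` explicitly keeps the sketch deterministic and easy to test; a real handler would more likely read a monotonic clock and run `tick` from a background timer.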
#### Daily Reports
The `DailyReportScheduler` runs as a background task that:
1. Waits until the configured send time (e.g., 8:00 AM)
2. Collects statistics from the application
3. Gathers errors that occurred during the day
4. Formats and sends an email report
5. Clears the error log
6. Schedules the next report for the following day
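Steps 1 and 6 both come down to computing how long to sleep until the next send time. The helper below is a sketch of one way to do that; the name `seconds_until_next_send` is hypothetical and the real `DailyReportScheduler` may compute the delay differently.

```python
from datetime import datetime, timedelta


def seconds_until_next_send(send_time: str, now: datetime) -> float:
    """Return the seconds from `now` until the next occurrence of `send_time` ("HH:MM")."""
    hour, minute = (int(part) for part in send_time.split(":"))
    # Keep the date (and any tzinfo) of `now`, replace only the time of day.
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()
```

A scheduler loop could then `await asyncio.sleep(seconds_until_next_send("08:00", datetime.now()))` before building each report.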
## Configuration
### Email Configuration Keys
Add the following to your [config.yaml](../config/config.yaml):
```yaml
email:
  # SMTP server configuration
  smtp:
    host: "smtp.gmail.com"            # Your SMTP server hostname
    port: 587                         # SMTP port (587 for TLS, 465 for SSL)
    username: !secret EMAIL_USERNAME  # SMTP username (use !secret for env vars)
    password: !secret EMAIL_PASSWORD  # SMTP password (use !secret for env vars)
    use_tls: true                     # Use STARTTLS encryption
    use_ssl: false                    # Use SSL/TLS from start (mutually exclusive with use_tls)

  # Sender information
  from_address: "noreply@99tales.com"
  from_name: "AlpineBits Monitor"

  # Monitoring and alerting
  monitoring:
    # Daily report configuration
    daily_report:
      enabled: true                   # Enable/disable daily reports
      recipients:
        - "admin@99tales.com"
        - "dev@99tales.com"
      send_time: "08:00"              # Time to send (24h format, local time)
      include_stats: true             # Include application statistics
      include_errors: true            # Include error summary

    # Error alert configuration
    error_alerts:
      enabled: true                   # Enable/disable error alerts
      recipients:
        - "alerts@99tales.com"
        - "oncall@99tales.com"
      error_threshold: 5              # Send immediate alert after N errors
      buffer_minutes: 15              # Wait N minutes before sending buffered errors
      cooldown_minutes: 15            # Wait N minutes before sending another alert
      log_levels:                     # Log levels to monitor
        - "ERROR"
        - "CRITICAL"
```
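Because `use_tls` and `use_ssl` are mutually exclusive, it can help to check the `smtp` block before the server starts. The function below is only a sketch: `check_smtp_settings` is a hypothetical helper, not part of the project, and it assumes the settings arrive as a plain dict with the keys shown above.

```python
def check_smtp_settings(smtp: dict) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    use_tls = bool(smtp.get("use_tls", False))
    use_ssl = bool(smtp.get("use_ssl", False))
    port = smtp.get("port")

    if use_tls and use_ssl:
        problems.append("use_tls and use_ssl are mutually exclusive")
    if not use_tls and not use_ssl:
        problems.append("neither use_tls nor use_ssl is set")
    # The port conventions below are common defaults, not hard requirements.
    if use_ssl and port == 587:
        problems.append("port 587 normally expects STARTTLS (use_tls)")
    if use_tls and port == 465:
        problems.append("port 465 normally expects implicit SSL (use_ssl)")
    if not smtp.get("host"):
        problems.append("host is missing")
    return problems
```

Returning a list instead of raising on the first problem lets a startup check report every misconfiguration at once.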
### Environment Variables
For security, store sensitive credentials in environment variables:
```bash
# Create a .env file (never commit this!)
EMAIL_USERNAME=your-smtp-username@gmail.com
EMAIL_PASSWORD=your-smtp-app-password
```
The `annotatedyaml` library automatically loads values marked with `!secret` from environment variables.
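Before starting the server, a quick check that the variables are actually set can save a debugging round trip. This is a small sketch using only the standard library; `missing_env_vars` is a hypothetical helper, and the variable names are the ones used in the example above.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(["EMAIL_USERNAME", "EMAIL_PASSWORD"])
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All email environment variables are set")
```

The check reports only variable names, never their values, so it is safe to run in shared terminals or CI logs.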
### Gmail Configuration
If using Gmail, you need to:
1. Enable 2-factor authentication on your Google account
2. Generate an "App Password" for SMTP access
3. Use the app password as `EMAIL_PASSWORD`
**Gmail Settings:**
```yaml
smtp:
  host: "smtp.gmail.com"
  port: 587
  use_tls: true
  use_ssl: false
```
### Other SMTP Providers
**SendGrid:**
```yaml
smtp:
  host: "smtp.sendgrid.net"
  port: 587
  username: "apikey"
  password: !secret SENDGRID_API_KEY
  use_tls: true
```
**AWS SES:**
```yaml
smtp:
  host: "email-smtp.us-east-1.amazonaws.com"
  port: 587
  username: !secret AWS_SES_USERNAME
  password: !secret AWS_SES_PASSWORD
  use_tls: true
```
## Usage
### Automatic Error Monitoring
Once configured, the system automatically captures all `ERROR` and `CRITICAL` log messages:
```python
from alpine_bits_python.logging_config import get_logger

_LOGGER = get_logger(__name__)

# This error will be captured and sent via email
_LOGGER.error("Database connection failed")

# This will also be captured
try:
    risky_operation()
except Exception:
    _LOGGER.exception("Operation failed")  # Includes stack trace
```
### Triggering Test Alerts
To test your email configuration, you can manually trigger errors:
```python
import logging

_LOGGER = logging.getLogger(__name__)

# Generate multiple errors to trigger immediate alert (if threshold = 5)
for i in range(5):
    _LOGGER.error(f"Test error {i + 1}")
```
### Daily Report Statistics
To include custom statistics in daily reports, set a stats collector function:
```python
async def collect_stats():
    """Collect application statistics for daily report."""
    return {
        "total_reservations": await count_reservations(),
        "new_customers": await count_new_customers(),
        "active_hotels": await count_active_hotels(),
        "api_requests": get_request_count(),
    }


# Register the collector
report_scheduler = app.state.report_scheduler
if report_scheduler:
    report_scheduler.set_stats_collector(collect_stats)
```
## Email Templates
### Error Alert Email
**Subject:** 🚨 AlpineBits Error Alert: 5 errors (threshold exceeded)
**Body:**
```
Error Alert - 2025-10-15 14:30:45
======================================================================
Alert Type: Immediate Alert
Error Count: 5
Time Range: 14:25:00 to 14:30:00
Reason: (threshold of 5 exceeded)
======================================================================
Errors:
----------------------------------------------------------------------
[2025-10-15 14:25:12] ERROR: Database connection timeout
Module: db:245 (alpine_bits_python.db)
[2025-10-15 14:26:34] ERROR: Failed to process reservation
Module: api:567 (alpine_bits_python.api)
Exception:
Traceback (most recent call last):
...
----------------------------------------------------------------------
Generated by AlpineBits Email Monitoring at 2025-10-15 14:30:45
```
### Daily Report Email
**Subject:** AlpineBits Daily Report - 2025-10-15
**Body (HTML, shown here as rendered text):**
```html
AlpineBits Daily Report
Date: 2025-10-15
Statistics
┌────────────────────────┬────────┐
│ Metric │ Value │
├────────────────────────┼────────┤
│ total_reservations │ 42 │
│ new_customers │ 15 │
│ active_hotels │ 4 │
│ api_requests │ 1,234 │
└────────────────────────┴────────┘
Errors (3)
┌──────────────┬──────────┬─────────────────────────┐
│ Time │ Level │ Message │
├──────────────┼──────────┼─────────────────────────┤
│ 08:15:23 │ ERROR │ Connection timeout │
│ 12:45:10 │ ERROR │ Invalid form data │
│ 18:30:00 │ CRITICAL │ Database unavailable │
└──────────────┴──────────┴─────────────────────────┘
Generated by AlpineBits Server
```
## Monitoring and Troubleshooting
### Check Email Configuration
```python
from alpine_bits_python.email_service import create_email_service
from alpine_bits_python.config_loader import load_config

config = load_config()
email_service = create_email_service(config)

if email_service:
    print("✓ Email service configured")
else:
    print("✗ Email service not configured")
```
### Test Email Sending
```python
import asyncio

from alpine_bits_python.email_service import EmailService, EmailConfig


async def test_email():
    config = EmailConfig({
        "smtp": {
            "host": "smtp.gmail.com",
            "port": 587,
            "username": "your-email@gmail.com",
            "password": "your-app-password",
            "use_tls": True,
        },
        "from_address": "sender@example.com",
        "from_name": "Test",
    })
    service = EmailService(config)
    result = await service.send_email(
        recipients=["recipient@example.com"],
        subject="Test Email",
        body="This is a test email from AlpineBits server.",
    )
    if result:
        print("✓ Email sent successfully")
    else:
        print("✗ Email sending failed")


asyncio.run(test_email())
```
### Common Issues
**Issue: "Authentication failed"**
- Verify SMTP username and password are correct
- For Gmail, ensure you're using an App Password, not your regular password
- Check that 2-factor authentication is enabled on your Google account (App Passwords require it)
**Issue: "Connection timeout"**
- Verify SMTP host and port are correct
- Check firewall rules allow outbound SMTP connections
- Try port 465 with `use_ssl: true` (and `use_tls: false`) instead of port 587 with `use_tls: true`
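To tell a network problem apart from an SMTP-level problem, a plain TCP connection test is often enough. The sketch below uses only the standard library; `can_reach` is a hypothetical helper and the host and port in the usage example are the Gmail values from this document.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    print("reachable" if can_reach("smtp.gmail.com", 587) else "not reachable")
```

If this returns `False`, the cause is likely a firewall, DNS, or a wrong host or port, and no change to the credentials will help.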
**Issue: "No email alerts received"**
- Check that `enabled: true` is set under `error_alerts` in the config
- Verify recipient email addresses are correct
- Check application logs for email sending errors
- Ensure errors are being logged at ERROR or CRITICAL level
**Issue: "Too many emails being sent"**
- Increase `cooldown_minutes` to reduce alert frequency
- Increase `buffer_minutes` to batch more errors together
- Increase `error_threshold` to only alert on serious issues
## Performance Considerations
### SMTP is Blocking
Email sending uses the standard Python `smtplib`, which performs blocking I/O. To prevent blocking the async event loop:
- Email operations are automatically run in a thread pool executor
- This happens transparently via `loop.run_in_executor()`
- Request handling is not blocked while an email is being sent
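The executor pattern looks roughly like this. It is a sketch of the general technique, not the project's `EmailService`: `blocking_send` is a hypothetical stand-in that sleeps instead of talking to an SMTP server.

```python
import asyncio
import functools
import time


def blocking_send(recipient: str, delay: float = 0.05) -> str:
    """Stand-in for a blocking smtplib call (hypothetical; sleeps instead of sending)."""
    time.sleep(delay)
    return f"sent to {recipient}"


async def send_without_blocking(recipient: str) -> str:
    """Run the blocking call in the default thread pool so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, functools.partial(blocking_send, recipient))


async def main() -> list:
    # Both sends overlap because each runs in its own worker thread.
    return await asyncio.gather(
        send_without_blocking("a@example.com"),
        send_without_blocking("b@example.com"),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Passing `None` as the executor uses the loop's default thread pool; a dedicated executor could be substituted if email sending should be isolated from other offloaded work.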
### Memory Usage
- Error buffer size is limited by `buffer_minutes` duration
- Old errors are automatically cleared after sending
- Daily report error log is cleared after each report
- Typical memory usage: <1 MB for error buffering
### Error Handling
- Email sending failures are logged but never crash the application
- If SMTP is unavailable, errors are logged to console/file as normal
- The logging handler is exception-safe: a failure inside the handler never propagates into application code
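Exception safety in a logging handler usually means wrapping the delivery step in `emit()`. The class below is a minimal sketch of that idea; `SafeAlertHandler` and its `deliver` callback are hypothetical and do not mirror the real `EmailAlertHandler` API.

```python
import logging


class SafeAlertHandler(logging.Handler):
    """Hypothetical handler: forwards records to `deliver` and swallows delivery failures."""

    def __init__(self, deliver, level=logging.ERROR):
        super().__init__(level=level)
        self._deliver = deliver
        self.failures = 0

    def emit(self, record: logging.LogRecord) -> None:
        try:
            self._deliver(self.format(record))
        except Exception:
            # Never let a delivery problem propagate into the code that logged.
            self.failures += 1
```

Counting failures instead of re-raising keeps the calling code unaffected while still leaving a signal that can be surfaced elsewhere, for example in the daily report.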
## Security Considerations
1. **Never commit credentials to git**
- Use `!secret` annotation in YAML
- Store credentials in environment variables
- Add `.env` to `.gitignore`
2. **Use TLS/SSL encryption**
- Always set `use_tls: true` or `use_ssl: true`
- Never send credentials in plaintext
3. **Limit email recipients**
- Only send alerts to authorized personnel
- Use dedicated monitoring email addresses
- Consider using distribution lists
4. **Sensitive data in logs**
- Be careful not to log passwords, API keys, or PII
- Error messages in emails may contain sensitive context
- Review log messages before enabling email alerts
## Testing
Run the test suite:
```bash
# Test email service only
uv run pytest tests/test_email_service.py -v
# Test with coverage
uv run pytest tests/test_email_service.py --cov=alpine_bits_python.email_service --cov=alpine_bits_python.email_monitoring
```
## Future Enhancements
Potential improvements for future versions:
- [ ] Support for email templates (Jinja2)
- [ ] Configurable retry logic for failed sends
- [ ] Email queuing for high-volume scenarios
- [ ] Integration with external monitoring services (PagerDuty, Slack)
- [ ] Weekly/monthly report options
- [ ] Custom alert rules based on error patterns
- [ ] Email attachments for detailed logs
- [ ] HTML email styling improvements
## References
- [Python smtplib Documentation](https://docs.python.org/3/library/smtplib.html)
- [Python logging Documentation](https://docs.python.org/3/library/logging.html)
- [Gmail SMTP Settings](https://support.google.com/mail/answer/7126229)
- [annotatedyaml Documentation](https://github.com/yourusername/annotatedyaml)


@@ -0,0 +1,172 @@
# Email Monitoring Quick Start
Get email notifications for errors and daily reports in 5 minutes.
## 1. Configure SMTP Settings
Edit `config/config.yaml` and add:
```yaml
email:
  smtp:
    host: "smtp.gmail.com"
    port: 587
    username: !secret EMAIL_USERNAME
    password: !secret EMAIL_PASSWORD
    use_tls: true
  from_address: "noreply@yourdomain.com"
  from_name: "AlpineBits Monitor"
```
## 2. Set Environment Variables
Create a `.env` file in the project root:
```bash
EMAIL_USERNAME=your-email@gmail.com
EMAIL_PASSWORD=your-app-password
```
> **Note:** For Gmail, use an [App Password](https://support.google.com/accounts/answer/185833), not your regular password.
## 3. Enable Error Alerts
In `config/config.yaml`:
```yaml
email:
  monitoring:
    error_alerts:
      enabled: true
      recipients:
        - "alerts@yourdomain.com"
      error_threshold: 5
      buffer_minutes: 15
      cooldown_minutes: 15
```
**How it works:**
- Sends an immediate alert once 5 errors have accumulated
- Otherwise sends the buffered errors after 15 minutes
- Waits 15 minutes between alerts (cooldown)
## 4. Enable Daily Reports (Optional)
In `config/config.yaml`:
```yaml
email:
  monitoring:
    daily_report:
      enabled: true
      recipients:
        - "admin@yourdomain.com"
      send_time: "08:00"
      include_stats: true
      include_errors: true
```
## 5. Test Your Configuration
Run the test script:
```bash
uv run python examples/test_email_monitoring.py
```
This will:
- ✅ Send a test email
- ✅ Trigger an error alert
- ✅ Send a test daily report
## What You Get
### Error Alert Email
When errors occur, you'll receive:
```
🚨 AlpineBits Error Alert: 5 errors (threshold exceeded)
Error Count: 5
Time Range: 14:25:00 to 14:30:00
Errors:
----------------------------------------------------------------------
[2025-10-15 14:25:12] ERROR: Database connection timeout
Module: db:245
[2025-10-15 14:26:34] ERROR: Failed to process reservation
Module: api:567
Exception: ValueError: Invalid hotel code
```
### Daily Report Email
Every day at 8 AM, you'll receive:
```
📊 AlpineBits Daily Report - 2025-10-15
Statistics:
total_reservations: 42
new_customers: 15
active_hotels: 4
Errors (3):
[08:15:23] ERROR: Connection timeout
[12:45:10] ERROR: Invalid form data
[18:30:00] CRITICAL: Database unavailable
```
## Troubleshooting
### No emails received?
1. Check your SMTP credentials:
```bash
echo $EMAIL_USERNAME
echo $EMAIL_PASSWORD
```
2. Check application logs for errors:
```bash
tail -f alpinebits.log | grep -i email
```
3. Test SMTP connection manually:
```bash
uv run python -c "
import smtplib
with smtplib.SMTP('smtp.gmail.com', 587) as smtp:
    smtp.starttls()
    smtp.login('$EMAIL_USERNAME', '$EMAIL_PASSWORD')
    print('✅ SMTP connection successful')
"
```
### Gmail authentication failed?
- Enable 2-factor authentication on your Google account
- Generate an App Password at https://myaccount.google.com/apppasswords
- Use the App Password (not your regular password)
### Too many emails?
- Increase `error_threshold` to only alert on serious issues
- Increase `buffer_minutes` to batch more errors together
- Increase `cooldown_minutes` to reduce alert frequency
## Next Steps
- Read the full [Email Monitoring Documentation](./EMAIL_MONITORING.md)
- Configure custom statistics for daily reports
- Set up multiple recipient groups
- Integrate with Slack or PagerDuty (coming soon)
## Support
For issues or questions:
- Check the [documentation](./EMAIL_MONITORING.md)
- Review [test examples](../examples/test_email_monitoring.py)
- Open an issue on GitHub