# Email Monitoring and Alerting

This document describes the email monitoring and alerting system for the AlpineBits Python server.

## Overview

The email monitoring system provides two main features:

1. **Error Alerts**: Automatic email notifications when errors occur in the application
2. **Daily Reports**: Scheduled daily summary emails with statistics and error logs

## Architecture

### Components

- **EmailService** ([email_service.py](../src/alpine_bits_python/email_service.py)): Core SMTP email sending functionality
- **EmailAlertHandler** ([email_monitoring.py](../src/alpine_bits_python/email_monitoring.py)): Custom logging handler that captures errors and sends alerts
- **DailyReportScheduler** ([email_monitoring.py](../src/alpine_bits_python/email_monitoring.py)): Background task that sends daily reports

### How It Works

#### Error Alerts (Hybrid Approach)

The `EmailAlertHandler` uses a **hybrid threshold + time-based** approach:

1. **Immediate Alerts**: If the error threshold is reached (e.g., 5 errors), an alert email is sent immediately
2. **Buffered Alerts**: Otherwise, errors accumulate in a buffer and are sent after the buffer duration (e.g., 15 minutes)
3. **Cooldown Period**: After sending an alert, the system waits for a cooldown period before sending another alert to prevent spam

**Flow Diagram:**

```
Error occurs
    ↓
Add to buffer
    ↓
Buffer >= threshold? ──Yes──> Send immediate alert
    ↓ No                              ↓
Wait for buffer time           Reset buffer
    ↓                                 ↓
Send buffered alert            Enter cooldown
    ↓
Reset buffer
```

#### Daily Reports

The `DailyReportScheduler` runs as a background task that:

1. Waits until the configured send time (e.g., 8:00 AM)
2. Collects statistics from the application
3. Gathers errors that occurred during the day
4. Formats and sends an email report
5. Clears the error log
6. Schedules the next report for the following day

## Configuration

### Email Configuration Keys

Add the following to your [config.yaml](../config/config.yaml):

```yaml
email:
  # SMTP server configuration
  smtp:
    host: "smtp.gmail.com"              # Your SMTP server hostname
    port: 587                           # SMTP port (587 for STARTTLS, 465 for SSL)
    username: !secret EMAIL_USERNAME    # SMTP username (use !secret for env vars)
    password: !secret EMAIL_PASSWORD    # SMTP password (use !secret for env vars)
    use_tls: true                       # Use STARTTLS encryption
    use_ssl: false                      # Use SSL/TLS from start (mutually exclusive with use_tls)

  # Sender information
  from_address: "noreply@99tales.com"
  from_name: "AlpineBits Monitor"

  # Monitoring and alerting
  monitoring:
    # Daily report configuration
    daily_report:
      enabled: true                     # Enable/disable daily reports
      recipients:
        - "admin@99tales.com"
        - "dev@99tales.com"
      send_time: "08:00"                # Time to send (24h format, local time)
      include_stats: true               # Include application statistics
      include_errors: true              # Include error summary

    # Error alert configuration
    error_alerts:
      enabled: true                     # Enable/disable error alerts
      recipients:
        - "alerts@99tales.com"
        - "oncall@99tales.com"
      error_threshold: 5                # Send immediate alert after N errors
      buffer_minutes: 15                # Wait N minutes before sending buffered errors
      cooldown_minutes: 15              # Wait N minutes before sending another alert
      log_levels:                       # Log levels to monitor
        - "ERROR"
        - "CRITICAL"
```

### Environment Variables

For security, store sensitive credentials in environment variables:

```bash
# Create a .env file (never commit this!)
EMAIL_USERNAME=your-smtp-username@gmail.com
EMAIL_PASSWORD=your-smtp-app-password
```

The `annotatedyaml` library automatically loads values marked with `!secret` from environment variables.

### Gmail Configuration

If using Gmail, you need to:

1. Enable 2-factor authentication on your Google account
2. Generate an "App Password" for SMTP access
3. Use the app password as `EMAIL_PASSWORD`

**Gmail Settings:**

```yaml
smtp:
  host: "smtp.gmail.com"
  port: 587
  use_tls: true
  use_ssl: false
```

### Other SMTP Providers

**SendGrid:**

```yaml
smtp:
  host: "smtp.sendgrid.net"
  port: 587
  username: "apikey"
  password: !secret SENDGRID_API_KEY
  use_tls: true
```

**AWS SES:**

```yaml
smtp:
  host: "email-smtp.us-east-1.amazonaws.com"
  port: 587
  username: !secret AWS_SES_USERNAME
  password: !secret AWS_SES_PASSWORD
  use_tls: true
```

## Usage

### Automatic Error Monitoring

Once configured, the system automatically captures all `ERROR` and `CRITICAL` log messages:

```python
from alpine_bits_python.logging_config import get_logger

_LOGGER = get_logger(__name__)

# This error will be captured and sent via email
_LOGGER.error("Database connection failed")

# This will also be captured
try:
    risky_operation()
except Exception:
    _LOGGER.exception("Operation failed")  # Includes stack trace
```

### Triggering Test Alerts

To test your email configuration, you can manually trigger errors:

```python
import logging

_LOGGER = logging.getLogger(__name__)

# Generate multiple errors to trigger immediate alert (if threshold = 5)
for i in range(5):
    _LOGGER.error("Test error %d", i + 1)
```

### Daily Report Statistics

To include custom statistics in daily reports, set a stats collector function:

```python
async def collect_stats():
    """Collect application statistics for daily report."""
    return {
        "total_reservations": await count_reservations(),
        "new_customers": await count_new_customers(),
        "active_hotels": await count_active_hotels(),
        "api_requests": get_request_count(),
    }

# Register the collector
report_scheduler = app.state.report_scheduler
if report_scheduler:
    report_scheduler.set_stats_collector(collect_stats)
```

## Email Templates

### Error Alert Email

**Subject:** 🚨 AlpineBits Error Alert: 5 errors (threshold exceeded)

**Body:**

```
Error Alert - 2025-10-15 14:30:45
======================================================================

Alert Type: Immediate Alert
Error Count: 5
Time Range: 14:25:00 to 14:30:00
Reason: (threshold of 5 exceeded)

======================================================================

Errors:
----------------------------------------------------------------------

[2025-10-15 14:25:12] ERROR: Database connection timeout
  Module: db:245 (alpine_bits_python.db)

[2025-10-15 14:26:34] ERROR: Failed to process reservation
  Module: api:567 (alpine_bits_python.api)
  Exception: Traceback (most recent call last):
    ...

----------------------------------------------------------------------
Generated by AlpineBits Email Monitoring at 2025-10-15 14:30:45
```

### Daily Report Email

**Subject:** AlpineBits Daily Report - 2025-10-15

**Body (HTML):**

```html
AlpineBits Daily Report
Date: 2025-10-15

Statistics
┌────────────────────────┬────────┐
│ Metric                 │ Value  │
├────────────────────────┼────────┤
│ total_reservations     │ 42     │
│ new_customers          │ 15     │
│ active_hotels          │ 4      │
│ api_requests           │ 1,234  │
└────────────────────────┴────────┘

Errors (3)
┌──────────────┬──────────┬─────────────────────────┐
│ Time         │ Level    │ Message                 │
├──────────────┼──────────┼─────────────────────────┤
│ 08:15:23     │ ERROR    │ Connection timeout      │
│ 12:45:10     │ ERROR    │ Invalid form data       │
│ 18:30:00     │ CRITICAL │ Database unavailable    │
└──────────────┴──────────┴─────────────────────────┘

Generated by AlpineBits Server
```

## Monitoring and Troubleshooting

### Check Email Configuration

```python
from alpine_bits_python.email_service import create_email_service
from alpine_bits_python.config_loader import load_config

config = load_config()
email_service = create_email_service(config)

if email_service:
    print("✓ Email service configured")
else:
    print("✗ Email service not configured")
```

### Test Email Sending

```python
import asyncio

from alpine_bits_python.email_service import EmailService, EmailConfig


async def test_email():
    config = EmailConfig({
        "smtp": {
            "host": "smtp.gmail.com",
            "port": 587,
            "username": "your-email@gmail.com",
            "password": "your-app-password",
            "use_tls": True,
        },
        "from_address": "sender@example.com",
        "from_name": "Test",
    })

    service = EmailService(config)

    result = await service.send_email(
        recipients=["recipient@example.com"],
        subject="Test Email",
        body="This is a test email from AlpineBits server.",
    )

    if result:
        print("✓ Email sent successfully")
    else:
        print("✗ Email sending failed")


asyncio.run(test_email())
```

### Common Issues

**Issue: "Authentication failed"**

- Verify SMTP username and password are correct
- For Gmail, ensure you're using an App Password, not your regular password
- Check that 2FA is enabled on the Google account

**Issue: "Connection timeout"**

- Verify SMTP host and port are correct
- Check firewall rules allow outbound SMTP connections
- Try using port 465 with SSL (`use_ssl: true`, `use_tls: false`) instead of 587 with TLS

**Issue: "No email alerts received"**

- Check that `enabled: true` is set in the config
- Verify recipient email addresses are correct
- Check application logs for email sending errors
- Ensure errors are being logged at ERROR or CRITICAL level

**Issue: "Too many emails being sent"**

- Increase `cooldown_minutes` to reduce alert frequency
- Increase `buffer_minutes` to batch more errors together
- Increase `error_threshold` to only alert on serious issues

## Performance Considerations

### SMTP is Blocking

Email sending uses the standard Python `smtplib`, which performs blocking I/O.
To prevent blocking the async event loop:

- Email operations are automatically run in a thread pool executor
- This happens transparently via `loop.run_in_executor()`
- Request handling is not blocked while an email is being sent

### Memory Usage

- Error buffer size is bounded by the `buffer_minutes` duration
- Old errors are automatically cleared after sending
- Daily report error log is cleared after each report
- Typical memory usage: <1 MB for error buffering

### Error Handling

- Email sending failures are logged but never crash the application
- If SMTP is unavailable, errors are logged to console/file as normal
- The logging handler is exception-safe: it is designed never to cause application failures

## Security Considerations

1. **Never commit credentials to git**
   - Use `!secret` annotation in YAML
   - Store credentials in environment variables
   - Add `.env` to `.gitignore`

2. **Use TLS/SSL encryption**
   - Always set `use_tls: true` or `use_ssl: true`
   - Never send credentials in plaintext

3. **Limit email recipients**
   - Only send alerts to authorized personnel
   - Use dedicated monitoring email addresses
   - Consider using distribution lists

4. **Sensitive data in logs**
   - Be careful not to log passwords, API keys, or PII
   - Error messages in emails may contain sensitive context
   - Review log messages before enabling email alerts

## Testing

Run the test suite:

```bash
# Test email service only
uv run pytest tests/test_email_service.py -v

# Test with coverage
uv run pytest tests/test_email_service.py --cov=alpine_bits_python.email_service --cov=alpine_bits_python.email_monitoring
```

## Future Enhancements

Potential improvements for future versions:

- [ ] Support for email templates (Jinja2)
- [ ] Configurable retry logic for failed sends
- [ ] Email queuing for high-volume scenarios
- [ ] Integration with external monitoring services (PagerDuty, Slack)
- [ ] Weekly/monthly report options
- [ ] Custom alert rules based on error patterns
- [ ] Email attachments for detailed logs
- [ ] HTML email styling improvements

## References

- [Python smtplib Documentation](https://docs.python.org/3/library/smtplib.html)
- [Python logging Documentation](https://docs.python.org/3/library/logging.html)
- [Gmail SMTP Settings](https://support.google.com/mail/answer/7126229)
- [annotatedyaml Documentation](https://github.com/yourusername/annotatedyaml)
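
## Appendix: Alert Decision Sketch

The hybrid threshold + time-based behaviour described under "Error Alerts (Hybrid Approach)" can be illustrated with a small, self-contained model. This is only a sketch and not the actual `EmailAlertHandler` code: the class name `AlertBuffer`, its methods `add` and `tick`, and the use of explicit `now` timestamps in seconds are all hypothetical, chosen to keep the example deterministic. It also assumes that errors arriving during the cooldown stay in the buffer until the cooldown ends, which the description above does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AlertBuffer:
    """Hypothetical model of the alert decision logic (not the real handler)."""

    error_threshold: int = 5
    buffer_seconds: float = 15 * 60
    cooldown_seconds: float = 15 * 60
    errors: list[str] = field(default_factory=list)
    first_error_at: float | None = None
    last_alert_at: float | None = None

    def _in_cooldown(self, now: float) -> bool:
        # True while less than cooldown_seconds have passed since the last alert.
        return (
            self.last_alert_at is not None
            and now - self.last_alert_at < self.cooldown_seconds
        )

    def _flush(self, now: float) -> list[str]:
        # Hand back the buffered errors, reset the buffer, and start the cooldown.
        batch, self.errors = self.errors, []
        self.first_error_at = None
        self.last_alert_at = now
        return batch

    def add(self, message: str, now: float) -> list[str] | None:
        """Buffer an error; return a batch to send if the threshold is reached."""
        if self.first_error_at is None:
            self.first_error_at = now
        self.errors.append(message)
        if len(self.errors) >= self.error_threshold and not self._in_cooldown(now):
            return self._flush(now)  # immediate alert
        return None

    def tick(self, now: float) -> list[str] | None:
        """Call periodically; return a batch once the buffer window has elapsed."""
        if self.first_error_at is None or self._in_cooldown(now):
            return None
        if now - self.first_error_at >= self.buffer_seconds:
            return self._flush(now)  # buffered alert
        return None
```

In this model, `add` corresponds to the "Buffer >= threshold?" branch of the flow diagram and `tick` to the "Wait for buffer time" branch. A returned list stands for the errors that would go into one alert email, and `None` means nothing is sent yet.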