alpinebits_python/docs/EMAIL_MONITORING.md
2025-10-15 08:46:25 +02:00

Email Monitoring and Alerting

This document describes the email monitoring and alerting system for the AlpineBits Python server.

Overview

The email monitoring system provides two main features:

  1. Error Alerts: Automatic email notifications when errors occur in the application
  2. Daily Reports: Scheduled daily summary emails with statistics and error logs

Architecture

Components

How It Works

Error Alerts (Hybrid Approach)

The EmailAlertHandler uses a hybrid threshold + time-based approach:

  1. Immediate Alerts: If the error threshold is reached (e.g., 5 errors), an alert email is sent immediately
  2. Buffered Alerts: Otherwise, errors accumulate in a buffer and are sent after the buffer duration (e.g., 15 minutes)
  3. Cooldown Period: After sending an alert, the system waits for a cooldown period before sending another alert to prevent spam

Flow Diagram:

Error occurs
    ↓
Add to buffer
    ↓
Buffer >= threshold? ──Yes──> Send immediate alert
    ↓ No                            ↓
Wait for buffer time           Reset buffer
    ↓                               ↓
Send buffered alert            Enter cooldown
    ↓
Reset buffer
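
The flow above can be sketched as a small buffering class. This is a minimal illustration, not the actual EmailAlertHandler: the class name `AlertBuffer`, its methods, and the injectable `clock` are hypothetical, and it assumes the cooldown applies to both immediate and buffered alerts.

```python
import time
from typing import Callable, List, Optional


class AlertBuffer:
    """Hypothetical sketch of threshold + time-based alert buffering."""

    def __init__(self, threshold: int, buffer_seconds: float,
                 cooldown_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.threshold = threshold
        self.buffer_seconds = buffer_seconds
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.errors: List[str] = []
        self.first_error_at: Optional[float] = None
        self.last_sent_at: Optional[float] = None

    def _in_cooldown(self, now: float) -> bool:
        return (self.last_sent_at is not None
                and now - self.last_sent_at < self.cooldown_seconds)

    def _flush(self, now: float) -> List[str]:
        # Hand back the buffered errors, reset the buffer, start the cooldown.
        batch, self.errors = self.errors, []
        self.first_error_at = None
        self.last_sent_at = now
        return batch

    def add(self, message: str) -> Optional[List[str]]:
        """Buffer an error; return a batch if an immediate alert is due."""
        now = self.clock()
        if not self.errors:
            self.first_error_at = now
        self.errors.append(message)
        if len(self.errors) >= self.threshold and not self._in_cooldown(now):
            return self._flush(now)
        return None

    def poll(self) -> Optional[List[str]]:
        """Call periodically; return a batch once the buffer window elapsed."""
        now = self.clock()
        if not self.errors or self._in_cooldown(now):
            return None
        if now - self.first_error_at >= self.buffer_seconds:
            return self._flush(now)
        return None
```

A caller would send an email whenever `add()` or `poll()` returns a batch. Passing the clock in makes the timing behaviour easy to exercise in unit tests without sleeping.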

Daily Reports

The DailyReportScheduler runs as a background task that:

  1. Waits until the configured send time (e.g., 08:00 local time)
  2. Collects statistics from the application
  3. Gathers errors that occurred during the day
  4. Formats and sends an email report
  5. Clears the error log
  6. Schedules the next report for the following day
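
Steps 1 and 6 come down to computing how long to sleep until the next occurrence of the send time. A minimal sketch, assuming `send_time` is an `HH:MM` string as in the configuration and that naive local datetimes are acceptable (daylight-saving transitions are ignored here); the function name is hypothetical:

```python
from datetime import datetime, timedelta


def seconds_until_next_send(send_time: str, now: datetime) -> float:
    """Return the seconds from `now` until the next HH:MM occurrence."""
    hour, minute = (int(part) for part in send_time.split(":"))
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's send time has already passed, so schedule for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()
```

A background task could then `await asyncio.sleep(seconds_until_next_send("08:00", datetime.now()))` before building each report.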

Configuration

Email Configuration Keys

Add the following to your config.yaml:

email:
  # SMTP server configuration
  smtp:
    host: "smtp.gmail.com"           # Your SMTP server hostname
    port: 587                        # SMTP port (587 for STARTTLS, 465 for implicit SSL/TLS)
    username: !secret EMAIL_USERNAME # SMTP username (use !secret for env vars)
    password: !secret EMAIL_PASSWORD # SMTP password (use !secret for env vars)
    use_tls: true                    # Use STARTTLS encryption
    use_ssl: false                   # Use SSL/TLS from start (mutually exclusive with use_tls)

  # Sender information
  from_address: "noreply@99tales.com"
  from_name: "AlpineBits Monitor"

  # Monitoring and alerting
  monitoring:
    # Daily report configuration
    daily_report:
      enabled: true                  # Enable/disable daily reports
      recipients:
        - "admin@99tales.com"
        - "dev@99tales.com"
      send_time: "08:00"             # Time to send (24h format, local time)
      include_stats: true            # Include application statistics
      include_errors: true           # Include error summary

    # Error alert configuration
    error_alerts:
      enabled: true                  # Enable/disable error alerts
      recipients:
        - "alerts@99tales.com"
        - "oncall@99tales.com"
      error_threshold: 5             # Send immediate alert after N errors
      buffer_minutes: 15             # Wait N minutes before sending buffered errors
      cooldown_minutes: 15           # Wait N minutes before sending another alert
      log_levels:                    # Log levels to monitor
        - "ERROR"
        - "CRITICAL"
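
Several of these keys constrain each other, so it can help to check the parsed `email` section before starting the server. The following is only a sketch: `validate_email_config` is a hypothetical helper, not part of the project, and it assumes every key shown above is required whenever the corresponding feature is enabled (the real application may apply defaults instead).

```python
import re
from typing import Any

TIME_PATTERN = re.compile(r"([01]\d|2[0-3]):[0-5]\d")


def validate_email_config(email: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means no issues found."""
    problems: list[str] = []

    smtp = email.get("smtp", {})
    if smtp.get("use_tls") and smtp.get("use_ssl"):
        problems.append("smtp: use_tls and use_ssl are mutually exclusive")

    monitoring = email.get("monitoring", {})

    report = monitoring.get("daily_report", {})
    if report.get("enabled"):
        if not report.get("recipients"):
            problems.append("daily_report: no recipients configured")
        if not TIME_PATTERN.fullmatch(str(report.get("send_time", ""))):
            problems.append("daily_report: send_time must be HH:MM (24h)")

    alerts = monitoring.get("error_alerts", {})
    if alerts.get("enabled"):
        if not alerts.get("recipients"):
            problems.append("error_alerts: no recipients configured")
        for key in ("error_threshold", "buffer_minutes", "cooldown_minutes"):
            value = alerts.get(key)
            if not isinstance(value, int) or value < 1:
                problems.append(f"error_alerts: {key} must be a positive integer")

    return problems
```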

Environment Variables

For security, store sensitive credentials in environment variables:

# Create a .env file (never commit this!)
EMAIL_USERNAME=your-smtp-username@gmail.com
EMAIL_PASSWORD=your-smtp-app-password

The annotatedyaml library automatically loads values marked with !secret from environment variables.

Gmail Configuration

If using Gmail, you need to:

  1. Enable 2-factor authentication on your Google account
  2. Generate an "App Password" for SMTP access
  3. Use the app password as EMAIL_PASSWORD

Gmail Settings:

smtp:
  host: "smtp.gmail.com"
  port: 587
  use_tls: true
  use_ssl: false

Other SMTP Providers

SendGrid:

smtp:
  host: "smtp.sendgrid.net"
  port: 587
  username: "apikey"
  password: !secret SENDGRID_API_KEY
  use_tls: true

AWS SES:

smtp:
  host: "email-smtp.us-east-1.amazonaws.com"
  port: 587
  username: !secret AWS_SES_USERNAME
  password: !secret AWS_SES_PASSWORD
  use_tls: true
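
The `use_tls` and `use_ssl` flags in these snippets correspond to the two ways the standard library's smtplib can secure a connection. The sketch below shows one plausible mapping; `open_smtp_connection` is a hypothetical helper and not necessarily how the project's EmailService connects.

```python
import smtplib
import ssl


def open_smtp_connection(host: str, port: int, use_tls: bool = False,
                         use_ssl: bool = False,
                         timeout: float = 10.0) -> smtplib.SMTP:
    """Open an SMTP connection using STARTTLS, implicit SSL/TLS, or neither."""
    if use_tls and use_ssl:
        raise ValueError("use_tls and use_ssl are mutually exclusive")
    context = ssl.create_default_context()
    if use_ssl:
        # Implicit TLS: the socket is encrypted from the first byte (port 465).
        return smtplib.SMTP_SSL(host, port, timeout=timeout, context=context)
    server = smtplib.SMTP(host, port, timeout=timeout)
    if use_tls:
        # STARTTLS: connect in plaintext, then upgrade (port 587).
        server.starttls(context=context)
    return server
```

The returned object works as a context manager, so a caller can write `with open_smtp_connection(...) as server:` and then call `server.login(...)` and `server.send_message(...)` inside the block.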

Usage

Automatic Error Monitoring

Once configured, the system automatically captures log messages at the levels listed under log_levels (ERROR and CRITICAL in the example configuration):

from alpine_bits_python.logging_config import get_logger

_LOGGER = get_logger(__name__)

# This error will be captured and sent via email
_LOGGER.error("Database connection failed")

# This will also be captured
try:
    risky_operation()
except Exception:
    _LOGGER.exception("Operation failed")  # Includes stack trace
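
Capturing works because Python's logging module passes each record to every attached handler. The sketch below shows the general mechanism with a hypothetical `BufferingErrorHandler`; it is an illustration of the technique, not the project's EmailAlertHandler.

```python
import logging
from typing import Iterable, List


class BufferingErrorHandler(logging.Handler):
    """Hypothetical handler that collects records at the monitored levels."""

    def __init__(self, level_names: Iterable[str] = ("ERROR", "CRITICAL")) -> None:
        super().__init__(level=logging.NOTSET)
        self.level_names = set(level_names)
        self.records: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        try:
            if record.levelname in self.level_names:
                # format() appends the traceback when exc_info is set.
                self.records.append(self.format(record))
        except Exception:
            # Standard logging convention: report the problem, never raise.
            self.handleError(record)


logger = logging.getLogger("monitoring_sketch")
handler = BufferingErrorHandler()
logger.addHandler(handler)
logger.error("Database connection failed")
```

After the last line, `handler.records` holds the formatted error, which a real handler would pass on to its alert buffer.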

Triggering Test Alerts

To test your email configuration, you can manually trigger errors:

import logging

_LOGGER = logging.getLogger(__name__)

# Generate multiple errors to trigger immediate alert (if threshold = 5)
for i in range(5):
    _LOGGER.error("Test error %d", i + 1)

Daily Report Statistics

To include custom statistics in daily reports, set a stats collector function:

async def collect_stats():
    """Collect application statistics for daily report."""
    return {
        "total_reservations": await count_reservations(),
        "new_customers": await count_new_customers(),
        "active_hotels": await count_active_hotels(),
        "api_requests": get_request_count(),
    }

# Register the collector
report_scheduler = app.state.report_scheduler
if report_scheduler:
    report_scheduler.set_stats_collector(collect_stats)

Email Templates

Error Alert Email

Subject: 🚨 AlpineBits Error Alert: 5 errors (threshold exceeded)

Body:

Error Alert - 2025-10-15 14:30:45
======================================================================

Alert Type: Immediate Alert
Error Count: 5
Time Range: 14:25:00 to 14:30:00
Reason: (threshold of 5 exceeded)

======================================================================

Errors:
----------------------------------------------------------------------

[2025-10-15 14:25:12] ERROR: Database connection timeout
  Module: db:245 (alpine_bits_python.db)

[2025-10-15 14:26:34] ERROR: Failed to process reservation
  Module: api:567 (alpine_bits_python.api)
  Exception:
  Traceback (most recent call last):
    ...

----------------------------------------------------------------------
Generated by AlpineBits Email Monitoring at 2025-10-15 14:30:45

Daily Report Email

Subject: AlpineBits Daily Report - 2025-10-15

Body (HTML):

AlpineBits Daily Report
Date: 2025-10-15

Statistics
┌────────────────────────┬────────┐
│ Metric                 │ Value  │
├────────────────────────┼────────┤
│ total_reservations     │ 42     │
│ new_customers          │ 15     │
│ active_hotels          │ 4      │
│ api_requests           │ 1,234  │
└────────────────────────┴────────┘

Errors (3)
┌──────────────┬──────────┬─────────────────────────┐
│ Time         │ Level    │ Message                 │
├──────────────┼──────────┼─────────────────────────┤
│ 08:15:23     │ ERROR    │ Connection timeout      │
│ 12:45:10     │ ERROR    │ Invalid form data       │
│ 18:30:00     │ CRITICAL │ Database unavailable    │
└──────────────┴──────────┴─────────────────────────┘

Generated by AlpineBits Server
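
To illustrate how a statistics dictionary could become the table shown above, here is a minimal sketch. `render_stats_table` is a hypothetical helper; the actual report markup and styling may differ.

```python
from html import escape
from typing import Any, Dict


def render_stats_table(stats: Dict[str, Any]) -> str:
    """Render a metric/value mapping as a simple HTML table."""
    rows = []
    for name, value in stats.items():
        # Integers get thousands separators, e.g. 1234 becomes "1,234".
        if isinstance(value, int) and not isinstance(value, bool):
            text = f"{value:,}"
        else:
            text = str(value)
        rows.append(f"<tr><td>{escape(str(name))}</td><td>{escape(text)}</td></tr>")
    header = "<tr><th>Metric</th><th>Value</th></tr>"
    return "<table>" + header + "".join(rows) + "</table>"
```

Escaping both names and values keeps unexpected characters in metric data from breaking the email markup.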

Monitoring and Troubleshooting

Check Email Configuration

from alpine_bits_python.email_service import create_email_service
from alpine_bits_python.config_loader import load_config

config = load_config()
email_service = create_email_service(config)

if email_service:
    print("✓ Email service configured")
else:
    print("✗ Email service not configured")

Test Email Sending

import asyncio
from alpine_bits_python.email_service import EmailService, EmailConfig

async def test_email():
    config = EmailConfig({
        "smtp": {
            "host": "smtp.gmail.com",
            "port": 587,
            "username": "your-email@gmail.com",
            "password": "your-app-password",
            "use_tls": True,
        },
        "from_address": "sender@example.com",
        "from_name": "Test",
    })

    service = EmailService(config)

    result = await service.send_email(
        recipients=["recipient@example.com"],
        subject="Test Email",
        body="This is a test email from AlpineBits server.",
    )

    if result:
        print("✓ Email sent successfully")
    else:
        print("✗ Email sending failed")

asyncio.run(test_email())

Common Issues

Issue: "Authentication failed"

  • Verify SMTP username and password are correct
  • For Gmail, ensure you're using an App Password, not your regular password
  • Check that 2-factor authentication is enabled on the Google account (App Passwords require it)

Issue: "Connection timeout"

  • Verify SMTP host and port are correct
  • Check firewall rules allow outbound SMTP connections
  • Try port 465 with use_ssl: true (and use_tls: false) instead of port 587 with use_tls: true

Issue: "No email alerts received"

  • Check that enabled: true is set under error_alerts in the config
  • Verify recipient email addresses are correct
  • Check application logs for email sending errors
  • Ensure errors are being logged at ERROR or CRITICAL level

Issue: "Too many emails being sent"

  • Increase cooldown_minutes to reduce alert frequency
  • Increase buffer_minutes to batch more errors together
  • Increase error_threshold to only alert on serious issues

Performance Considerations

SMTP is Blocking

Email sending uses the standard Python smtplib, which performs blocking I/O. To prevent blocking the async event loop:

  • Email operations are automatically run in a thread pool executor
  • This happens transparently via loop.run_in_executor()
  • Request handling is not blocked while an email is being sent
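
The executor pattern described above can be demonstrated without an SMTP server. In this sketch `blocking_send` is a hypothetical stand-in for a blocking smtplib call, and the ticker task shows that the event loop keeps running while the send is in progress.

```python
import asyncio
import time


def blocking_send(recipient: str) -> bool:
    """Stand-in for a blocking smtplib call (hypothetical)."""
    time.sleep(0.1)
    return True


async def send_in_executor(recipient: str) -> bool:
    loop = asyncio.get_running_loop()
    # None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, blocking_send, recipient)


async def main() -> tuple:
    ticks = 0

    async def ticker() -> None:
        # Keeps counting while the blocking call sits in a worker thread.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    task = asyncio.create_task(ticker())
    sent = await send_in_executor("recipient@example.com")
    task.cancel()
    return sent, ticks
```

Running `asyncio.run(main())` returns the send result together with a tick count greater than one. Had `blocking_send` been called directly inside the coroutine, the ticker would not have advanced during the send.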

Memory Usage

  • Error buffer size is limited by buffer_minutes duration
  • Old errors are automatically cleared after sending
  • Daily report error log is cleared after each report
  • Typical memory usage: <1 MB for error buffering

Error Handling

  • Email sending failures are logged but never crash the application
  • If SMTP is unavailable, errors are logged to console/file as normal
  • The logging handler catches its own exceptions, so a failure inside it does not propagate to the application

Security Considerations

  1. Never commit credentials to git
    • Use the !secret annotation in YAML
    • Store credentials in environment variables
    • Add .env to .gitignore

  2. Use TLS/SSL encryption
    • Always set exactly one of use_tls: true or use_ssl: true
    • Never send credentials in plaintext

  3. Limit email recipients
    • Only send alerts to authorized personnel
    • Use dedicated monitoring email addresses
    • Consider using distribution lists

  4. Sensitive data in logs
    • Be careful not to log passwords, API keys, or PII
    • Error messages in emails may contain sensitive context
    • Review log messages before enabling email alerts

Testing

Run the test suite:

# Test email service only
uv run pytest tests/test_email_service.py -v

# Test with coverage
uv run pytest tests/test_email_service.py --cov=alpine_bits_python.email_service --cov=alpine_bits_python.email_monitoring

Future Enhancements

Potential improvements for future versions:

  • Support for email templates (Jinja2)
  • Configurable retry logic for failed sends
  • Email queuing for high-volume scenarios
  • Integration with external monitoring services (PagerDuty, Slack)
  • Weekly/monthly report options
  • Custom alert rules based on error patterns
  • Email attachments for detailed logs
  • HTML email styling improvements
