Multi-Worker Quick Reference

TL;DR

Problem: With 4 workers, every worker starts its own schedulers, which causes duplicate emails and race conditions.

Solution: File-based locking ensures only ONE worker runs schedulers.

Commands

# Development (1 worker - auto primary)
uvicorn alpine_bits_python.api:app --reload

# Production (4 workers - one becomes primary)
uvicorn alpine_bits_python.api:app --workers 4 --host 0.0.0.0 --port 8000

# Test worker coordination
uv run python test_worker_coordination.py

# Run all tests
uv run pytest tests/ -v

Check Which Worker is Primary

Look for startup logs:

[INFO] Worker startup: pid=1001, primary=True   ← PRIMARY
[INFO] Worker startup: pid=1002, primary=False  ← SECONDARY
[INFO] Worker startup: pid=1003, primary=False  ← SECONDARY
[INFO] Worker startup: pid=1004, primary=False  ← SECONDARY
[INFO] Daily report scheduler started           ← Only on PRIMARY
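
For longer logs, the same check can be scripted. The following is a minimal sketch that assumes the startup lines look exactly like the sample above; `find_primary_pids` is a hypothetical helper, not part of the package.

```python
import re

# Assumed format, taken from the sample log lines above:
#   [INFO] Worker startup: pid=1001, primary=True
STARTUP_RE = re.compile(r"Worker startup: pid=(\d+), primary=(True|False)")


def find_primary_pids(log_lines):
    """Return the PIDs of all workers that logged primary=True."""
    primaries = []
    for line in log_lines:
        match = STARTUP_RE.search(line)
        if match and match.group(2) == "True":
            primaries.append(int(match.group(1)))
    return primaries


sample = [
    "[INFO] Worker startup: pid=1001, primary=True",
    "[INFO] Worker startup: pid=1002, primary=False",
    "[INFO] Daily report scheduler started",
]
primary_pids = find_primary_pids(sample)
```

Exactly one PID is expected. An empty list or more than one PID corresponds to the cases under Troubleshooting.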

Lock File

Location: /tmp/alpinebits_primary_worker.lock

Check lock status:

# See which PID holds the lock
cat /tmp/alpinebits_primary_worker.lock
# Output: 1001

# Verify process is running (ps -p avoids matching the grep process itself)
ps -p 1001

Clean stale lock (only if the recorded PID is no longer running):

rm /tmp/alpinebits_primary_worker.lock
# Then restart application
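
The two manual checks (read the PID, see whether it is alive) can be combined before deleting anything. This is a sketch, assuming the lock file holds a plain-text PID as in the `cat` output above; `lock_holder_status` is a hypothetical helper name.

```python
import os
from pathlib import Path


def lock_holder_status(lock_path):
    """Return (pid, alive); pid is None if the file is missing or holds no valid PID."""
    try:
        pid = int(Path(lock_path).read_text().strip())
    except (FileNotFoundError, ValueError):
        return None, False
    if pid <= 0:
        # Guard: os.kill with 0 or a negative PID targets process groups
        return None, False
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return pid, False  # no such process: the lock is stale
    except PermissionError:
        return pid, True  # process exists but belongs to another user
    return pid, True
```

Usage: `lock_holder_status("/tmp/alpinebits_primary_worker.lock")` returning `(1001, False)` would suggest the lock is stale and safe to remove. A recycled PID can give a false "alive" result, so treat the answer as a hint.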

What Runs Where

Service	Primary Worker	Secondary Workers
HTTP requests	✓ Yes	✓ Yes
Email scheduler	✓ Yes	✗ No
Error alerts	✓ Yes	✓ Yes (all workers can send)
DB migrations	✓ Yes	✗ No
Customer hashing	✓ Yes	✗ No

Troubleshooting

All workers think they're primary

Cause: Lock file not accessible
Fix: Check permissions on /tmp/ or change lock location

No worker becomes primary

Cause: Stale lock file
Fix: rm /tmp/alpinebits_primary_worker.lock and restart

Still getting duplicate emails

Check: Are you seeing duplicate scheduled reports or error alerts?

Scheduled reports should only come from primary ✓
Error alerts can come from any worker (by design) ✓

Code Example

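from contextlib import asynccontextmanager

from fastapi import FastAPI

from alpine_bits_python.worker_coordination import is_primary_worker

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Acquire lock - only one worker succeeds
    is_primary, worker_lock = is_primary_worker()

    if is_primary:
        # Start singleton services (scheduler stands for your application's scheduler instance)
        scheduler.start()

    try:
        # All workers handle requests
        yield
    finally:
        # Release lock on shutdown, even if the application exits with an error
        if worker_lock:
            worker_lock.release()

app = FastAPI(lifespan=lifespan)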
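
The internals of `is_primary_worker()` are not shown in this reference. As an illustration of the file-based locking idea only, here is a minimal sketch assuming a Unix host and `fcntl.flock`; `WorkerLock` and `try_become_primary` are hypothetical names, and the real implementation in worker_coordination.py may differ.

```python
import fcntl  # Unix-only
import os


class WorkerLock:
    """Illustrative lock object; not the project's actual class."""

    def __init__(self, path):
        self.path = path
        self._fd = None

    def acquire(self):
        # Open (or create) the lock file without truncating existing content
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            # Exclusive, non-blocking: fails at once if another holder exists
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.close(fd)
            return False
        # We hold the lock: record our PID so operators can inspect the file
        os.ftruncate(fd, 0)
        os.write(fd, str(os.getpid()).encode())
        self._fd = fd
        return True

    def release(self):
        if self._fd is not None:
            fcntl.flock(self._fd, fcntl.LOCK_UN)
            os.close(self._fd)
            self._fd = None


def try_become_primary(lock_path):
    """Return (is_primary, lock), mirroring the tuple shape used above."""
    lock = WorkerLock(lock_path)
    if lock.acquire():
        return True, lock
    return False, None
```

In this sketch the kernel drops the lock when the holding process exits, so a leftover file by itself does not block the next start. Whether the project's lock behaves the same way is not stated here, so the stale-lock cleanup above still applies.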

Documentation

Full guide: docs/MULTI_WORKER_DEPLOYMENT.md
Solution summary: SOLUTION_SUMMARY.md
Implementation: src/alpine_bits_python/worker_coordination.py
Test script: test_worker_coordination.py
