alpinebits_python/QUICK_REFERENCE.md
2025-10-15 10:07:42 +02:00

# Multi-Worker Quick Reference
## TL;DR
**Problem**: Running multiple workers (e.g. `--workers 4`) starts the schedulers in every worker, which causes duplicate emails and race conditions.

**Solution**: File-based locking ensures only ONE worker (the primary) runs the schedulers.
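
To make the idea concrete, here is a minimal sketch of file-based primary election. It assumes a Unix host and advisory `flock` locking; the real logic lives in `worker_coordination.py` and may work differently. The function name `try_acquire_primary` is hypothetical and not part of the project.

```python
import fcntl
import os


def try_acquire_primary(lock_path):
    """Try to become primary. Returns an open file descriptor on success, None otherwise.

    Hypothetical helper for illustration only (Unix-only, uses advisory flock).
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Non-blocking exclusive lock: fails immediately if another holder exists
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    # We hold the lock: record our PID so operators can inspect the file
    os.ftruncate(fd, 0)
    os.write(fd, str(os.getpid()).encode())
    return fd
```

The caller has to keep the returned descriptor open for as long as it wants to stay primary; closing it (or the process exiting) releases the lock.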
## Commands
```bash
# Development (1 worker - auto primary)
uvicorn alpine_bits_python.api:app --reload
# Production (4 workers - one becomes primary)
uvicorn alpine_bits_python.api:app --workers 4 --host 0.0.0.0 --port 8000
# Test worker coordination
uv run python test_worker_coordination.py
# Run all tests
uv run pytest tests/ -v
```
## Check Which Worker is Primary
Look for startup logs:
```
[INFO] Worker startup: pid=1001, primary=True ← PRIMARY
[INFO] Worker startup: pid=1002, primary=False ← SECONDARY
[INFO] Worker startup: pid=1003, primary=False ← SECONDARY
[INFO] Worker startup: pid=1004, primary=False ← SECONDARY
[INFO] Daily report scheduler started ← Only on PRIMARY
```
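
If the logs are long, a small script can pull out the roles. This is a sketch that assumes the startup lines look exactly like the sample above; the helper names `parse_worker_roles` and `primary_pids` are made up for this example.

```python
import re

# Matches lines such as: "[INFO] Worker startup: pid=1001, primary=True"
STARTUP_RE = re.compile(r"Worker startup: pid=(\d+), primary=(True|False)")


def parse_worker_roles(log_text):
    """Map each worker PID found in the log text to True (primary) or False."""
    return {int(pid): flag == "True" for pid, flag in STARTUP_RE.findall(log_text)}


def primary_pids(log_text):
    """Return the sorted PIDs that reported primary=True."""
    roles = parse_worker_roles(log_text)
    return sorted(pid for pid, is_primary in roles.items() if is_primary)
```

A healthy deployment should yield exactly one PID from `primary_pids`; zero or several points to one of the cases under Troubleshooting.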
## Lock File
**Location**: `/tmp/alpinebits_primary_worker.lock`
**Check lock status**:
```bash
# See which PID holds the lock
cat /tmp/alpinebits_primary_worker.lock
# Output: 1001
# Verify that process is still running (unlike `ps aux | grep`, this cannot match the grep itself)
ps -p 1001
```
**Clean stale lock** (only if the PID in the file is no longer running):
```bash
rm /tmp/alpinebits_primary_worker.lock
# Then restart the application
```
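
The manual check can also be scripted. The sketch below assumes the lock file contains only the holder's PID, as in the `cat` output above, and that it runs on a Unix host; `lock_holder_status` is a hypothetical helper, not part of the project.

```python
import os
from pathlib import Path


def lock_holder_status(lock_path):
    """Return 'missing', 'unreadable', 'alive', or 'stale' for the given lock file."""
    path = Path(lock_path)
    if not path.exists():
        return "missing"
    try:
        pid = int(path.read_text().strip())
    except ValueError:
        return "unreadable"
    if pid <= 0:
        # Never signal PID 0 or negative PIDs: those address process groups
        return "unreadable"
    try:
        os.kill(pid, 0)  # Signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return "stale"
    except PermissionError:
        return "alive"  # The process exists but belongs to another user
    return "alive"
```

Only a `'stale'` result suggests the file is safe to remove. PIDs can be reused by the operating system, so an `'alive'` result is worth double-checking against the process name.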
## What Runs Where
| Service | Primary Worker | Secondary Workers |
|---------|---------------|-------------------|
| HTTP requests | ✓ Yes | ✓ Yes |
| Email scheduler | ✓ Yes | ✗ No |
| Error alerts | ✓ Yes | ✓ Yes (all workers can send) |
| DB migrations | ✓ Yes | ✗ No |
| Customer hashing | ✓ Yes | ✗ No |
## Troubleshooting
### All workers think they're primary
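**Cause**: The lock file is not accessible (it cannot be created or opened at its location).

**Fix**: Check permissions on `/tmp/`, or change the lock location to a directory the workers can write to.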
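
A quick way to check the permissions is sketched below. It only inspects the filesystem and knows nothing about how the application opens the lock; `lock_location_problem` is a hypothetical helper name.

```python
import os


def lock_location_problem(lock_path):
    """Return a short description of why lock_path looks unusable, or None if it looks fine."""
    directory = os.path.dirname(os.path.abspath(lock_path))
    if not os.path.isdir(directory):
        return f"directory does not exist: {directory}"
    # Creating a file needs write and search (execute) permission on the directory
    if not os.access(directory, os.W_OK | os.X_OK):
        return f"directory is not writable: {directory}"
    if os.path.exists(lock_path) and not os.access(lock_path, os.R_OK | os.W_OK):
        return f"lock file is not readable and writable: {lock_path}"
    return None
```

Run it as the same user that runs the workers, for example `print(lock_location_problem("/tmp/alpinebits_primary_worker.lock"))`, since the result depends on that user's permissions.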
### No worker becomes primary
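**Cause**: Stale lock file left behind by a process that is no longer running.

**Fix**: Confirm the PID in the file is not running, then `rm /tmp/alpinebits_primary_worker.lock` and restart.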
### Still getting duplicate emails
**Check**: Are you seeing duplicate **scheduled reports** or **error alerts**?
- Scheduled reports should only come from primary ✓
- Error alerts can come from any worker (by design) ✓
## Code Example
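```python
from contextlib import asynccontextmanager

from fastapi import FastAPI

from alpine_bits_python.worker_coordination import is_primary_worker


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Acquire lock - only one worker succeeds
    is_primary, worker_lock = is_primary_worker()
    if is_primary:
        # Start singleton services (`scheduler` is created elsewhere in the app)
        scheduler.start()
    try:
        # All workers handle requests
        yield
    finally:
        # Release lock on shutdown, even if the app exits with an error
        if worker_lock:
            worker_lock.release()
```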
## Documentation
- **Full guide**: `docs/MULTI_WORKER_DEPLOYMENT.md`
- **Solution summary**: `SOLUTION_SUMMARY.md`
- **Implementation**: `src/alpine_bits_python/worker_coordination.py`
- **Test script**: `test_worker_coordination.py`