Alembic experiments

Jonas Linter
2025-11-18 11:04:38 +01:00
parent 10dcbae5ad
commit 5a660507d2
17 changed files with 1716 additions and 99 deletions

alembic/README Normal file

@@ -0,0 +1 @@
Generic single-database configuration.

alembic/README.md Normal file

@@ -0,0 +1,123 @@
# Database Migrations
This directory contains Alembic database migrations for the Alpine Bits Python Server.
## Quick Reference
### Common Commands
```bash
# Check current migration status
uv run alembic current

# Show migration history
uv run alembic history --verbose

# Upgrade to latest migration
uv run alembic upgrade head

# Downgrade one version
uv run alembic downgrade -1

# Create a new migration (auto-generate from model changes)
uv run alembic revision --autogenerate -m "description"

# Create a new empty migration (manual)
uv run alembic revision -m "description"
```
## Migration Files
### Current Migrations
1. **535b70e85b64_initial_schema.py** - Creates all base tables
2. **8edfc81558db_drop_and_recreate_conversions_tables.py** - Handles conversions table schema change
## How Migrations Work
1. Alembic tracks which migrations have been applied using the `alembic_version` table
2. When you run `alembic upgrade head`, it applies all pending migrations in order
3. Each migration has an `upgrade()` and a `downgrade()` function (see the sketch below)
4. Migrations are applied transactionally (all or nothing)
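
A minimal, hypothetical migration illustrates this shape (the `notes` table and the revision ID are invented for the example; `down_revision` points at whatever migration came before, here the baseline revision from this commit):

```python
"""Add a notes table (hypothetical example migration)."""
import sqlalchemy as sa
from alembic import op

# Invented revision identifier, for illustration only.
revision = "abc123def456"
down_revision = "94134e512a12"  # the previous migration (the baseline)
branch_labels = None
depends_on = None


def upgrade() -> None:
    """Applied by `alembic upgrade head`."""
    op.create_table(
        "notes",
        sa.Column("id", sa.Integer(), primary_key=True),
        sa.Column("body", sa.String(), nullable=True),
    )


def downgrade() -> None:
    """Applied by `alembic downgrade -1`."""
    op.drop_table("notes")
```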
## Configuration
The Alembic environment ([env.py](env.py)) is configured to:
- Read database URL from `config.yaml` or environment variables
- Support PostgreSQL schemas
- Use async SQLAlchemy (compatible with FastAPI)
- Apply migrations in the correct schema
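
The key steps, excerpted and simplified from the [env.py](env.py) added in this commit:

```python
app_config = load_config()              # read config.yaml / environment variables
db_url = get_database_url(app_config)   # resolve the database URL
if db_url:
    config.set_main_option("sqlalchemy.url", db_url)

schema_name = get_database_schema(app_config)
if schema_name:
    configure_schema(schema_name)       # point all tables at the configured schema
```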
## Best Practices
1. **Always review auto-generated migrations** - Alembic's autogenerate is smart but not perfect
2. **Test migrations on dev first** - Never run untested migrations on production
3. **Keep migrations small** - One logical change per migration
4. **Never edit applied migrations** - Create a new migration to fix issues
5. **Commit migrations to git** - Migrations are part of your code
## Creating a New Migration
When you modify models in `src/alpine_bits_python/db.py`:
```bash
# 1. Generate the migration
uv run alembic revision --autogenerate -m "add_user_preferences_table"

# 2. Review the generated file in alembic/versions/
#    Look for:
#    - Incorrect type changes
#    - Missing indexes
#    - Data that needs to be migrated

# 3. Test it
uv run alembic upgrade head

# 4. If there are issues, downgrade and fix:
uv run alembic downgrade -1
# Edit the migration file
uv run alembic upgrade head

# 5. Commit the migration file to git
git add alembic/versions/2025_*.py
git commit -m "Add user preferences table migration"
```
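
As a sketch of what to expect, the auto-generated file for the hypothetical `add_user_preferences_table` example above might contain roughly the following (the column names and the `users.id` foreign key are invented; always compare the output against your actual models):

```python
import sqlalchemy as sa
from alembic import op


def upgrade() -> None:
    op.create_table(
        "user_preferences",
        sa.Column("id", sa.Integer(), primary_key=True),
        sa.Column("user_id", sa.Integer(), sa.ForeignKey("users.id")),
        sa.Column("key", sa.String(), nullable=False),
        sa.Column("value", sa.String(), nullable=True),
    )
    # Autogenerate can miss indexes; add them by hand if the model defines one.
    op.create_index("ix_user_preferences_user_id", "user_preferences", ["user_id"])


def downgrade() -> None:
    op.drop_index("ix_user_preferences_user_id", table_name="user_preferences")
    op.drop_table("user_preferences")
```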
## Troubleshooting
### "FAILED: Target database is not up to date"
This means pending migrations need to be applied:
```bash
uv run alembic upgrade head
```
### "Can't locate revision identified by 'xxxxx'"
The `alembic_version` table may be out of sync. Check what's in the database:
```sql
-- Connect to your database and run:
SELECT * FROM alembic_version;
```
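If the revision recorded there no longer exists in `alembic/versions/` (for example, a migration file was deleted or renamed), you can re-point the table at a known revision with `uv run alembic stamp head`. Note that `stamp` only rewrites the bookkeeping row; it does not run any migrations.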
### Migration conflicts after git merge
If two branches created migrations at the same time:
```bash
# Create a merge migration
uv run alembic merge heads -m "merge branches"
```
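
The merge revision itself is empty; its only job is to give the two heads a common descendant. The generated file looks roughly like this (revision IDs invented; Alembic fills in the real ones):

```python
"""merge branches

Revision ID: ffff00001111
Revises: aaaa11112222, bbbb33334444
"""

revision = "ffff00001111"
down_revision = ("aaaa11112222", "bbbb33334444")  # both former heads
branch_labels = None
depends_on = None


def upgrade() -> None:
    pass


def downgrade() -> None:
    pass
```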
### Need to reset migrations (DANGEROUS - ONLY FOR DEV)
```bash
# WARNING: This will delete all data!
uv run alembic downgrade base # Removes all tables
uv run alembic upgrade head # Recreates everything
```
## More Information
- [Alembic Documentation](https://alembic.sqlalchemy.org/)
- [Alembic Tutorial](https://alembic.sqlalchemy.org/en/latest/tutorial.html)
- See [../MIGRATION_REFACTORING.md](../MIGRATION_REFACTORING.md) for details on how this project uses Alembic

alembic/env.py Normal file

@@ -0,0 +1,125 @@
"""Alembic environment configuration for async SQLAlchemy."""

import asyncio
from logging.config import fileConfig

from alembic import context
from sqlalchemy import pool
from sqlalchemy.engine import Connection
from sqlalchemy.ext.asyncio import async_engine_from_config

# Import your models' Base to enable autogenerate
from alpine_bits_python.config_loader import load_config
from alpine_bits_python.db import (
    Base,
    configure_schema,
    get_database_schema,
    get_database_url,
)

# this is the Alembic Config object, which provides
# access to the values within the .ini file in use.
config = context.config

# Interpret the config file for Python logging.
# This line sets up loggers basically.
if config.config_file_name is not None:
    fileConfig(config.config_file_name)

# Load application config to get database URL and schema
try:
    app_config = load_config()
except (FileNotFoundError, KeyError, ValueError):
    # Fallback if config can't be loaded (e.g., during initial setup)
    app_config = {}

# Get database URL from application config
db_url = get_database_url(app_config)
if db_url:
    config.set_main_option("sqlalchemy.url", db_url)

# Get schema name from application config
schema_name = get_database_schema(app_config)
if schema_name:
    # Configure schema for all tables before migrations
    configure_schema(schema_name)

# add your model's MetaData object here for 'autogenerate' support
target_metadata = Base.metadata


def run_migrations_offline() -> None:
    """Run migrations in 'offline' mode.

    This configures the context with just a URL and not an Engine,
    though an Engine is acceptable here as well. By skipping the Engine
    creation we don't even need a DBAPI to be available.

    Calls to context.execute() here emit the given string to the
    script output.
    """
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
        version_table_schema=schema_name,  # Store alembic_version in our schema
        include_schemas=True,
    )

    with context.begin_transaction():
        context.run_migrations()


def do_run_migrations(connection: Connection) -> None:
    """Run migrations with the given connection."""
    context.configure(
        connection=connection,
        target_metadata=target_metadata,
        version_table_schema=schema_name,  # Store alembic_version in our schema
        include_schemas=True,  # Allow Alembic to work with non-default schemas
    )

    with context.begin_transaction():
        context.run_migrations()


async def run_async_migrations() -> None:
    """Run migrations in 'online' mode using an async engine.

    In this scenario we need to create an Engine and associate a
    connection with the context.
    """
    # Get the config section for sqlalchemy settings
    configuration = config.get_section(config.config_ini_section, {})

    # Add connect_args for PostgreSQL schema support if needed. The key
    # carries the "sqlalchemy." prefix so that async_engine_from_config
    # strips it and forwards connect_args to create_engine.
    if schema_name and "postgresql" in configuration.get("sqlalchemy.url", ""):
        configuration["sqlalchemy.connect_args"] = {
            "server_settings": {"search_path": f"{schema_name},public"}
        }

    # Create async engine
    connectable = async_engine_from_config(
        configuration,
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )

    async with connectable.connect() as connection:
        await connection.run_sync(do_run_migrations)

    await connectable.dispose()


def run_migrations_online() -> None:
    """Run migrations in 'online' mode - entry point."""
    asyncio.run(run_async_migrations())


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()

alembic/script.py.mako Normal file

@@ -0,0 +1,28 @@
"""${message}

Revision ID: ${up_revision}
Revises: ${down_revision | comma,n}
Create Date: ${create_date}

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa
${imports if imports else ""}

# revision identifiers, used by Alembic.
revision: str = ${repr(up_revision)}
down_revision: Union[str, Sequence[str], None] = ${repr(down_revision)}
branch_labels: Union[str, Sequence[str], None] = ${repr(branch_labels)}
depends_on: Union[str, Sequence[str], None] = ${repr(depends_on)}


def upgrade() -> None:
    """Upgrade schema."""
    ${upgrades if upgrades else "pass"}


def downgrade() -> None:
    """Downgrade schema."""
    ${downgrades if downgrades else "pass"}

@@ -0,0 +1,320 @@
"""Baseline existing database.

This migration handles the transition from the old manual migration system
to Alembic. It:

1. Detects if the old conversions table schema exists and recreates it with
   the new schema
2. Acts as a no-op for all other tables (assumes they already exist)

This allows existing databases to migrate to Alembic without data loss.

Revision ID: 94134e512a12
Revises:
Create Date: 2025-11-18 10:46:12.322570

"""
from collections.abc import Sequence

import sqlalchemy as sa
from alembic import op
from sqlalchemy import inspect

# revision identifiers, used by Alembic.
revision: str = "94134e512a12"
down_revision: str | None = None
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None


def upgrade() -> None:
    """Migrate existing database to Alembic management.

    This migration:
    - Drops and recreates the conversions/conversion_rooms tables with new schema
    - Assumes all other tables already exist (no-op for them)
    """
    conn = op.get_bind()
    inspector = inspect(conn)

    # Get schema from alembic context (set in env.py from config)
    from alpine_bits_python.config_loader import load_config
    from alpine_bits_python.db import get_database_schema

    try:
        app_config = load_config()
        schema = get_database_schema(app_config)
    except Exception:
        schema = None

    print(f"Using schema: {schema or 'public (default)'}")

    # Get tables from the correct schema
    existing_tables = set(inspector.get_table_names(schema=schema))
    print(f"Found existing tables in schema '{schema}': {existing_tables}")

    # Handle conversions table migration
    if "conversions" in existing_tables:
        columns = [
            col["name"] for col in inspector.get_columns("conversions", schema=schema)
        ]
        columns_set = set(columns)
        print(f"Found columns in conversions table: {sorted(columns_set)}")

        # Old schema indicators: these columns should NOT be in conversions anymore
        old_schema_columns = {
            "arrival_date",
            "departure_date",
            "room_status",
            "room_number",
            "sale_date",
            "revenue_total",
            "revenue_logis",
            "revenue_board",
        }
        intersection = old_schema_columns & columns_set

        # If ANY of the old denormalized columns exist, this is the old schema
        if intersection:
            # Old schema detected, drop and recreate
            print(
                "Detected old conversions schema with denormalized room data: "
                f"{intersection}"
            )

            # Drop conversion_rooms FIRST if it exists (due to foreign key constraint)
            if "conversion_rooms" in existing_tables:
                print("Dropping old conversion_rooms table...")
                op.execute(
                    f"DROP TABLE IF EXISTS {schema}.conversion_rooms CASCADE"
                    if schema
                    else "DROP TABLE IF EXISTS conversion_rooms CASCADE"
                )

            print("Dropping old conversions table...")
            op.execute(
                f"DROP TABLE IF EXISTS {schema}.conversions CASCADE"
                if schema
                else "DROP TABLE IF EXISTS conversions CASCADE"
            )

            # Drop any orphaned indexes that may have survived the table drop
            print("Dropping any orphaned indexes...")
            index_names = [
                "ix_conversions_advertising_campagne",
                "ix_conversions_advertising_medium",
                "ix_conversions_advertising_partner",
                "ix_conversions_customer_id",
                "ix_conversions_guest_email",
                "ix_conversions_guest_first_name",
                "ix_conversions_guest_last_name",
                "ix_conversions_hashed_customer_id",
                "ix_conversions_hotel_id",
                "ix_conversions_pms_reservation_id",
                "ix_conversions_reservation_id",
                "ix_conversion_rooms_arrival_date",
                "ix_conversion_rooms_conversion_id",
                "ix_conversion_rooms_departure_date",
                "ix_conversion_rooms_pms_hotel_reservation_id",
                "ix_conversion_rooms_room_number",
            ]
            for idx_name in index_names:
                op.execute(
                    f"DROP INDEX IF EXISTS {schema}.{idx_name}"
                    if schema
                    else f"DROP INDEX IF EXISTS {idx_name}"
                )

            print("Creating new conversions table with normalized schema...")
            create_conversions_table(schema)
            create_conversion_rooms_table(schema)
        else:
            print("Conversions table already has new schema, skipping migration")
    else:
        # No conversions table exists, create it
        print("No conversions table found, creating new schema...")
        create_conversions_table(schema)
        create_conversion_rooms_table(schema)

    print("Baseline migration complete!")


def create_conversions_table(schema: str | None = None) -> None:
    """Create the conversions table with the new normalized schema."""
    op.create_table(
        "conversions",
        sa.Column("id", sa.Integer(), nullable=False),
        sa.Column("reservation_id", sa.Integer(), nullable=True),
        sa.Column("customer_id", sa.Integer(), nullable=True),
        sa.Column("hashed_customer_id", sa.Integer(), nullable=True),
        sa.Column("hotel_id", sa.String(), nullable=True),
        sa.Column("pms_reservation_id", sa.String(), nullable=True),
        sa.Column("reservation_number", sa.String(), nullable=True),
        sa.Column("reservation_date", sa.Date(), nullable=True),
        sa.Column("creation_time", sa.DateTime(timezone=True), nullable=True),
        sa.Column("reservation_type", sa.String(), nullable=True),
        sa.Column("booking_channel", sa.String(), nullable=True),
        sa.Column("guest_first_name", sa.String(), nullable=True),
        sa.Column("guest_last_name", sa.String(), nullable=True),
        sa.Column("guest_email", sa.String(), nullable=True),
        sa.Column("guest_country_code", sa.String(), nullable=True),
        sa.Column("advertising_medium", sa.String(), nullable=True),
        sa.Column("advertising_partner", sa.String(), nullable=True),
        sa.Column("advertising_campagne", sa.String(), nullable=True),
        sa.Column("created_at", sa.DateTime(timezone=True), nullable=True),
        sa.Column("updated_at", sa.DateTime(timezone=True), nullable=True),
        sa.ForeignKeyConstraint(
            ["customer_id"],
            ["customers.id"],
        ),
        sa.ForeignKeyConstraint(
            ["hashed_customer_id"],
            ["hashed_customers.id"],
        ),
        sa.ForeignKeyConstraint(
            ["reservation_id"],
            ["reservations.id"],
        ),
        sa.PrimaryKeyConstraint("id"),
        schema=schema,
    )

    # Create indexes (all in the configured schema)
    op.create_index(
        op.f("ix_conversions_advertising_campagne"),
        "conversions",
        ["advertising_campagne"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversions_advertising_medium"),
        "conversions",
        ["advertising_medium"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversions_advertising_partner"),
        "conversions",
        ["advertising_partner"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversions_customer_id"),
        "conversions",
        ["customer_id"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversions_guest_email"),
        "conversions",
        ["guest_email"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversions_guest_first_name"),
        "conversions",
        ["guest_first_name"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversions_guest_last_name"),
        "conversions",
        ["guest_last_name"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversions_hashed_customer_id"),
        "conversions",
        ["hashed_customer_id"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversions_hotel_id"),
        "conversions",
        ["hotel_id"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversions_pms_reservation_id"),
        "conversions",
        ["pms_reservation_id"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversions_reservation_id"),
        "conversions",
        ["reservation_id"],
        unique=False,
        schema=schema,
    )


def create_conversion_rooms_table(schema: str | None = None) -> None:
    """Create the conversion_rooms table with the new normalized schema."""
    op.create_table(
        "conversion_rooms",
        sa.Column("id", sa.Integer(), nullable=False),
        sa.Column("conversion_id", sa.Integer(), nullable=False),
        sa.Column("pms_hotel_reservation_id", sa.String(), nullable=True),
        sa.Column("arrival_date", sa.Date(), nullable=True),
        sa.Column("departure_date", sa.Date(), nullable=True),
        sa.Column("room_status", sa.String(), nullable=True),
        sa.Column("room_type", sa.String(), nullable=True),
        sa.Column("room_number", sa.String(), nullable=True),
        sa.Column("num_adults", sa.Integer(), nullable=True),
        sa.Column("rate_plan_code", sa.String(), nullable=True),
        sa.Column("connected_room_type", sa.String(), nullable=True),
        sa.Column("daily_sales", sa.JSON(), nullable=True),
        sa.Column("total_revenue", sa.String(), nullable=True),
        sa.Column("created_at", sa.DateTime(timezone=True), nullable=True),
        sa.Column("updated_at", sa.DateTime(timezone=True), nullable=True),
        sa.ForeignKeyConstraint(
            ["conversion_id"], ["conversions.id"], ondelete="CASCADE"
        ),
        sa.PrimaryKeyConstraint("id"),
        schema=schema,
    )

    # Create indexes
    op.create_index(
        op.f("ix_conversion_rooms_arrival_date"),
        "conversion_rooms",
        ["arrival_date"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversion_rooms_conversion_id"),
        "conversion_rooms",
        ["conversion_id"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversion_rooms_departure_date"),
        "conversion_rooms",
        ["departure_date"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversion_rooms_pms_hotel_reservation_id"),
        "conversion_rooms",
        ["pms_hotel_reservation_id"],
        unique=False,
        schema=schema,
    )
    op.create_index(
        op.f("ix_conversion_rooms_room_number"),
        "conversion_rooms",
        ["room_number"],
        unique=False,
        schema=schema,
    )


def downgrade() -> None:
    """Downgrade not supported.

    This baseline migration drops data (old conversions schema) that can be
    recreated from PMS XML imports. Reverting would require re-importing.
    """
    # Fail loudly rather than silently reporting a successful downgrade.
    raise NotImplementedError("Downgrade of the baseline migration is not supported.")