# Timestamp Logic for Meta Insights Data

## Overview

The system assigns each data point a timestamp based on the `date_preset` and the account timezone. This keeps day-by-day plots accurate despite Meta's timezone-based data reporting.

## Key Concepts

### Meta's Timezone Behavior

Meta API reports data based on the **ad account's timezone**:

- "today" = today in the account's timezone
- "yesterday" = yesterday in the account's timezone
- An account in `America/Los_Angeles` (PST/PDT) can have a different "today" date than an account in `Europe/London` (GMT/BST)

### The Timestamp Challenge

When storing time-series data, we need timestamps that:

1. Reflect the actual date of the data (not when we fetched it)
2. Account for the ad account's timezone
3. Allow for accurate day-by-day plotting
4. Use current time for "today" (live, constantly updating data)
5. Use historical timestamps for past data (fixed point in time)

## Implementation

### The `_compute_timestamp()` Method

Located in [scheduled_grabber.py](src/meta_api_grabber/scheduled_grabber.py), this method computes the appropriate timestamp for each data point:

```python
from datetime import datetime
from typing import Optional

def _compute_timestamp(
    self,
    date_preset: str,
    date_start_str: Optional[str],
    account_timezone: str
) -> datetime:
    """
    Compute the appropriate timestamp for storing insights data.

    For 'today': Use current time (data is live, constantly updating)
    For historical presets: Use noon of that date in the account's timezone,
    then convert to UTC for storage
    """
```

### Logic Flow

#### For "today" Data:

```
date_preset = "today"
    ↓
Use datetime.now(timezone.utc)
    ↓
Store with current timestamp
    ↓
Each fetch during the day stores a new row
(the `time` values differ, so ON CONFLICT does not trigger)
```

**Why**: Today's data changes throughout the day. Using the current time shows when each snapshot was fetched, so the latest row is the most recent update.
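Because "today" is resolved in the account's timezone, the same fetch instant can fall on different calendar dates for different accounts. The following is a minimal sketch of that effect; `account_today` is a hypothetical helper used only for illustration and is not part of the codebase.

```python
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo


def account_today(now_utc: datetime, account_timezone: str) -> date:
    """Hypothetical helper: the calendar date "today" refers to for an account."""
    return now_utc.astimezone(ZoneInfo(account_timezone)).date()


# One fetch instant, two accounts in different timezones.
fetch_time = datetime(2025, 10, 21, 2, 0, tzinfo=timezone.utc)
print(account_today(fetch_time, "America/Los_Angeles"))  # 2025-10-20
print(account_today(fetch_time, "Europe/London"))        # 2025-10-21
```

At 02:00 UTC the Los Angeles account is still on October 20, while the London account has already moved to October 21.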
#### For Historical Data (e.g., "yesterday"):

```
date_preset = "yesterday"
date_start = "2025-10-20"
account_timezone = "America/Los_Angeles"
    ↓
Create datetime: 2025-10-20 12:00:00 in PDT
    ↓
Convert to UTC: 2025-10-20 19:00:00 UTC
(Los Angeles is on PDT, UTC-7, in October)
    ↓
Store with this timestamp
    ↓
Data point will plot on the correct day
```

**Why**: Historical data is fixed. Using noon in the account's timezone ensures:

1. The timestamp falls on the correct calendar day
2. Timezone differences don't cause data to appear on wrong days
3. Consistent time (noon) for all historical data points

### Timezone Handling

Account timezones are:

1. **Cached during metadata collection** in the `ad_accounts` table
2. **Retrieved from database** using `_get_account_timezone()`
3. **Cached in memory** to avoid repeated database queries

Example timezone conversion:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Account in Los Angeles (PST/PDT = UTC-8/UTC-7)
date_start = "2025-10-20"  # Yesterday in account timezone
account_tz = ZoneInfo("America/Los_Angeles")

# Create datetime at noon LA time
timestamp_local = datetime(2025, 10, 20, 12, 0, 0, tzinfo=account_tz)
# Result: 2025-10-20 12:00:00-07:00 (PDT)

# Convert to UTC for storage
timestamp_utc = timestamp_local.astimezone(timezone.utc)
# Result: 2025-10-20 19:00:00+00:00 (UTC)
```

## Examples

### Example 1: Same Account, Multiple Days

**Ad Account**: `act_123` in `America/New_York` (EDT = UTC-4 on this date; EST = UTC-5 applies in winter)

**Scenario**:

- Fetch "yesterday" data on Oct 21, 2025
- `date_start` from API: `"2025-10-20"`

**Timestamp Calculation**:

```
2025-10-20 12:00:00 EDT (noon in NY)
    ↓ convert to UTC
2025-10-20 16:00:00 UTC (stored in database)
```

**Result**: Data plots on October 20 in both the account's timezone and UTC

### Example 2: Different Timezones

**Account A**: `America/Los_Angeles` (PDT = UTC-7)
**Account B**: `Europe/London` (BST = UTC+1)

Both fetch "yesterday" on Oct 21, 2025:

| Account | date_start | Local Time | UTC Stored |
|---------|-----------|------------|------------|
| A (LA) | 2025-10-20 | 12:00 PDT | 19:00 UTC |
| B (London) | 2025-10-20 | 12:00 BST | 11:00 UTC |

**Result**: Both plot on October 20, even though stored at different UTC times

### Example 3: "Today" Data Updates

**Account**: Any timezone
**Fetches**: Every 2 hours

| Fetch Time (UTC) | date_preset | date_start | Stored Timestamp |
|-----------------|-------------|------------|------------------|
| 08:00 UTC | "today" | 2025-10-21 | 08:00 UTC (current) |
| 10:00 UTC | "today" | 2025-10-21 | 10:00 UTC (current) |
| 12:00 UTC | "today" | 2025-10-21 | 12:00 UTC (current) |

**Result**: Latest data always has the most recent timestamp, showing when it was fetched

## Database Schema Implications

### Primary Key Constraint

All insights tables use:

```sql
PRIMARY KEY (time, account_id) -- or (time, campaign_id), etc.
```

With `ON CONFLICT DO UPDATE`:

```sql
INSERT INTO account_insights (time, account_id, ...)
VALUES (...)
ON CONFLICT (time, account_id) DO UPDATE SET
    impressions = EXCLUDED.impressions,
    spend = EXCLUDED.spend,
    ...
```

### Behavior by Date Preset

**"today" data**:

- Multiple fetches in same day have different timestamps
- No conflicts (different `time` values)
- Creates multiple rows, building time-series
- Can see data evolution throughout the day

**"yesterday" data**:

- All fetches use same timestamp (noon in account TZ)
- Conflicts occur (same `time` value)
- Updates existing row with fresh data
- Only keeps latest version

## Querying Data

### Query by Day (Recommended)

```sql
-- Get all data for a specific date range
SELECT
    DATE(time AT TIME ZONE 'America/Los_Angeles') as data_date,
    account_id,
    AVG(spend) as avg_spend,
    MAX(impressions) as max_impressions
FROM account_insights
WHERE time >= '2025-10-15' AND time < '2025-10-22'
GROUP BY data_date, account_id
ORDER BY data_date DESC;
```

### Filter by Date Preset

```sql
-- Get only historical (yesterday) data
SELECT * FROM account_insights
WHERE date_preset = 'yesterday'
ORDER BY time DESC;

-- Get only live (today) data
SELECT * FROM account_insights
WHERE date_preset = 'today'
ORDER BY time DESC;
```

## Plotting Considerations

When creating day-by-day plots:

### Option 1: Use `date_start` Field

```sql
-- Note: "today" rows are repeated snapshots of the same day;
-- filter or de-duplicate them before summing to avoid double counting
SELECT
    date_start, -- Already a DATE type
    SUM(spend) as total_spend
FROM account_insights
GROUP BY date_start
ORDER BY date_start;
```

### Option 2: Extract Date from Timestamp

```sql
-- Note: if `time` is a timestamptz, DATE(time) uses the session timezone;
-- use AT TIME ZONE (as in "Query by Day") to get the account's local date
SELECT
    DATE(time) as data_date, -- Convert timestamp to date
    SUM(spend) as total_spend
FROM account_insights
GROUP BY data_date
ORDER BY data_date;
```

### For "Today" Data (Multiple Points Per Day)

```sql
-- Get latest "today" data for each account
SELECT DISTINCT ON (account_id)
    account_id,
    time,
    spend,
    impressions
FROM account_insights
WHERE date_preset = 'today'
ORDER BY account_id, time DESC;
```

## Benefits

1. **Accurate Day Assignment**: Historical data always plots on correct calendar day
2. **Timezone Aware**: Respects Meta's timezone-based reporting
3. **Live Updates**: "Today" data shows progression throughout the day
4. **Historical Accuracy**: Yesterday data uses consistent timestamp
5. **Update Tracking**: Can see when "today" data was last refreshed
6. **Query Flexibility**: Can query by `date_start` or extract date from `time`

## Troubleshooting

### Data Appears on Wrong Day

**Symptom**: Yesterday's data shows on wrong day in graphs

**Cause**: Timezone not being considered

**Solution**: `_compute_timestamp()` already uses the account timezone when storing data. If the problem persists, check the timezone stored for the account in `ad_accounts` and the timezone applied by the query or plotting tool.

### Multiple Entries for Yesterday

**Symptom**: Multiple rows for same account and yesterday's date

**Cause**: Database conflict resolution not working

**Check**:

- Primary key includes `time` and `account_id`
- ON CONFLICT clause exists in insert statements
- Timestamp is actually the same (should be: noon in account TZ)

### Timezone Errors

**Symptom**: `ZoneInfo` errors or invalid timezone names

**Cause**: Invalid timezone in database or missing timezone data

**Solution**: The code falls back to the current UTC time if the timezone cannot be parsed

```python
# Excerpt from the timezone handling code
except Exception as e:
    print(f"Warning: Could not parse timezone '{account_timezone}': {e}")
    return datetime.now(timezone.utc)
```

## Summary

The timestamp logic ensures:

- ✅ "Today" data uses current time (live updates)
- ✅ Historical data uses noon in account's timezone
- ✅ Timezone conversions handled automatically
- ✅ Data plots correctly day-by-day
- ✅ Account timezone cached for performance
- ✅ Fallback handling for missing/invalid timezones

This provides accurate, timezone-aware time-series data ready for visualization!
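For reference, the rules summarized above can be sketched as one standalone function. This is an illustrative approximation, not the actual body of `_compute_timestamp()`: the name `compute_timestamp_sketch` is hypothetical, and the handling of a missing `date_start` (treated like "today") is an assumption.

```python
from datetime import datetime, time, timezone
from typing import Optional
from zoneinfo import ZoneInfo


def compute_timestamp_sketch(
    date_preset: str,
    date_start_str: Optional[str],
    account_timezone: str,
) -> datetime:
    """Hypothetical standalone version of the timestamp rules in this document."""
    # Live data: stamp with the fetch time.
    # Assumption: a missing date_start is handled the same way.
    if date_preset == "today" or not date_start_str:
        return datetime.now(timezone.utc)

    try:
        account_tz = ZoneInfo(account_timezone)
        day = datetime.strptime(date_start_str, "%Y-%m-%d").date()
    except Exception as exc:
        # Fallback described under "Timezone Errors": use the current UTC time.
        print(f"Warning: could not compute timestamp for '{account_timezone}': {exc}")
        return datetime.now(timezone.utc)

    # Historical data: noon on that date in the account's timezone, stored as UTC.
    local_noon = datetime.combine(day, time(12, 0), tzinfo=account_tz)
    return local_noon.astimezone(timezone.utc)


print(compute_timestamp_sketch("yesterday", "2025-10-20", "America/Los_Angeles"))
# 2025-10-20 19:00:00+00:00
print(compute_timestamp_sketch("yesterday", "2025-10-20", "Europe/London"))
# 2025-10-20 11:00:00+00:00
```

Repeated calls for the same historical date return the same value, which is what lets the `ON CONFLICT` clause update the existing row instead of inserting a duplicate.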