Meta API Grabber
Async data collection system for Meta's Marketing API with TimescaleDB time-series storage and dashboard support.
docker build . -t gitea.99tales.net/jonas/meta_grabber:latest
Features
- OAuth2 Authentication - Automated token generation flow
- TimescaleDB Integration - Optimized time-series database for ad metrics
- Scheduled Collection - Periodic data grabbing (every 2 hours recommended)
- Metadata Caching - Smart caching of accounts, campaigns, and ad sets
- Async/await architecture for efficient API calls
- Conservative rate limiting (2s between requests, 1 concurrent request)
- Multi-level insights - Account, campaign, and ad set data
- Dashboard Ready - Includes Grafana setup for visualization
- Continuous Aggregates - Pre-computed hourly/daily rollups
- Data Compression - Automatic compression of older data
Quick Start
1. Install Dependencies
uv sync
2. Start TimescaleDB
docker-compose up -d
This starts:
- TimescaleDB on port 5432 (PostgreSQL-compatible)
- Grafana on port 3000 (for dashboards)
3. Configure Credentials
cp .env.example .env
Edit .env and add:
- META_APP_ID and META_APP_SECRET from Meta for Developers
- META_AD_ACCOUNT_ID from Meta Ads Manager (format: act_1234567890)
- DATABASE_URL is pre-configured for local Docker setup
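A quick sanity check of these settings can catch a missing value before the first API call. The sketch below is not part of the project: the function name find_config_problems is hypothetical, and it assumes the values from .env have already been loaded into a mapping such as os.environ.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ("META_APP_ID", "META_APP_SECRET", "META_AD_ACCOUNT_ID", "DATABASE_URL")


def find_config_problems(env):
    """Return a list of problems; an empty list means the config looks usable."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    account_id = env.get("META_AD_ACCOUNT_ID", "")
    # Ad account IDs are expected in the form act_<digits>, e.g. act_1234567890.
    if account_id and not (account_id.startswith("act_") and account_id[4:].isdigit()):
        problems.append("META_AD_ACCOUNT_ID should look like act_1234567890")
    return problems


# Example: report anything missing from the current process environment.
for problem in find_config_problems(os.environ):
    print(problem)
```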
4. Get Long-Lived Access Token
OAuth2 Flow (Recommended - Gets 60-day token)
uv run python src/meta_api_grabber/auth.py
This will:
- Open OAuth2 authorization in your browser
- Exchange the code for a short-lived token
- Automatically exchange for a long-lived token (60 days)
- Save token to .env
- Save token metadata to .meta_token.json (for auto-refresh)
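For orientation, the short-lived to long-lived exchange is a single GET request against the Graph API's oauth/access_token endpoint with grant_type=fb_exchange_token. The sketch below only builds that URL. It is not the code in auth.py; the function name and the API version string are assumptions.

```python
from urllib.parse import urlencode

GRAPH_API_VERSION = "v19.0"  # assumption: use whichever version the project targets


def build_long_lived_token_url(app_id, app_secret, short_lived_token,
                               version=GRAPH_API_VERSION):
    """Build the Graph API URL that swaps a short-lived token for a long-lived one."""
    query = urlencode({
        "grant_type": "fb_exchange_token",
        "client_id": app_id,
        "client_secret": app_secret,
        "fb_exchange_token": short_lived_token,
    })
    return f"https://graph.facebook.com/{version}/oauth/access_token?{query}"
```

A GET on the resulting URL should return JSON containing the new access_token. Because the URL carries the app secret, it should never be logged.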
Manual Token (Not Recommended)
- Get a token from Graph API Explorer
- Add it to .env as META_ACCESS_TOKEN
- Note: Manual tokens won't have auto-refresh capability
5. Start Scheduled Collection
uv run python src/meta_api_grabber/scheduled_grabber.py
This will:
- Automatically refresh tokens before they expire (checks every cycle)
- Collect data every 2 hours using the today date preset (recommended by Meta)
- Cache metadata (accounts, campaigns, ad sets) twice daily
- Store time-series data in TimescaleDB
- Use upsert strategy to handle updates
Usage Modes
1. Scheduled Collection (Recommended for Dashboards)
uv run python src/meta_api_grabber/scheduled_grabber.py
- Runs continuously, collecting data every 2 hours
- Stores data in TimescaleDB for dashboard visualization
- Uses today date preset (recommended by Meta)
- Caches metadata to reduce API calls
2. One-Time Data Export (JSON)
uv run python src/meta_api_grabber/insights_grabber.py
- Fetches insights for the last 7 days
- Saves to data/meta_insights_TIMESTAMP.json
- Good for ad-hoc analysis or testing
3. OAuth2 Authentication
uv run python src/meta_api_grabber/auth.py
- Interactive flow to get long-lived token (60 days)
- Saves token to .env and metadata to .meta_token.json
4. Check Token Status
uv run python src/meta_api_grabber/token_manager.py
- Shows token expiry and validity
- Manually refresh if needed
Data Collected
Account Level
- Impressions, clicks, spend
- CPC, CPM, CTR
- Reach, frequency
- Actions and cost per action
Campaign Level (top 10)
- Campaign name and ID
- Impressions, clicks, spend
- CTR, CPC
Ad Set Level (top 10)
- Ad set name and ID
- Impressions, clicks, spend
- CTR, CPM
Database Schema
Time-Series Tables (Hypertables)
- account_insights - Account-level metrics over time
- campaign_insights - Campaign-level metrics over time
- adset_insights - Ad set level metrics over time
Metadata Tables (Cached)
- ad_accounts - Account metadata
- campaigns - Campaign metadata
- adsets - Ad set metadata
Continuous Aggregates
- account_insights_hourly - Hourly rollups
- account_insights_daily - Daily rollups
Features
- Automatic partitioning by day (chunk_time_interval = 1 day)
- Compression for data older than 7 days
- Indexes on account_id, campaign_id, adset_id + time
- Upsert strategy to handle duplicate/updated data
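To make these features concrete, here is a rough sketch of the TimescaleDB statements and the upsert they imply. It is not the project's schema file. The time column name, the metric columns, the segment-by column, and the $n placeholder style (as used by asyncpg) are all assumptions.

```python
# Assumed metric columns; the project's real schema may differ.
METRIC_COLUMNS = ("impressions", "clicks", "spend")

# One-day chunks plus compression of data older than 7 days (assumes a "time" column).
SETUP_STATEMENTS = (
    "SELECT create_hypertable('account_insights', 'time', "
    "chunk_time_interval => INTERVAL '1 day');",
    "ALTER TABLE account_insights SET (timescaledb.compress, "
    "timescaledb.compress_segmentby = 'account_id');",
    "SELECT add_compression_policy('account_insights', INTERVAL '7 days');",
)


def build_upsert(table, key_columns, metric_columns=METRIC_COLUMNS):
    """Build an INSERT ... ON CONFLICT statement that overwrites metrics for an existing key."""
    columns = tuple(key_columns) + tuple(metric_columns)
    placeholders = ", ".join(f"${i}" for i in range(1, len(columns) + 1))
    updates = ", ".join(f"{col} = EXCLUDED.{col}" for col in metric_columns)
    return (
        f"INSERT INTO {table} ({', '.join(columns)}) "
        f"VALUES ({placeholders}) "
        f"ON CONFLICT ({', '.join(key_columns)}) DO UPDATE SET {updates}"
    )
```

The ON CONFLICT clause only works if a unique index exists on the key columns, and on a hypertable that index has to include the time column.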
Dashboard Setup
Access Grafana
- Open http://localhost:3000
- Login with admin/admin
- Add TimescaleDB as data source:
  - Type: PostgreSQL
  - Host: timescaledb:5432
  - Database: meta_insights
  - User: meta_user
  - Password: meta_password
  - TLS/SSL Mode: disable
Example Queries
Latest Account Metrics:
SELECT * FROM latest_account_metrics WHERE account_id = 'act_your_id';
Campaign Performance (Last 24h):
SELECT * FROM campaign_performance_24h ORDER BY total_spend DESC;
Hourly Trend:
SELECT bucket, avg_impressions, avg_clicks, avg_spend
FROM account_insights_hourly
WHERE account_id = 'act_your_id'
AND bucket >= NOW() - INTERVAL '7 days'
ORDER BY bucket;
Rate Limiting & Backoff
The system implements Meta's best practices for rate limiting:
Intelligent Rate Limiting
- Monitors x-fb-ads-insights-throttle header from every API response
- Tracks both app-level and account-level usage percentages
- Auto-throttles when usage exceeds 75%
- Progressive delays based on usage (75%: 2x, 85%: 3x, 90%: 5x, 95%: 10x)
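The throttling rules above could be sketched roughly as follows. This is an illustration, not the project's code. It assumes the header value is a JSON object with app_id_util_pct and acc_id_util_pct fields, and that a usage value exactly on a threshold already triggers that step.

```python
import json

# (threshold percent, delay multiplier), highest threshold first.
THROTTLE_STEPS = ((95, 10), (90, 5), (85, 3), (75, 2))


def usage_from_header(header_value):
    """Return the higher of the app-level and account-level usage percentages."""
    if not header_value:
        return 0.0
    data = json.loads(header_value)
    return float(max(data.get("app_id_util_pct", 0), data.get("acc_id_util_pct", 0)))


def delay_multiplier(usage_pct):
    """Map a usage percentage to the factor applied to the base delay."""
    for threshold, multiplier in THROTTLE_STEPS:
        if usage_pct >= threshold:
            return multiplier
    return 1  # below 75%: no extra throttling
```

With the 2-second base delay, a response reporting 88% usage would stretch the next wait to 6 seconds.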
Exponential Backoff
- Automatic retries on rate limit errors (up to 5 attempts)
- Exponential backoff: 2s → 4s → 8s → 16s → 32s
- Max backoff: 5 minutes
- Recognizes Meta error codes 17 and 80004
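A minimal sketch of this retry policy is shown below. MetaApiError and call_with_backoff are hypothetical names, not the project's API. The sketch reads "up to 5 attempts" as five retries after the first call, which matches the five delays listed above.

```python
import asyncio

RATE_LIMIT_CODES = {17, 80004}
BASE_DELAY = 2.0    # seconds
MAX_DELAY = 300.0   # 5 minutes
MAX_RETRIES = 5


class MetaApiError(Exception):
    """Hypothetical error type carrying Meta's numeric error code."""

    def __init__(self, code, message=""):
        super().__init__(message or f"Meta API error {code}")
        self.code = code


def backoff_delay(attempt):
    """Delay before retry number `attempt` (0-based): 2, 4, 8, ... capped at MAX_DELAY."""
    return min(BASE_DELAY * 2 ** attempt, MAX_DELAY)


async def call_with_backoff(request, sleep=asyncio.sleep):
    """Await request(); on a rate-limit error, wait and retry up to MAX_RETRIES times."""
    for attempt in range(MAX_RETRIES):
        try:
            return await request()
        except MetaApiError as err:
            if err.code not in RATE_LIMIT_CODES:
                raise  # not a rate-limit error: do not retry
            await sleep(backoff_delay(attempt))
    return await request()  # last try; any error propagates to the caller
```

The sleep parameter exists only so the waits can be replaced in tests. With these defaults the 5-minute cap is never reached; it only matters if the retry count is raised.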
Conservative Defaults
- 2 seconds base delay between API requests
- 1 concurrent request at a time
- Top 50 campaigns/adsets per collection
- 2 hour intervals between scheduled collections
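The first two defaults (a base delay and a single in-flight request) can be combined in a small asyncio wrapper. The class below is a hypothetical sketch, not the project's implementation. It measures the gap between request starts, which is an assumption about how the delay is applied.

```python
import asyncio
import time


class PacedClient:
    """Allow a limited number of requests at once and keep a minimum gap between starts."""

    def __init__(self, base_delay=2.0, max_concurrent=1):
        self._base_delay = base_delay
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._last_start = None

    async def run(self, request, multiplier=1):
        """Await request() once the slot is free and the (scaled) delay has elapsed."""
        async with self._semaphore:
            if self._last_start is not None:
                elapsed = time.monotonic() - self._last_start
                wait = self._base_delay * multiplier - elapsed
                if wait > 0:
                    await asyncio.sleep(wait)
            self._last_start = time.monotonic()
            return await request()
```

The multiplier argument is where a throttle-based factor (2x, 3x, 5x, 10x) would plug in. The client should be created inside the running event loop.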
Best Practices Applied
Based on Meta's official recommendations:
- ✅ Monitor rate limit headers
- ✅ Pace queries with wait times
- ✅ Implement backoff when approaching limits
- ✅ Use date presets (e.g., 'today') instead of custom ranges
- ✅ Limit query scope and metrics
Token Management
Automatic Token Refresh
The system automatically manages token lifecycle:
Token Types:
- Short-lived tokens: Valid for 1-2 hours (obtained from OAuth)
- Long-lived tokens: Valid for 60 days (automatically exchanged)
Auto-Refresh Logic:
- OAuth flow automatically exchanges for 60-day token
- Token metadata saved to .meta_token.json (includes expiry)
- Scheduled grabber checks token before each cycle
- Auto-refreshes when < 7 days remaining
- New token saved and API reinitialized seamlessly
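The refresh decision itself reduces to a date comparison. The sketch below illustrates it with a hypothetical needs_refresh helper, assuming the expiry is available as a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

REFRESH_WINDOW = timedelta(days=7)


def needs_refresh(expires_at, now=None):
    """Return True when fewer than 7 days remain before the token expires."""
    now = now or datetime.now(timezone.utc)
    return expires_at - now < REFRESH_WINDOW


# Example: a 60-day token issued today first becomes due for refresh after day 53.
issued = datetime.now(timezone.utc)
expires = issued + timedelta(days=60)
print(needs_refresh(expires, issued + timedelta(days=52)))  # False
print(needs_refresh(expires, issued + timedelta(days=54)))  # True
```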
Files Created:
- .env - Contains META_ACCESS_TOKEN (updated on refresh)
- .meta_token.json - Token metadata (expiry, issued_at, etc.)
- Both files are gitignored for security
Manual Token Operations:
Check token status:
uv run python src/meta_api_grabber/token_manager.py
Re-authenticate (if token expires):
uv run python src/meta_api_grabber/auth.py
Long-Running Collection: The scheduled grabber runs indefinitely without manual intervention. Token refresh happens automatically every ~53 days (7 days before the 60-day expiry).