Meta API Grabber
Async data collection system for Meta's Marketing API with TimescaleDB time-series storage and dashboard support.
Features
- OAuth2 Authentication - Automated token generation flow
- TimescaleDB Integration - Optimized time-series database for ad metrics
- Scheduled Collection - Periodic data grabbing (every 2 hours recommended)
- Metadata Caching - Smart caching of accounts, campaigns, and ad sets
- Async/await architecture for efficient API calls
- Conservative rate limiting (2s between requests, 1 concurrent request)
- Multi-level insights - Account, campaign, and ad set data
- Dashboard Ready - Includes Grafana setup for visualization
- Continuous Aggregates - Pre-computed hourly/daily rollups
- Data Compression - Automatic compression of older data
Quick Start
1. Install Dependencies
uv sync
2. Start TimescaleDB
docker-compose up -d
This starts:
- TimescaleDB on port 5432 (PostgreSQL-compatible)
- Grafana on port 3000 (for dashboards)
3. Configure Credentials
cp .env.example .env
Edit .env and add:
- META_APP_ID and META_APP_SECRET from Meta for Developers
- META_AD_ACCOUNT_ID from Meta Ads Manager (format: act_1234567890)
- DATABASE_URL is pre-configured for local Docker setup
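A quick sanity check of these settings can catch a missing key or a malformed account ID before the first API call. The following is a minimal sketch, not part of the project: the function name check_config is hypothetical, and it assumes the values from .env have already been loaded into a mapping such as os.environ.

```python
import os

# Keys named in the configuration step above.
REQUIRED_KEYS = ("META_APP_ID", "META_APP_SECRET", "META_AD_ACCOUNT_ID", "DATABASE_URL")


def check_config(env):
    """Return a list of human-readable problems found in a settings mapping."""
    problems = [f"{key} is missing or empty" for key in REQUIRED_KEYS if not env.get(key)]
    account_id = env.get("META_AD_ACCOUNT_ID", "")
    # The documented format is the prefix "act_" followed by digits.
    if account_id and not (account_id.startswith("act_") and account_id[4:].isdigit()):
        problems.append("META_AD_ACCOUNT_ID should look like act_1234567890")
    return problems


if __name__ == "__main__":
    for problem in check_config(os.environ):
        print(problem)
```

An empty list means the four settings are present and the account ID has the expected shape; it does not verify that the credentials are valid.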
4. Get Access Token
Option A: OAuth2 Flow (Recommended)
uv run python src/meta_api_grabber/auth.py
Follow the prompts to authorize and save your token.
Option B: Manual Token
- Get a token from Graph API Explorer
- Add it to .env as META_ACCESS_TOKEN
5. Start Scheduled Collection
uv run python src/meta_api_grabber/scheduled_grabber.py
This will:
- Collect data every 2 hours using the "today" date preset (recommended by Meta)
- Cache metadata (accounts, campaigns, ad sets) twice daily
- Store time-series data in TimescaleDB
- Use upsert strategy to handle updates
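The periodic behaviour described above can be pictured as a simple asyncio loop. This is an illustrative sketch only and may differ from what scheduled_grabber.py actually does; run_every and its max_runs parameter are hypothetical names introduced here so the loop can be stopped in tests.

```python
import asyncio


async def run_every(interval_seconds, job, max_runs=None):
    """Await `job()` repeatedly, sleeping `interval_seconds` between runs.

    Runs forever when `max_runs` is None. Returns the number of runs attempted.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            await job()
        except Exception as exc:
            # A failed collection is reported but does not stop the schedule.
            print(f"collection failed: {exc!r}")
        runs += 1
        if max_runs is None or runs < max_runs:
            await asyncio.sleep(interval_seconds)
    return runs


async def collect_once():
    # Placeholder for one collection pass (fetch insights, then store them).
    print("collecting...")


if __name__ == "__main__":
    # Two hours between runs, matching the interval recommended above.
    asyncio.run(run_every(2 * 60 * 60, collect_once))
```

Catching exceptions inside the loop is a design choice for long-running collectors: one transient API error should not end a process that is expected to run for days.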
Usage Modes
1. Scheduled Collection (Recommended for Dashboards)
uv run python src/meta_api_grabber/scheduled_grabber.py
- Runs continuously, collecting data every 2 hours
- Stores data in TimescaleDB for dashboard visualization
- Uses the "today" date preset (recommended by Meta)
- Caches metadata to reduce API calls
2. One-Time Data Export (JSON)
uv run python src/meta_api_grabber/insights_grabber.py
- Fetches insights for the last 7 days
- Saves to data/meta_insights_TIMESTAMP.json
- Good for ad-hoc analysis or testing
3. OAuth2 Authentication
uv run python src/meta_api_grabber/auth.py
- Interactive flow to get access token
- Saves the token to .env automatically
Data Collected
Account Level
- Impressions, clicks, spend
- CPC, CPM, CTR
- Reach, frequency
- Actions and cost per action
Campaign Level (top 10)
- Campaign name and ID
- Impressions, clicks, spend
- CTR, CPC
Ad Set Level (top 10)
- Ad set name and ID
- Impressions, clicks, spend
- CTR, CPM
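The ratio metrics listed above (CTR, CPC, CPM) are returned by the API, but they follow standard definitions that can be recomputed from impressions, clicks, and spend, which is handy for sanity-checking stored rows. A minimal sketch, assuming CTR is expressed as a percentage; the function name derived_metrics is hypothetical:

```python
def derived_metrics(impressions, clicks, spend):
    """Recompute CTR (%), CPC, and CPM from raw counts.

    A metric is None when its denominator is zero.
    """
    ctr = clicks / impressions * 100 if impressions else None  # clicks per 100 impressions
    cpc = spend / clicks if clicks else None                   # cost per click
    cpm = spend / impressions * 1000 if impressions else None  # cost per 1000 impressions
    return {"ctr": ctr, "cpc": cpc, "cpm": cpm}


if __name__ == "__main__":
    print(derived_metrics(impressions=10_000, clicks=200, spend=50.0))
```

Small differences from the API's own values are possible, since the API may round or use a different click definition.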
Database Schema
Time-Series Tables (Hypertables)
- account_insights - Account-level metrics over time
- campaign_insights - Campaign-level metrics over time
- adset_insights - Ad set level metrics over time
Metadata Tables (Cached)
- ad_accounts - Account metadata
- campaigns - Campaign metadata
- adsets - Ad set metadata
Continuous Aggregates
- account_insights_hourly - Hourly rollups
- account_insights_daily - Daily rollups
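In the database these rollups are maintained by TimescaleDB itself. Purely to illustrate the arithmetic behind an hourly bucket average, here is a small Python sketch; the function name hourly_average and the sample data are hypothetical and are not part of the project.

```python
from collections import defaultdict
from datetime import datetime


def hourly_average(samples):
    """Group (timestamp, value) pairs into hour buckets and average each bucket."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # Truncate the timestamp to the start of its hour.
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return {bucket: sum(vals) / len(vals) for bucket, vals in sorted(buckets.items())}


if __name__ == "__main__":
    samples = [
        (datetime(2024, 1, 1, 10, 5), 100),
        (datetime(2024, 1, 1, 10, 50), 300),
        (datetime(2024, 1, 1, 11, 0), 50),
    ]
    for bucket, avg in hourly_average(samples).items():
        print(bucket.isoformat(), avg)
```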
Features
- Automatic partitioning by day (chunk_time_interval = 1 day)
- Compression for data older than 7 days
- Indexes on account_id, campaign_id, adset_id + time
- Upsert strategy to handle duplicate/updated data
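The upsert strategy mentioned above can be expressed with PostgreSQL's INSERT ... ON CONFLICT ... DO UPDATE. The sketch below only builds such a statement as a string; build_upsert is a hypothetical helper, the column names in the example are assumptions rather than the project's actual schema, and the $1-style placeholders assume a driver that uses PostgreSQL's native parameter syntax.

```python
def build_upsert(table, key_columns, value_columns):
    """Build an INSERT ... ON CONFLICT ... DO UPDATE statement.

    `key_columns` must match a unique index on the table; on a hypertable that
    index has to include the time column. Identifiers are interpolated as-is,
    so they must come from trusted code, never from user input.
    """
    columns = list(key_columns) + list(value_columns)
    placeholders = ", ".join(f"${i}" for i in range(1, len(columns) + 1))
    updates = ", ".join(f"{col} = EXCLUDED.{col}" for col in value_columns)
    return (
        f"INSERT INTO {table} ({', '.join(columns)}) "
        f"VALUES ({placeholders}) "
        f"ON CONFLICT ({', '.join(key_columns)}) DO UPDATE SET {updates}"
    )


if __name__ == "__main__":
    # Column names here are illustrative assumptions.
    print(build_upsert("account_insights", ["time", "account_id"], ["impressions", "spend"]))
```

Because the conflicting row is overwritten with the incoming values, re-collecting the same time bucket updates the stored metrics instead of creating duplicates.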
Dashboard Setup
Access Grafana
- Open http://localhost:3000
- Login with admin/admin
- Add TimescaleDB as data source:
  - Type: PostgreSQL
  - Host: timescaledb:5432
  - Database: meta_insights
  - User: meta_user
  - Password: meta_password
  - TLS/SSL Mode: disable
Example Queries
Latest Account Metrics:
SELECT * FROM latest_account_metrics WHERE account_id = 'act_your_id';
Campaign Performance (Last 24h):
SELECT * FROM campaign_performance_24h ORDER BY total_spend DESC;
Hourly Trend:
SELECT bucket, avg_impressions, avg_clicks, avg_spend
FROM account_insights_hourly
WHERE account_id = 'act_your_id'
AND bucket >= NOW() - INTERVAL '7 days'
ORDER BY bucket;
Rate Limiting
The system is configured to be very conservative:
- 2 seconds delay between API requests
- Only 1 concurrent request at a time
- Top 50 campaigns/adsets per collection
- 2 hour intervals between collections
These settings are intended to keep you well within Meta's API rate limits.
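The first two settings (a fixed delay and a single in-flight request) can be combined in a small asyncio helper. This is a sketch under stated assumptions, not the project's implementation: the RateLimiter class is hypothetical, and it spaces the start of consecutive requests by at least min_interval seconds.

```python
import asyncio
import time


class RateLimiter:
    """Limit concurrency and enforce a minimum gap between request starts."""

    def __init__(self, min_interval=2.0, max_concurrent=1):
        self._min_interval = min_interval
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._last_start = None

    async def run(self, make_request):
        """Await `make_request()` once the limiter allows it and return its result."""
        async with self._semaphore:
            if self._last_start is not None:
                wait = self._min_interval - (time.monotonic() - self._last_start)
                if wait > 0:
                    await asyncio.sleep(wait)
            self._last_start = time.monotonic()
            return await make_request()


async def main():
    # Create the limiter inside the running event loop.
    limiter = RateLimiter(min_interval=2.0, max_concurrent=1)

    async def fake_request():
        # Stand-in for a real API call.
        return time.strftime("%H:%M:%S")

    for stamp in await asyncio.gather(*(limiter.run(fake_request) for _ in range(3))):
        print("request started at", stamp)


if __name__ == "__main__":
    asyncio.run(main())
```

With max_concurrent=1 the semaphore serializes callers, so the delay check never races; raising the concurrency would need a more careful design.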