Meta API Grabber

Async data collection system for Meta's Marketing API with TimescaleDB time-series storage and dashboard support.

Features

  • OAuth2 Authentication - Automated token generation flow
  • TimescaleDB Integration - Optimized time-series database for ad metrics
  • Scheduled Collection - Periodic data grabbing (every 2 hours recommended)
  • Metadata Caching - Smart caching of accounts, campaigns, and ad sets
  • Async Architecture - async/await for efficient API calls
  • Conservative Rate Limiting - 2s between requests, 1 concurrent request
  • Multi-level insights - Account, campaign, and ad set data
  • Dashboard Ready - Includes Grafana setup for visualization
  • Continuous Aggregates - Pre-computed hourly/daily rollups
  • Data Compression - Automatic compression of older data

Quick Start

1. Install Dependencies

uv sync

2. Start TimescaleDB

docker-compose up -d

This starts:

  • TimescaleDB on port 5432 (PostgreSQL-compatible)
  • Grafana on port 3000 (for dashboards)

3. Configure Credentials

cp .env.example .env

Edit .env and add:

  • META_APP_ID and META_APP_SECRET from Meta for Developers
  • META_AD_ACCOUNT_ID from Meta Ads Manager (format: act_1234567890)
  • DATABASE_URL is pre-configured for local Docker setup
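
Before starting a collection it can help to confirm that the .env file is complete. The following is a minimal sketch, not part of the project: the helper names parse_env and find_config_problems are hypothetical, and the act_<digits> check is an assumption based on the format shown above.

```python
import re

# Setting names come from the list above; the validation rules are assumptions.
REQUIRED_KEYS = ("META_APP_ID", "META_APP_SECRET", "META_AD_ACCOUNT_ID", "DATABASE_URL")
ACCOUNT_ID_PATTERN = re.compile(r"^act_\d+$")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_config_problems(values):
    """Return a list of human-readable problems; an empty list means the config looks usable."""
    problems = [f"{key} is missing or empty" for key in REQUIRED_KEYS if not values.get(key)]
    account_id = values.get("META_AD_ACCOUNT_ID", "")
    if account_id and not ACCOUNT_ID_PATTERN.match(account_id):
        problems.append("META_AD_ACCOUNT_ID should look like act_1234567890")
    return problems
```

A typical use would be to read the file with open(".env").read(), pass the text to parse_env, and print whatever find_config_problems returns.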

4. Get Access Token

Option A: OAuth2 Flow (Recommended)

uv run python src/meta_api_grabber/auth.py

Follow the prompts to authorize and save your token.

Option B: Manual Token

5. Start Scheduled Collection

uv run python src/meta_api_grabber/scheduled_grabber.py

This will:

  • Collect data every 2 hours using the today date preset (recommended by Meta)
  • Cache metadata (accounts, campaigns, ad sets) twice daily
  • Store time-series data in TimescaleDB
  • Use upsert strategy to handle updates
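
The collection cycle described above can be pictured as a simple asyncio loop. This is a sketch under stated assumptions rather than the code in scheduled_grabber.py: run_on_interval and its max_runs parameter are hypothetical, and the job passed in stands for whatever coroutine performs one collection.

```python
import asyncio

COLLECTION_INTERVAL_SECONDS = 2 * 60 * 60  # every 2 hours, as described above


async def run_on_interval(job, interval_seconds=COLLECTION_INTERVAL_SECONDS, max_runs=None):
    """Await `job()` repeatedly, sleeping `interval_seconds` between attempts.

    A failing attempt is reported and skipped so one API error does not stop the loop.
    `max_runs` exists only so the loop can be bounded in tests; None means run forever.
    Returns the number of attempts made.
    """
    attempts = 0
    while max_runs is None or attempts < max_runs:
        try:
            await job()
        except Exception as exc:  # keep the scheduler alive on errors
            print(f"collection failed: {exc!r}")
        attempts += 1
        if max_runs is None or attempts < max_runs:
            await asyncio.sleep(interval_seconds)
    return attempts
```

Catching the exception inside the loop is a design choice for a long-running collector: a transient failure costs one interval of data instead of stopping the process.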

Usage Modes

1. Scheduled Collection

uv run python src/meta_api_grabber/scheduled_grabber.py
  • Runs continuously, collecting data every 2 hours
  • Stores data in TimescaleDB for dashboard visualization
  • Uses today date preset (recommended by Meta)
  • Caches metadata to reduce API calls

2. One-Time Data Export (JSON)

uv run python src/meta_api_grabber/insights_grabber.py
  • Fetches insights for the last 7 days
  • Saves to data/meta_insights_TIMESTAMP.json
  • Good for ad-hoc analysis or testing

3. OAuth2 Authentication

uv run python src/meta_api_grabber/auth.py
  • Interactive flow to get access token
  • Saves token to .env automatically

Data Collected

Account Level

  • Impressions, clicks, spend
  • CPC, CPM, CTR
  • Reach, frequency
  • Actions and cost per action

Campaign Level (top 10)

  • Campaign name and ID
  • Impressions, clicks, spend
  • CTR, CPC

Ad Set Level (top 10)

  • Ad set name and ID
  • Impressions, clicks, spend
  • CTR, CPM
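
For readers unfamiliar with the ratio metrics listed above, the sketch below shows how CTR, CPC, and CPM are conventionally derived from impressions, clicks, and spend. The function name derive_metrics is hypothetical, and the sample numbers are made up for illustration; CTR is expressed as a percentage here.

```python
def derive_metrics(impressions, clicks, spend):
    """Compute conventional ad ratios, returning 0.0 where the denominator is zero."""
    ctr = 100 * clicks / impressions if impressions else 0.0   # click-through rate, in percent
    cpc = spend / clicks if clicks else 0.0                     # cost per click
    cpm = 1000 * spend / impressions if impressions else 0.0    # cost per 1,000 impressions
    return {"ctr": ctr, "cpc": cpc, "cpm": cpm}


# 10,000 impressions, 250 clicks, 125.00 spent
print(derive_metrics(10_000, 250, 125.0))
# → {'ctr': 2.5, 'cpc': 0.5, 'cpm': 12.5}
```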

Database Schema

Time-Series Tables (Hypertables)

  • account_insights - Account-level metrics over time
  • campaign_insights - Campaign-level metrics over time
  • adset_insights - Ad set level metrics over time

Metadata Tables (Cached)

  • ad_accounts - Account metadata
  • campaigns - Campaign metadata
  • adsets - Ad set metadata

Continuous Aggregates

  • account_insights_hourly - Hourly rollups
  • account_insights_daily - Daily rollups

Schema Features

  • Automatic partitioning by day (chunk_time_interval = 1 day)
  • Compression for data older than 7 days
  • Indexes on account_id, campaign_id, adset_id + time
  • Upsert strategy to handle duplicate/updated data
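
The upsert behaviour can be illustrated with a small, self-contained example. To keep it runnable without a database server, this sketch uses Python's built-in sqlite3 module as a stand-in for TimescaleDB; the columns shown are a hypothetical, simplified subset and the real account_insights schema may differ. PostgreSQL accepts the same INSERT ... ON CONFLICT ... DO UPDATE form, with its own placeholder syntax.

```python
import sqlite3

# Hypothetical, simplified columns; the real account_insights schema may differ.
CREATE_SQL = """
CREATE TABLE account_insights (
    time TEXT NOT NULL,
    account_id TEXT NOT NULL,
    impressions INTEGER,
    spend REAL,
    PRIMARY KEY (time, account_id)
)
"""

# On a key collision, overwrite the stored metrics with the newly fetched ones.
UPSERT_SQL = """
INSERT INTO account_insights (time, account_id, impressions, spend)
VALUES (?, ?, ?, ?)
ON CONFLICT (time, account_id) DO UPDATE SET
    impressions = excluded.impressions,
    spend = excluded.spend
"""


def upsert_row(conn, time, account_id, impressions, spend):
    """Insert a row, or update the existing row with the same (time, account_id)."""
    conn.execute(UPSERT_SQL, (time, account_id, impressions, spend))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(CREATE_SQL)
upsert_row(conn, "2025-10-21T10:00:00Z", "act_1234567890", 100, 1.50)
upsert_row(conn, "2025-10-21T10:00:00Z", "act_1234567890", 140, 2.10)  # same key: updates
rows = conn.execute("SELECT impressions, spend FROM account_insights").fetchall()
print(rows)  # a single row holding the updated values
```

Because the second call reuses the same time and account_id, the table ends up with one row rather than two, which is what makes repeated collections of the same period safe.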

Dashboard Setup

Access Grafana

  1. Open http://localhost:3000
  2. Log in with admin / admin
  3. Add TimescaleDB as data source:
    • Type: PostgreSQL
    • Host: timescaledb:5432
    • Database: meta_insights
    • User: meta_user
    • Password: meta_password
    • TLS/SSL Mode: disable

Example Queries

Latest Account Metrics:

SELECT * FROM latest_account_metrics WHERE account_id = 'act_your_id';

Campaign Performance (Last 24h):

SELECT * FROM campaign_performance_24h ORDER BY total_spend DESC;

Hourly Trend:

SELECT bucket, avg_impressions, avg_clicks, avg_spend
FROM account_insights_hourly
WHERE account_id = 'act_your_id'
  AND bucket >= NOW() - INTERVAL '7 days'
ORDER BY bucket;

Rate Limiting

The system is configured to be very conservative:

  • 2 seconds delay between API requests
  • Only 1 concurrent request at a time
  • Top 50 campaigns/adsets per collection
  • 2 hour intervals between collections

These settings are intended to keep usage well within Meta's API rate limits.
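
The first two settings above (a fixed delay between requests and a single request in flight) can be sketched with an asyncio lock. This is an illustrative sketch, not the project's implementation: the class name RequestThrottle is hypothetical, and make_request stands for any coroutine function that performs one API call.

```python
import asyncio
import time


class RequestThrottle:
    """Run one request at a time and wait a minimum delay after each one finishes."""

    def __init__(self, min_delay_seconds=2.0):
        self._min_delay = min_delay_seconds
        self._lock = None            # created lazily, inside the running event loop
        self._last_finished = None   # monotonic timestamp of the previous request's end

    async def run(self, make_request):
        if self._lock is None:
            self._lock = asyncio.Lock()
        async with self._lock:  # only one request in flight at a time
            if self._last_finished is not None:
                wait = self._min_delay - (time.monotonic() - self._last_finished)
                if wait > 0:
                    await asyncio.sleep(wait)
            try:
                return await make_request()
            finally:
                self._last_finished = time.monotonic()
```

Callers would share one RequestThrottle instance and wrap each API call as await throttle.run(some_coroutine_function). Even if several tasks are started together with asyncio.gather, the lock serializes them and the delay spaces them out.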
