Cloudflare R2 — Bucket Setup & Access Key Management

| Item | Status | Detail |
|---|---|---|
| urban-transparency-raw | ✅ Provisioned | Private bucket, WNAM region |
| urban-transparency-processed | ✅ Provisioned + Public | r2.dev URL active (see below) |
| Custom domain (data.civiscopio.com) | 📋 Backlog | Deferred: requires civiscopio.com zone in Cloudflare |
| R2 API keys (pipeline) | ✅ N/A | Pipeline uses CF_API_TOKEN directly (no S3 creds needed) |
| GitHub Actions Secrets | ⏳ Needed | CF_ACCOUNT_ID, CF_API_TOKEN, INEGI_API_TOKEN (see Step 2) |

Public URL (current):

https://pub-892f495399ba478cbe1375809c9e3cdc.r2.dev

The r2.dev URL is rate-limited and not recommended for high-traffic production use. Connect a custom domain when ready (see backlog).


Two R2 buckets serve the Urban Transparency Platform pipeline:

| Bucket | Purpose | Public Access |
|---|---|---|
| urban-transparency-raw | Raw ingested files (CSV, JSON, PDF, GeoJSON) | Private |
| urban-transparency-processed | Analysis-ready files served to site (Parquet, GeoJSON, chart JSON) | Public via r2.dev (custom domain TBD) |

Both buckets were provisioned by Terraform. Public access on urban-transparency-processed is enabled with the r2.dev URL above.


The pipeline now authenticates with CF_ACCOUNT_ID and CF_API_TOKEN directly (the same token you already have). No separate R2 API token is needed, because the ingest scripts were rewritten to use the Cloudflare REST API instead of the S3-compatible API.
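
To illustrate what a REST-based upload with only CF_ACCOUNT_ID and CF_API_TOKEN might look like, here is a minimal sketch. It is not the pipeline's actual ingest code. The function name `build_put_request` is hypothetical, and the object endpoint path is an assumption (it mirrors the path wrangler calls), so check it against the Cloudflare API reference before relying on it. The function only builds the request; it does not send anything.

```python
import urllib.parse
import urllib.request

# Assumption: base URL of the Cloudflare v4 REST API.
API_BASE = "https://api.cloudflare.com/client/v4"


def build_put_request(account_id, api_token, bucket, key, body):
    """Build (but do not send) a PUT request that would upload `body` to `bucket` at `key`."""
    # Assumption: this object endpoint path mirrors what wrangler uses;
    # verify it against the Cloudflare API reference.
    object_path = urllib.parse.quote(key)  # "/" is left intact, so key prefixes survive
    url = f"{API_BASE}/accounts/{account_id}/r2/buckets/{bucket}/objects/{object_path}"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/octet-stream",
        },
    )
```

In a workflow, the account ID and token would come from `os.environ["CF_ACCOUNT_ID"]` and `os.environ["CF_API_TOKEN"]`, and the request would be sent with `urllib.request.urlopen(req)`.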

Go to the repo → Settings → Secrets and variables → Actions → New repository secret. Add:

| Secret Name | Value |
|---|---|
| CF_ACCOUNT_ID | Your Cloudflare Account ID (24545b09d114aa250c46b2703991cd7e) |
| CF_API_TOKEN | Your CLOUDFLARE_API_TOKEN (the same one used for Terraform) |
| INEGI_API_TOKEN | From INEGI developer portal |

CF_ACCOUNT_ID and CF_API_TOKEN may already be set from the Cloudflare Pages deploy workflow. If so, you only need to add INEGI_API_TOKEN.
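
A workflow step can fail fast when one of these secrets is missing. The sketch below assumes the secrets are exposed to the job as environment variables under the same names; the helper `missing_secrets` is hypothetical and not part of the existing ingest scripts.

```python
import os

# Secret names taken from the table above.
REQUIRED_SECRETS = ("CF_ACCOUNT_ID", "CF_API_TOKEN", "INEGI_API_TOKEN")


def missing_secrets(env=None):
    """Return the required secret names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        raise SystemExit("Missing secrets: " + ", ".join(missing))
```

Passing a plain dict as `env` makes the check easy to exercise locally without touching real credentials.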


urban-transparency-raw/
├── inegi/
│   ├── indicators/{YYYY-MM-DD}/amm_municipalities.csv
│   └── denue/{YYYY-MM-DD}/amm_economic_units.json
├── conagua/
│   └── climate/{YYYY-MM-DD}/monterrey_climate.csv
├── osm/
│   └── boundaries/{YYYY-MM-DD}/amm_boundaries.geojson
├── scraped/{source-name}/{YYYY-MM-DD}/*.csv|*.json|*.pdf
└── _metadata/last_run.json
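
The date-partitioned layout above can be expressed as a small key builder. This is an illustrative sketch only; `raw_key` and its parameter names are hypothetical and do not come from the ingest scripts.

```python
from datetime import date


def raw_key(source: str, dataset: str, run_date: date, filename: str) -> str:
    """Build an object key of the form {source}/{dataset}/{YYYY-MM-DD}/{filename}."""
    return f"{source}/{dataset}/{run_date.isoformat()}/{filename}"


# Example: the INEGI indicators file for a run on 7 April 2026.
example = raw_key("inegi", "indicators", date(2026, 4, 7), "amm_municipalities.csv")
```

The same shape covers scraped sources by passing `"scraped"` as `source` and the source name as `dataset`.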

urban-transparency-processed (public via r2.dev / future custom domain)

urban-transparency-processed/
├── parquet/
│   ├── inegi/{YYYY-MM-DD}/amm_census.parquet
│   └── conagua/{YYYY-MM-DD}/amm_climate.parquet
├── geojson/
│   ├── boundaries/amm_colonias.geojson
│   ├── boundaries/amm_municipalities.geojson
│   └── layers/heat_stress.geojson
├── charts/{article-slug}/*.json
└── _metadata/last_run.json

Public URL pattern (current):

https://pub-892f495399ba478cbe1375809c9e3cdc.r2.dev/geojson/boundaries/amm_colonias.geojson
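
Since the public URL is just the base URL followed by the object key, site code can keep the base in one place and derive URLs from keys. A minimal sketch, with the hypothetical helper `public_url`:

```python
from urllib.parse import quote

# Current r2.dev base URL from this page; swap it once a custom domain is connected.
PUBLIC_BASE_URL = "https://pub-892f495399ba478cbe1375809c9e3cdc.r2.dev"


def public_url(key: str, base_url: str = PUBLIC_BASE_URL) -> str:
    """Join a base URL and an object key, tolerating stray slashes on either side."""
    return f"{base_url.rstrip('/')}/{quote(key.lstrip('/'))}"
```

Keeping the base URL as a single parameter means a later move to a custom domain only changes one value.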

  • Bucket names: lowercase, hyphenated, prefixed urban-transparency-
  • Date partitions: ISO 8601 YYYY-MM-DD
  • Parquet/GeoJSON: descriptive snake_case
  • Chart JSON: kebab-case matching article slug
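
These conventions are simple enough to check mechanically before an upload. The sketch below is one possible set of checks; the regular expressions are an interpretation of the rules above (for example, they also allow digits), not a specification taken from the pipeline.

```python
import re
from datetime import date

# Interpretations of the naming rules above; adjust if the project is stricter.
BUCKET_RE = re.compile(r"^urban-transparency-[a-z0-9]+(?:-[a-z0-9]+)*$")
SNAKE_RE = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)*$")
KEBAB_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def is_iso_date(value: str) -> bool:
    """True only for a valid calendar date written exactly as YYYY-MM-DD."""
    try:
        # The round trip rejects other ISO spellings such as 20260407.
        return date.fromisoformat(value).isoformat() == value
    except ValueError:
        return False
```

For example, `SNAKE_RE.match("amm_colonias")` succeeds, while `SNAKE_RE.match("heat-stress")` returns `None`.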

Example _metadata/last_run.json:

{
  "timestamp": "2026-04-07T14:00:00Z",
  "run_id": "github-run-12345",
  "sources": {
    "inegi": { "rows": 132, "status": "ok" },
    "conagua": { "rows": 365, "status": "ok" }
  }
}
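
A run could assemble this metadata document at the end of ingestion. The following sketch reproduces the shape shown above; `build_last_run` and its `(rows, status)` input format are assumptions for illustration, not the pipeline's real interface.

```python
import json
from datetime import datetime, timezone


def build_last_run(run_id, sources, now=None):
    """Return a dict shaped like _metadata/last_run.json.

    `sources` maps a source name to a (rows, status) tuple.
    """
    now = now or datetime.now(timezone.utc)
    return {
        # UTC timestamp with a trailing "Z", as in the example above.
        "timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "run_id": run_id,
        "sources": {
            name: {"rows": rows, "status": status}
            for name, (rows, status) in sources.items()
        },
    }


payload = json.dumps(
    build_last_run("github-run-12345", {"inegi": (132, "ok"), "conagua": (365, "ok")}),
    indent=2,
)
```

The resulting `payload` string is what would be uploaded to `_metadata/last_run.json` in each bucket.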

# Trigger data pipeline manually
# GitHub → Actions → Data Ingest → Run workflow
# Verify files landed in raw bucket
wrangler r2 object get urban-transparency-raw/_metadata/last_run.json --pipe
# Verify CDN public access (use r2.dev URL until custom domain is set up)
curl -I https://pub-892f495399ba478cbe1375809c9e3cdc.r2.dev/_metadata/last_run.json
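
Beyond checking that the file exists, the contents of last_run.json can be inspected for sources that did not finish cleanly. A small sketch, assuming the metadata shape shown on this page and that any status other than "ok" counts as a failure (that rule is an assumption); `failed_sources` is a hypothetical helper.

```python
import json


def failed_sources(last_run_json: str) -> list[str]:
    """Return the names of sources whose status is not "ok", sorted alphabetically."""
    meta = json.loads(last_run_json)
    return sorted(
        name
        for name, info in meta.get("sources", {}).items()
        if info.get("status") != "ok"
    )
```

It takes the JSON text as a string, so it can be fed the output of the wrangler command above or the body of a GET request to the public URL.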

When civiscopio.com is added as a zone in this Cloudflare account:

  1. R2 → urban-transparency-processed → Settings → Public Access → Connect Domain
  2. Enter data.civiscopio.com
  3. Set the public_domain default in terraform/variables.tf to data.civiscopio.com
  4. Update all references from the r2.dev URL to https://data.civiscopio.com
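
The last step is a plain string substitution, so it can be scripted. The sketch below rewrites the base URL in a piece of text and reports how many occurrences changed; `swap_public_base` is a hypothetical helper, and which files to run it over is left to the reader.

```python
OLD_BASE = "https://pub-892f495399ba478cbe1375809c9e3cdc.r2.dev"
NEW_BASE = "https://data.civiscopio.com"


def swap_public_base(text, old=OLD_BASE, new=NEW_BASE):
    """Replace every occurrence of the old base URL; return (new_text, count)."""
    return text.replace(old, new), text.count(old)
```

A count of zero for a file is a quick signal that it held no references to the r2.dev URL.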