Environments and configuration (frontend + backend)

This document is the canonical reference for how the backend at the repo root and the in-tree frontend under frontend/ are wired together across environments.

Shared-VPS ownership note:

  1. This document owns the frontend/backend wiring contract for HealthArchive.
  2. Shared host topology, ingress ownership, service inventory, and other cross-project VPS facts are canonical in /home/jer/repos/vps/platform-ops.
  3. Use /home/jer/repos/vps/platform-ops/docs/standards/PLAT-009-shared-vps-documentation-boundary.md as the default boundary reference when deciding where a VPS fact belongs.

The root ENVIRONMENTS.md is a pointer to this file to avoid duplication.

It is useful when:

  • Setting or auditing environment variables on the VPS-hosted production stack.
  • Double-checking that frontend hosts, backend hosts, and backend CORS settings line up.

Shared VPS inventory, ingress ownership, canonical public hosts, and cross-project operations state live in /home/jer/repos/vps/platform-ops. Use /home/jer/repos/vps/platform-ops/docs/standards/PLAT-009-shared-vps-documentation-boundary.md as the default rule for what belongs in this repo versus shared ops documentation.

For deeper operational details, see:

  • production-single-vps.md (current production runbook)
  • hosting-and-live-server-to-dos.md (high-level deployment checklist)
  • ../operations/monitoring-and-ci-checklist.md (uptime/monitoring guidance)
  • ../operations/baseline-drift.md (production drift checks: policy vs observed)
  • Frontend docs: https://github.com/jerdaw/healtharchive/blob/main/frontend/docs/implementation-guide.md
  • Frontend verification: https://github.com/jerdaw/healtharchive/blob/main/frontend/docs/deployment/verification.md

1) Environments at a glance

What exists today

  • Single backend API: https://api.healtharchive.ca
  • No separate staging backend (by design)
  • Backend CORS allowlist is intentionally strict:
      • https://healtharchive.ca
      • https://www.healtharchive.ca
      • https://replay.healtharchive.ca (for the optional replay banner and direct replay UX)

To confirm this state on the production VPS, run the baseline drift check in live mode:

cd /opt/healtharchive
./scripts/check_baseline_drift.py --mode live

This validates:

  • env vars are set as expected (including CORS allowlist),
  • HSTS is configured and observed,
  • admin endpoints are protected,
  • CORS headers are actually returned for the allowed origins.
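
The last check in the list above can be pictured as a comparison between the Origin that was sent and the Access-Control-Allow-Origin header that came back. The sketch below is not taken from check_baseline_drift.py; the function name cors_header_matches and the ALLOWED_ORIGINS constant are hypothetical, and it assumes a strict allowlist is expected to echo the exact origin rather than return a wildcard.

```python
from typing import Mapping

# Origins copied from the allowlist in this document.
ALLOWED_ORIGINS = (
    "https://healtharchive.ca",
    "https://www.healtharchive.ca",
    "https://replay.healtharchive.ca",
)


def cors_header_matches(response_headers: Mapping[str, str], origin: str) -> bool:
    """Return True if the response echoes `origin` in Access-Control-Allow-Origin.

    Hypothetical helper: a wildcard ("*") is treated as a failure here,
    on the assumption that a strict allowlist should never return one.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value.strip() for name, value in response_headers.items()}
    return normalised.get("access-control-allow-origin") == origin
```

A caller would send one request per allowed origin with that value in the Origin header, then pass the response headers to this function.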

Matrix

| Environment | Frontend (browser origin) | Backend API base | Notes |
| --- | --- | --- | --- |
| Local dev | http://localhost:3000 | http://127.0.0.1:8001 | Local dev flow. |
| Production | https://healtharchive.ca / https://www.healtharchive.ca | https://api.healtharchive.ca | Current public site on the VPS. |
| Historical preview (retired) | https://healtharchive.vercel.app | https://api.healtharchive.ca | Historical only; not part of the current deployment model. |
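
One way to double-check that the frontend origins in the matrix line up with the backend allowlist is to parse the allowlist string and compare origins. This is a minimal sketch, assuming HEALTHARCHIVE_CORS_ORIGINS is a comma-separated list as in the production example in section 2.2; the helper names are hypothetical and the backend's real parsing may differ.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def parse_cors_origins(raw: str) -> List[str]:
    """Split a comma-separated allowlist, dropping blanks and trailing slashes."""
    return [item.strip().rstrip("/") for item in raw.split(",") if item.strip()]


def origin_of(url: str) -> str:
    """Reduce a URL to its browser origin (scheme://host[:port])."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def missing_origins(frontend_urls: Iterable[str], raw_allowlist: str) -> List[str]:
    """Return the frontend origins that the allowlist does not cover."""
    allowed = set(parse_cors_origins(raw_allowlist))
    return [origin_of(url) for url in frontend_urls if origin_of(url) not in allowed]
```

With the production allowlist, both production frontends pass, while the retired preview origin is reported as missing, which matches its "historical only" status in the matrix.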

Optional future:

| Environment | Frontend (browser origin) | Backend API base | Notes |
| --- | --- | --- | --- |
| Staging API (optional) | Preview URLs or a dedicated staging frontend | https://api-staging.healtharchive.ca | Only if you decide you want a separate staging backend later. |

2) Backend configuration (healtharchive)

All backend env vars are read by:

  • src/ha_backend/config.py
  • src/ha_backend/api/deps.py

Search ranking selection is controlled by HA_SEARCH_RANKING_VERSION and can be overridden per request with ranking=v1|v2 on /api/search.
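
The precedence between the env var and the per-request parameter can be sketched as follows. This is an illustration only, not the code in config.py or deps.py: resolve_ranking is a hypothetical name, and both the fallback used when the env var is unset and the rejection of unknown values are assumptions.

```python
from typing import Mapping, Optional

VALID_RANKINGS = ("v1", "v2")


def resolve_ranking(
    query_value: Optional[str],
    env: Mapping[str, str],
    fallback: str = "v1",  # assumption: the real default is not stated in this document
) -> str:
    """Pick the ranking version: request parameter first, then env, then fallback."""
    if query_value is not None:
        # Assumption: an unknown ranking value is rejected rather than ignored.
        if query_value not in VALID_RANKINGS:
            raise ValueError(f"unsupported ranking: {query_value!r}")
        return query_value
    configured = env.get("HA_SEARCH_RANKING_VERSION", "").strip()
    return configured if configured in VALID_RANKINGS else fallback
```

For example, a request to /api/search with ranking=v1 would use v1 even when HA_SEARCH_RANKING_VERSION=v2 is set on the host.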

2.1 Local development (typical)

Example shell setup (or put the same values in a git-ignored .env file based on .env.example):

export HEALTHARCHIVE_ENV=development
export HEALTHARCHIVE_DATABASE_URL=sqlite:///$(pwd)/.dev-healtharchive.db
export HEALTHARCHIVE_ARCHIVE_ROOT=$(pwd)/.dev-archive-root
export HEALTHARCHIVE_ZIMIT_DOCKER_IMAGE=ghcr.io/openzim/zimit  # optional override (pin by tag or digest)
export HEALTHARCHIVE_PLAYWRIGHT_DOCKER_IMAGE=mcr.microsoft.com/playwright:v1.50.1-jammy  # optional fallback-browser override
export HEALTHARCHIVE_ADMIN_TOKEN=localdev-admin
export HEALTHARCHIVE_LOG_LEVEL=DEBUG
export HA_SEARCH_RANKING_VERSION=v2
export HA_PAGES_FASTPATH=1
export HEALTHARCHIVE_REPLAY_BASE_URL=http://127.0.0.1:8090
export HEALTHARCHIVE_REPLAY_PREVIEW_DIR=$(pwd)/.dev-replay-previews
export HEALTHARCHIVE_EXPORTS_ENABLED=1
export HEALTHARCHIVE_EXPORTS_DEFAULT_LIMIT=1000
export HEALTHARCHIVE_EXPORTS_MAX_LIMIT=10000

2.2 Production (current)

On the production backend host (systemd env file / Docker env / PaaS env):

export HEALTHARCHIVE_ENV=production
export HEALTHARCHIVE_DATABASE_URL=postgresql+psycopg://healtharchive:<DB_PASSWORD>@127.0.0.1:5432/healtharchive
export HEALTHARCHIVE_ARCHIVE_ROOT=/srv/healtharchive/jobs
export HEALTHARCHIVE_ZIMIT_DOCKER_IMAGE=ghcr.io/openzim/zimit@sha256:<PINNED_DIGEST>
export HEALTHARCHIVE_PLAYWRIGHT_DOCKER_IMAGE=mcr.microsoft.com/playwright:v1.50.1-jammy
export HEALTHARCHIVE_ADMIN_TOKEN=<LONG_RANDOM_SECRET>
export HEALTHARCHIVE_CORS_ORIGINS=https://healtharchive.ca,https://www.healtharchive.ca,https://replay.healtharchive.ca
export HEALTHARCHIVE_LOG_LEVEL=INFO
export HA_SEARCH_RANKING_VERSION=v2
export HA_PAGES_FASTPATH=1
export HEALTHARCHIVE_USAGE_METRICS_ENABLED=1
export HEALTHARCHIVE_USAGE_METRICS_WINDOW_DAYS=30
export HEALTHARCHIVE_CHANGE_TRACKING_ENABLED=1
export HEALTHARCHIVE_EXPORTS_ENABLED=1
export HEALTHARCHIVE_EXPORTS_DEFAULT_LIMIT=1000
export HEALTHARCHIVE_EXPORTS_MAX_LIMIT=10000
export HEALTHARCHIVE_PUBLIC_SITE_URL=https://healtharchive.ca
export HEALTHARCHIVE_REPLAY_BASE_URL=https://replay.healtharchive.ca
export HEALTHARCHIVE_REPLAY_PREVIEW_DIR=/srv/healtharchive/replay/previews

Notes:

  • HEALTHARCHIVE_ADMIN_TOKEN should be a long random secret stored in a secret manager (e.g., Bitwarden + server env), never committed.
  • HEALTHARCHIVE_ZIMIT_DOCKER_IMAGE pins the crawler container image used by archive_tool. Use a digest (...@sha256:...) in production so that upstream changes to the latest tag cannot break crawls.
  • HEALTHARCHIVE_PLAYWRIGHT_DOCKER_IMAGE pins the Chromium fallback image used by playwright_warc. Keep it pinned to an explicit Playwright tag so browser behavior is reproducible across annual reruns.
  • HEALTHARCHIVE_PLAYWRIGHT_NAVIGATION_TIMEOUT_MS (default 150000) sets the per-page browser navigation timeout for playwright_warc.
  • HEALTHARCHIVE_PLAYWRIGHT_SETTLE_MS (default 5000) adds a fixed post-load settle delay before capture/link extraction.
  • HEALTHARCHIVE_PLAYWRIGHT_VIEWPORT_WIDTH / HEALTHARCHIVE_PLAYWRIGHT_VIEWPORT_HEIGHT (defaults 1440x900) define the deterministic browser viewport.
  • HEALTHARCHIVE_PLAYWRIGHT_LOCALE (default en-CA) and HEALTHARCHIVE_PLAYWRIGHT_TIMEZONE (default America/Toronto) keep the fallback browser runtime stable across runs.
  • HEALTHARCHIVE_PLAYWRIGHT_NODE_CACHE_DIR (default /tmp/healtharchive-playwright-node) controls the host-side cache/work directory mounted into the Playwright container for npm dependencies.
  • HEALTHARCHIVE_REPLAY_BASE_URL enables browseUrl fields in /api/search and /api/snapshot/{id} so the frontend can embed the replay service.
  • HEALTHARCHIVE_USAGE_METRICS_ENABLED controls whether aggregated daily usage counts are recorded; disable it for a metrics-free deployment.
  • HEALTHARCHIVE_CHANGE_TRACKING_ENABLED controls whether change tracking endpoints/diff feeds are active (disable if you are not running the pipeline).
  • Compare-live controls (public snapshot vs live diffs):
      • HEALTHARCHIVE_COMPARE_LIVE_ENABLED (default 1).
      • HEALTHARCHIVE_COMPARE_LIVE_TIMEOUT_SECONDS (default 8).
      • HEALTHARCHIVE_COMPARE_LIVE_MAX_REDIRECTS (default 4).
      • HEALTHARCHIVE_COMPARE_LIVE_MAX_BYTES (default 2000000).
      • HEALTHARCHIVE_COMPARE_LIVE_MAX_ARCHIVE_BYTES (default 2000000).
      • HEALTHARCHIVE_COMPARE_LIVE_MAX_RENDER_LINES (default 5000).
      • HEALTHARCHIVE_COMPARE_LIVE_MAX_CONCURRENCY (default 4).
      • HEALTHARCHIVE_COMPARE_LIVE_USER_AGENT (default identifies HealthArchive).
  • Indexing integrity (optional, Phase 4 safety rail):
      • HEALTHARCHIVE_INDEX_WARC_VERIFY_LEVEL (default 0; allowed: 0|1|2).
      • HEALTHARCHIVE_INDEX_WARC_VERIFY_MAX_DECOMPRESSED_BYTES (default unset; bounds Level 1 gzip checks per file).
      • HEALTHARCHIVE_INDEX_WARC_VERIFY_MAX_RECORDS (default unset; bounds Level 2 WARC iteration per file).
  • HEALTHARCHIVE_PUBLIC_SITE_URL sets the public base URL used in RSS links.
  • In production (and staging), if the admin token is missing, admin/metrics endpoints fail closed (HTTP 500) instead of being left open.
  • HEALTHARCHIVE_CORS_ORIGINS should be kept as narrow as possible; it controls which browser origins can call public API routes.
  • If you use the optional replay banner / direct replay UX, the replay origin must also be allowed by CORS so the banner can call /api/replay/resolve.
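
The fail-closed rule for the admin token can be illustrated with a small decision function. This is a sketch under stated assumptions, not the logic in src/ha_backend/api/deps.py: admin_status_code is a hypothetical name, and the behaviour outside production and staging (open when no token is set) and the 403 for a wrong token are assumptions. Only the HTTP 500 on a missing token in production and staging comes from the note above.

```python
import hmac
from typing import Mapping, Optional

FAIL_CLOSED_ENVS = {"production", "staging"}


def admin_status_code(env: Mapping[str, str], presented_token: Optional[str]) -> int:
    """Return the HTTP status an admin/metrics request would receive."""
    environment = env.get("HEALTHARCHIVE_ENV", "development")
    expected = env.get("HEALTHARCHIVE_ADMIN_TOKEN", "")
    if not expected:
        # Missing token: fail closed in production/staging (from the note above).
        # Assumption: other environments leave the endpoints open for convenience.
        return 500 if environment in FAIL_CLOSED_ENVS else 200
    if presented_token is not None and hmac.compare_digest(
        presented_token.encode(), expected.encode()
    ):
        return 200
    # Assumption: a missing or wrong token is rejected with 403.
    return 403
```

Comparing tokens with hmac.compare_digest avoids leaking timing information about how much of the token matched.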

2.3 Optional: staging backend (future)

If you later add a separate staging backend, it should generally mirror production except for DB/archive root and CORS origins:

export HEALTHARCHIVE_ENV=staging
export HEALTHARCHIVE_DATABASE_URL=postgresql+psycopg://healtharchive:<DB_PASSWORD>@127.0.0.1:5432/healtharchive_staging
export HEALTHARCHIVE_ARCHIVE_ROOT=/srv/healtharchive/jobs-staging
export HEALTHARCHIVE_ADMIN_TOKEN=<LONG_RANDOM_SECRET>
export HEALTHARCHIVE_CORS_ORIGINS=https://healtharchive-staging.example.com
export HEALTHARCHIVE_LOG_LEVEL=INFO
export HEALTHARCHIVE_USAGE_METRICS_ENABLED=1
export HEALTHARCHIVE_USAGE_METRICS_WINDOW_DAYS=30
export HEALTHARCHIVE_CHANGE_TRACKING_ENABLED=1
export HEALTHARCHIVE_EXPORTS_ENABLED=1
export HEALTHARCHIVE_EXPORTS_DEFAULT_LIMIT=1000
export HEALTHARCHIVE_EXPORTS_MAX_LIMIT=10000
export HEALTHARCHIVE_PUBLIC_SITE_URL=https://healtharchive.ca
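
If a staging backend is ever added, the "mirror production" rule could be checked by diffing the two environments. The sketch below is hypothetical: unexpected_differences and EXPECTED_TO_DIFFER are illustrative names, and including HEALTHARCHIVE_ENV and HEALTHARCHIVE_ADMIN_TOKEN in the expected set is an assumption beyond the DB, archive root, and CORS origins named above.

```python
from typing import Collection, List, Mapping

# Variables that are allowed to differ between production and staging.
EXPECTED_TO_DIFFER = frozenset(
    {
        "HEALTHARCHIVE_ENV",  # assumption: differs by definition
        "HEALTHARCHIVE_ADMIN_TOKEN",  # assumption: separate secret per environment
        "HEALTHARCHIVE_DATABASE_URL",
        "HEALTHARCHIVE_ARCHIVE_ROOT",
        "HEALTHARCHIVE_CORS_ORIGINS",
    }
)


def unexpected_differences(
    production: Mapping[str, str],
    staging: Mapping[str, str],
    expected: Collection[str] = EXPECTED_TO_DIFFER,
) -> List[str]:
    """List variables that differ (or exist on one side only) outside the expected set."""
    keys = set(production) | set(staging)
    return sorted(
        key
        for key in keys
        if key not in expected and production.get(key) != staging.get(key)
    )
```

An empty result means staging mirrors production apart from the variables that are supposed to differ.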

3) Frontend configuration (frontend/)

The frontend reads env vars at build time, so a changed value takes effect only after the frontend is rebuilt.

3.1 Local development

In frontend/.env.local (git-ignored):

NEXT_PUBLIC_API_BASE_URL=http://127.0.0.1:8001
NEXT_PUBLIC_SHOW_API_HEALTH_BANNER=true
NEXT_PUBLIC_LOG_API_HEALTH_FAILURE=true
NEXT_PUBLIC_SHOW_API_BASE_HINT=true

3.2 Production VPS frontend env

In the frontend runtime env file on the VPS:

NEXT_PUBLIC_API_BASE_URL=https://api.healtharchive.ca
NEXT_PUBLIC_SHOW_API_HEALTH_BANNER=false
NEXT_PUBLIC_LOG_API_HEALTH_FAILURE=false
NEXT_PUBLIC_SHOW_API_BASE_HINT=false

3.3 Optional historical preview env (legacy only)

Only use this if you intentionally recreate a preview path later. It is not part of the current canonical deploy model:

NEXT_PUBLIC_API_BASE_URL=https://api.healtharchive.ca
NEXT_PUBLIC_SHOW_API_HEALTH_BANNER=true
NEXT_PUBLIC_LOG_API_HEALTH_FAILURE=true
NEXT_PUBLIC_SHOW_API_BASE_HINT=true


4) Security notes (secrets + CORS)

  • Never commit secrets:
      • No real HEALTHARCHIVE_ADMIN_TOKEN, DB passwords, Healthchecks URLs, etc.
      • Use placeholders in docs and store real values in Bitwarden + server env settings.
  • CORS is a security control:
      • Tight allowlists reduce accidental exposure of browser-accessible APIs.
      • If you loosen CORS to include branch previews, do it deliberately and document the tradeoff.
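
The "use placeholders in docs" rule lends itself to a simple automated check. The following is a minimal sketch, not an existing script in this repo: the function name, the regular expressions, and the allowance for the local-dev value localdev-admin are all assumptions, and it covers only HEALTHARCHIVE_ADMIN_TOKEN assignments.

```python
import re
from typing import Collection, List

SECRET_KEYS = ("HEALTHARCHIVE_ADMIN_TOKEN",)
# Matches "KEY=value" with an optional leading "export".
ASSIGNMENT = re.compile(r"^(?:export\s+)?(?P<key>[A-Z0-9_]+)=(?P<value>.*)$")
# Matches placeholder values such as <LONG_RANDOM_SECRET>.
PLACEHOLDER = re.compile(r"^<[A-Z0-9_]+>$")


def unredacted_secret_lines(
    text: str,
    allowed_literals: Collection[str] = frozenset({"localdev-admin"}),
) -> List[int]:
    """Return 1-based line numbers where a secret is set to a non-placeholder value."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = ASSIGNMENT.match(line.strip())
        if not match or match.group("key") not in SECRET_KEYS:
            continue
        value = match.group("value").strip()
        if PLACEHOLDER.match(value) or value in allowed_literals:
            continue
        flagged.append(number)
    return flagged
```

Run against a docs file, it would flag any line that assigns the admin token a literal value other than a placeholder or the documented local-dev token.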