Environments and configuration (frontend + backend)
This document is the canonical reference for how the backend at the repo root and the in-tree frontend under frontend/ are wired together across environments.
Shared-VPS ownership note:
- This document owns the frontend/backend wiring contract for HealthArchive.
- Shared host topology, ingress ownership, service inventory, and other cross-project VPS facts are canonical in /home/jer/repos/vps/platform-ops.
- Use /home/jer/repos/vps/platform-ops/docs/standards/PLAT-009-shared-vps-documentation-boundary.md as the default boundary reference when deciding where a VPS fact belongs.
The root ENVIRONMENTS.md is a pointer to this file to avoid duplication.
This document is useful when:
- Setting or auditing environment variables on the VPS-hosted production stack.
- Double-checking that frontend hosts, backend hosts, and backend CORS settings line up.
Shared VPS inventory, ingress ownership, canonical public hosts, and cross-project operations state live in /home/jer/repos/vps/platform-ops. Use /home/jer/repos/vps/platform-ops/docs/standards/PLAT-009-shared-vps-documentation-boundary.md as the default rule for what belongs in this repo versus shared ops documentation.
For deeper operational details, see:
- production-single-vps.md (current production runbook)
- hosting-and-live-server-to-dos.md (high-level deployment checklist)
- ../operations/monitoring-and-ci-checklist.md (uptime/monitoring guidance)
- ../operations/baseline-drift.md (production drift checks: policy vs observed)
- Frontend docs: https://github.com/jerdaw/healtharchive/blob/main/frontend/docs/implementation-guide.md
- Frontend verification: https://github.com/jerdaw/healtharchive/blob/main/frontend/docs/deployment/verification.md
1) Environments at a glance
What exists today
- Single backend API: https://api.healtharchive.ca
- No separate staging backend (by design)
- Backend CORS allowlist is intentionally strict:
  - https://healtharchive.ca
  - https://www.healtharchive.ca
  - https://replay.healtharchive.ca (for the optional replay banner and direct replay UX)
1.1 Validate production wiring (recommended)
On the production VPS, run the baseline drift check in live mode:
This validates:
- env vars are set as expected (including CORS allowlist),
- HSTS is configured and observed,
- admin endpoints are protected,
- CORS headers are actually returned for the allowed origins.
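The last check in the list above can be illustrated with a small sketch. This is not the baseline drift script itself; the function name `cors_allows` is hypothetical, and the sketch assumes the backend echoes the exact requesting origin in `Access-Control-Allow-Origin` rather than returning a wildcard.

```python
from typing import Mapping


def cors_allows(response_headers: Mapping[str, str], origin: str) -> bool:
    """Return True if the response's CORS header admits exactly this origin.

    Hypothetical helper for illustration only; the real drift check may
    inspect more headers than this.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    headers = {name.lower(): value.strip() for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    # A strict allowlist echoes the requesting origin back. A missing header
    # or a "*" wildcard both count as a failed check here.
    return allowed == origin
```

In use, you would send a request with an `Origin` header for each allowed origin and pass the response headers to this function, then repeat with an origin that is not on the allowlist and expect `False`.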
Matrix
| Environment | Frontend (browser origin) | Backend API base | Notes |
|---|---|---|---|
| Local dev | http://localhost:3000 | http://127.0.0.1:8001 | Local dev flow. |
| Production | https://healtharchive.ca / https://www.healtharchive.ca | https://api.healtharchive.ca | Current public site on the VPS. |
| Historical preview (retired) | https://healtharchive.vercel.app | https://api.healtharchive.ca | Historical only; not part of the current deployment model. |
Optional future:
| Environment | Frontend (browser origin) | Backend API base | Notes |
|---|---|---|---|
| Staging API (optional) | Preview URLs or a dedicated staging frontend | https://api-staging.healtharchive.ca | Only if you decide you want a separate staging backend later. |
2) Backend configuration (healtharchive)
All backend env vars are read by:
- src/ha_backend/config.py
- src/ha_backend/api/deps.py

Search ranking selection is controlled by HA_SEARCH_RANKING_VERSION (and can be overridden per-request with ranking=v1|v2 on /api/search).
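The precedence between the env var and the per-request override might look like the following minimal sketch. The function name `resolve_ranking_version` and the `fallback="v1"` default are assumptions for illustration; the actual logic and default live in the backend source.

```python
import os
from typing import Mapping, Optional

# Values taken from the ranking=v1|v2 query parameter described above.
ALLOWED_RANKINGS = ("v1", "v2")


def resolve_ranking_version(
    query_value: Optional[str],
    env: Optional[Mapping[str, str]] = None,
    fallback: str = "v1",  # assumed default, not confirmed by this document
) -> str:
    """Pick the ranking version: per-request override first, then the env var."""
    env = os.environ if env is None else env
    if query_value is not None:
        # An explicit ?ranking=... wins, but only if it is a known version.
        if query_value not in ALLOWED_RANKINGS:
            raise ValueError(f"unsupported ranking version: {query_value!r}")
        return query_value
    configured = env.get("HA_SEARCH_RANKING_VERSION", "").strip()
    # Fall back when the env var is unset or holds an unknown value.
    return configured if configured in ALLOWED_RANKINGS else fallback
```

Passing `env` explicitly keeps the function easy to test without touching the process environment.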
2.1 Local development (typical)
Example shell setup (or via .env.example → .env, git-ignored):
export HEALTHARCHIVE_ENV=development
export HEALTHARCHIVE_DATABASE_URL=sqlite:///$(pwd)/.dev-healtharchive.db
export HEALTHARCHIVE_ARCHIVE_ROOT=$(pwd)/.dev-archive-root
export HEALTHARCHIVE_ZIMIT_DOCKER_IMAGE=ghcr.io/openzim/zimit # optional override (pin by tag or digest)
export HEALTHARCHIVE_PLAYWRIGHT_DOCKER_IMAGE=mcr.microsoft.com/playwright:v1.50.1-jammy # optional fallback-browser override
export HEALTHARCHIVE_ADMIN_TOKEN=localdev-admin
export HEALTHARCHIVE_LOG_LEVEL=DEBUG
export HA_SEARCH_RANKING_VERSION=v2
export HA_PAGES_FASTPATH=1
export HEALTHARCHIVE_REPLAY_BASE_URL=http://127.0.0.1:8090
export HEALTHARCHIVE_REPLAY_PREVIEW_DIR=$(pwd)/.dev-replay-previews
export HEALTHARCHIVE_EXPORTS_ENABLED=1
export HEALTHARCHIVE_EXPORTS_DEFAULT_LIMIT=1000
export HEALTHARCHIVE_EXPORTS_MAX_LIMIT=10000
2.2 Production (current)
On the production backend host (systemd env file / Docker env / PaaS env):
export HEALTHARCHIVE_ENV=production
export HEALTHARCHIVE_DATABASE_URL=postgresql+psycopg://healtharchive:<DB_PASSWORD>@127.0.0.1:5432/healtharchive
export HEALTHARCHIVE_ARCHIVE_ROOT=/srv/healtharchive/jobs
export HEALTHARCHIVE_ZIMIT_DOCKER_IMAGE=ghcr.io/openzim/zimit@sha256:<PINNED_DIGEST>
export HEALTHARCHIVE_PLAYWRIGHT_DOCKER_IMAGE=mcr.microsoft.com/playwright:v1.50.1-jammy
export HEALTHARCHIVE_ADMIN_TOKEN=<LONG_RANDOM_SECRET>
export HEALTHARCHIVE_CORS_ORIGINS=https://healtharchive.ca,https://www.healtharchive.ca,https://replay.healtharchive.ca
export HEALTHARCHIVE_LOG_LEVEL=INFO
export HA_SEARCH_RANKING_VERSION=v2
export HA_PAGES_FASTPATH=1
export HEALTHARCHIVE_USAGE_METRICS_ENABLED=1
export HEALTHARCHIVE_USAGE_METRICS_WINDOW_DAYS=30
export HEALTHARCHIVE_CHANGE_TRACKING_ENABLED=1
export HEALTHARCHIVE_EXPORTS_ENABLED=1
export HEALTHARCHIVE_EXPORTS_DEFAULT_LIMIT=1000
export HEALTHARCHIVE_EXPORTS_MAX_LIMIT=10000
export HEALTHARCHIVE_PUBLIC_SITE_URL=https://healtharchive.ca
export HEALTHARCHIVE_REPLAY_BASE_URL=https://replay.healtharchive.ca
export HEALTHARCHIVE_REPLAY_PREVIEW_DIR=/srv/healtharchive/replay/previews
Notes:
- HEALTHARCHIVE_ADMIN_TOKEN should be a long random secret stored in a secret manager (e.g., Bitwarden + server env), never committed.
- HEALTHARCHIVE_ZIMIT_DOCKER_IMAGE pins the crawler container image used by archive_tool. Use a digest (...@sha256:...) in production to avoid upstream latest changes breaking crawls.
- HEALTHARCHIVE_PLAYWRIGHT_DOCKER_IMAGE pins the Chromium fallback image used by playwright_warc. Keep it pinned to an explicit Playwright tag so browser behavior is reproducible across annual reruns.
- HEALTHARCHIVE_PLAYWRIGHT_NAVIGATION_TIMEOUT_MS (default 150000) sets the per-page browser navigation timeout for playwright_warc.
- HEALTHARCHIVE_PLAYWRIGHT_SETTLE_MS (default 5000) adds a fixed post-load settle delay before capture/link extraction.
- HEALTHARCHIVE_PLAYWRIGHT_VIEWPORT_WIDTH / HEALTHARCHIVE_PLAYWRIGHT_VIEWPORT_HEIGHT (defaults 1440x900) define the deterministic browser viewport.
- HEALTHARCHIVE_PLAYWRIGHT_LOCALE (default en-CA) and HEALTHARCHIVE_PLAYWRIGHT_TIMEZONE (default America/Toronto) keep the fallback browser runtime stable across runs.
- HEALTHARCHIVE_PLAYWRIGHT_NODE_CACHE_DIR (default /tmp/healtharchive-playwright-node) controls the host-side cache/work directory mounted into the Playwright container for npm dependencies.
- HEALTHARCHIVE_REPLAY_BASE_URL enables browseUrl fields in /api/search and /api/snapshot/{id} so the frontend can embed the replay service.
- HEALTHARCHIVE_USAGE_METRICS_ENABLED controls whether aggregated daily usage counts are recorded; disable it for a metrics-free deployment.
- HEALTHARCHIVE_CHANGE_TRACKING_ENABLED controls whether change tracking endpoints/diff feeds are active (disable if you are not running the pipeline).
- Compare-live controls (public snapshot vs live diffs):
  - HEALTHARCHIVE_COMPARE_LIVE_ENABLED (default 1).
  - HEALTHARCHIVE_COMPARE_LIVE_TIMEOUT_SECONDS (default 8).
  - HEALTHARCHIVE_COMPARE_LIVE_MAX_REDIRECTS (default 4).
  - HEALTHARCHIVE_COMPARE_LIVE_MAX_BYTES (default 2000000).
  - HEALTHARCHIVE_COMPARE_LIVE_MAX_ARCHIVE_BYTES (default 2000000).
  - HEALTHARCHIVE_COMPARE_LIVE_MAX_RENDER_LINES (default 5000).
  - HEALTHARCHIVE_COMPARE_LIVE_MAX_CONCURRENCY (default 4).
  - HEALTHARCHIVE_COMPARE_LIVE_USER_AGENT (default identifies HealthArchive).
- Indexing integrity (optional, Phase 4 safety rail):
  - HEALTHARCHIVE_INDEX_WARC_VERIFY_LEVEL (default 0; allowed: 0|1|2).
  - HEALTHARCHIVE_INDEX_WARC_VERIFY_MAX_DECOMPRESSED_BYTES (default unset; bounds Level 1 gzip checks per file).
  - HEALTHARCHIVE_INDEX_WARC_VERIFY_MAX_RECORDS (default unset; bounds Level 2 WARC iteration per file).
- HEALTHARCHIVE_PUBLIC_SITE_URL sets the public base URL used in RSS links.
- In production (and staging), if the admin token is missing, admin/metrics endpoints fail closed (HTTP 500) instead of being left open.
- HEALTHARCHIVE_CORS_ORIGINS should be kept as narrow as possible; it controls which browser origins can call public API routes.
  - If you use the optional replay banner / direct replay UX, the replay origin must also be allowed by CORS so the banner can call /api/replay/resolve.
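The fail-closed behaviour for a missing admin token can be sketched as follows. Only the HTTP 500 in production and staging comes from the notes above. The function name `admin_access_status`, the 403 for a wrong token, the open behaviour in development, and the `development` default for an unset HEALTHARCHIVE_ENV are all assumptions made for illustration.

```python
import hmac
from typing import Mapping, Optional

# Environments where a missing token must never leave admin routes open.
FAIL_CLOSED_ENVS = ("production", "staging")


def admin_access_status(env: Mapping[str, str], presented_token: Optional[str]) -> int:
    """Return the HTTP status a hypothetical admin guard would answer with."""
    environment = env.get("HEALTHARCHIVE_ENV", "development")  # assumed default
    expected = env.get("HEALTHARCHIVE_ADMIN_TOKEN", "")
    if not expected:
        # Fail closed (500) in production/staging; treating development as
        # open is an assumption of this sketch.
        return 500 if environment in FAIL_CLOSED_ENVS else 200
    if presented_token is None:
        return 403  # assumed status for a missing credential
    # Constant-time comparison avoids leaking token contents through timing.
    if not hmac.compare_digest(presented_token.encode(), expected.encode()):
        return 403  # assumed status for a wrong credential
    return 200
```

The point of the design is that a misconfigured production host produces a loud server error rather than silently exposing admin and metrics endpoints.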
2.3 Optional: staging backend (future)
If you later add a separate staging backend, it should generally mirror production except for DB/archive root and CORS origins:
export HEALTHARCHIVE_ENV=staging
export HEALTHARCHIVE_DATABASE_URL=postgresql+psycopg://healtharchive:<DB_PASSWORD>@127.0.0.1:5432/healtharchive_staging
export HEALTHARCHIVE_ARCHIVE_ROOT=/srv/healtharchive/jobs-staging
export HEALTHARCHIVE_ADMIN_TOKEN=<LONG_RANDOM_SECRET>
export HEALTHARCHIVE_CORS_ORIGINS=https://healtharchive-staging.example.com
export HEALTHARCHIVE_LOG_LEVEL=INFO
export HEALTHARCHIVE_USAGE_METRICS_ENABLED=1
export HEALTHARCHIVE_USAGE_METRICS_WINDOW_DAYS=30
export HEALTHARCHIVE_CHANGE_TRACKING_ENABLED=1
export HEALTHARCHIVE_EXPORTS_ENABLED=1
export HEALTHARCHIVE_EXPORTS_DEFAULT_LIMIT=1000
export HEALTHARCHIVE_EXPORTS_MAX_LIMIT=10000
export HEALTHARCHIVE_PUBLIC_SITE_URL=https://healtharchive.ca
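If a staging backend is ever added, the "mirror production except for a few settings" rule could be audited with a sketch like this one. The helper names and the `EXPECTED_TO_DIFFER` set are hypothetical; the set reflects the DB, archive root, and CORS exceptions named above, plus HEALTHARCHIVE_ENV and the admin token, which are assumed to differ as well.

```python
from typing import Dict, Mapping, Optional, Tuple

# Keys that are allowed to differ between production and staging.
# The last two entries are assumptions, not stated in this document.
EXPECTED_TO_DIFFER = {
    "HEALTHARCHIVE_DATABASE_URL",
    "HEALTHARCHIVE_ARCHIVE_ROOT",
    "HEALTHARCHIVE_CORS_ORIGINS",
    "HEALTHARCHIVE_ENV",
    "HEALTHARCHIVE_ADMIN_TOKEN",
}


def env_differences(
    production: Mapping[str, str], staging: Mapping[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each differing key to its (production, staging) values."""
    keys = sorted(set(production) | set(staging))
    return {
        key: (production.get(key), staging.get(key))
        for key in keys
        if production.get(key) != staging.get(key)
    }


def unexpected_drift(production: Mapping[str, str], staging: Mapping[str, str]) -> list:
    """Return the differing keys that are not on the expected list."""
    return [key for key in env_differences(production, staging) if key not in EXPECTED_TO_DIFFER]
```

An empty result from `unexpected_drift` means the two environments differ only where the document says they should.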
3) Frontend configuration (frontend/)
The frontend reads env vars at build time, so a changed value only takes effect after a rebuild.
3.1 Local development
In frontend/.env.local (git-ignored):
NEXT_PUBLIC_API_BASE_URL=http://127.0.0.1:8001
NEXT_PUBLIC_SHOW_API_HEALTH_BANNER=true
NEXT_PUBLIC_LOG_API_HEALTH_FAILURE=true
NEXT_PUBLIC_SHOW_API_BASE_HINT=true
3.2 Production VPS frontend env
In the frontend runtime env file on the VPS:
NEXT_PUBLIC_API_BASE_URL=https://api.healtharchive.ca
NEXT_PUBLIC_SHOW_API_HEALTH_BANNER=false
NEXT_PUBLIC_LOG_API_HEALTH_FAILURE=false
NEXT_PUBLIC_SHOW_API_BASE_HINT=false
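When auditing the production frontend env, a quick check such as the following sketch can confirm that the values match the block above. The helper names are hypothetical, and the parser is deliberately simplified: it handles plain KEY=VALUE lines and comments, not quoting or variable expansion.

```python
from typing import Dict, List

# Diagnostic flags that the production block above sets to false.
DEBUG_FLAGS = (
    "NEXT_PUBLIC_SHOW_API_HEALTH_BANNER",
    "NEXT_PUBLIC_LOG_API_HEALTH_FAILURE",
    "NEXT_PUBLIC_SHOW_API_BASE_HINT",
)


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def production_env_problems(values: Dict[str, str]) -> List[str]:
    """List the ways an env mapping deviates from the production block."""
    problems: List[str] = []
    if not values.get("NEXT_PUBLIC_API_BASE_URL", "").startswith("https://"):
        problems.append("NEXT_PUBLIC_API_BASE_URL should be an https URL")
    for flag in DEBUG_FLAGS:
        # An unset flag is treated as false here, which is an assumption.
        if values.get(flag, "false") != "false":
            problems.append(f"{flag} should be false in production")
    return problems
```

Running `production_env_problems(parse_env_text(text))` on the contents of the env file returns an empty list when the file matches the production settings.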
3.3 Optional historical preview env (legacy only)
Only use this if you intentionally recreate a preview path later. It is not part of the current canonical deploy model:
NEXT_PUBLIC_API_BASE_URL=https://api.healtharchive.ca
NEXT_PUBLIC_SHOW_API_HEALTH_BANNER=true
NEXT_PUBLIC_LOG_API_HEALTH_FAILURE=true
NEXT_PUBLIC_SHOW_API_BASE_HINT=true
4) Security notes (secrets + CORS)
- Never commit secrets:
  - No real HEALTHARCHIVE_ADMIN_TOKEN, DB passwords, Healthchecks URLs, etc.
  - Use placeholders in docs and store real values in Bitwarden + server env settings.
- CORS is a security control:
  - Tight allowlists reduce accidental exposure of browser-accessible APIs.
  - If you loosen CORS to include branch previews, do it deliberately and document the tradeoff.
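One way to keep the allowlist tight is to validate it when it is parsed. The sketch below assumes HEALTHARCHIVE_CORS_ORIGINS is a comma-separated list, as in the production example in section 2.2. The function name `parse_cors_origins` is hypothetical, and rejecting wildcards and origins with paths is a policy choice of this sketch rather than documented backend behaviour.

```python
from typing import List
from urllib.parse import urlsplit


def parse_cors_origins(raw: str) -> List[str]:
    """Split a comma-separated allowlist and reject anything that is not a bare origin."""
    origins: List[str] = []
    for item in raw.split(","):
        origin = item.strip().rstrip("/")
        if not origin:
            continue  # tolerate stray commas and empty entries
        parts = urlsplit(origin)
        # An origin is scheme + host (+ optional port). "*" has no scheme,
        # so the wildcard is rejected by the same check.
        if parts.scheme not in ("http", "https") or not parts.netloc or parts.path:
            raise ValueError(f"not a valid CORS origin: {origin!r}")
        if origin not in origins:
            origins.append(origin)
    return origins
```

Failing loudly on a wildcard means that loosening CORS has to be an explicit code or policy change, not a one-character edit to an env file.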