Hosting & Live Server TODOs (Backend + Frontend)
Status: Historical checklist. This file documents the earlier Vercel-era wiring work. The implemented production model is the direct VPS + host Caddy path in
production-single-vps.md. Any references below to a separate frontend repo or Vercel preview wiring are retained only as historical context and should not be treated as the current deployment model.
This document tracks the remaining infrastructure / hosting steps needed to run HealthArchive.ca with a fully wired frontend + backend in production (and optionally add a staging environment later).
Nothing in here requires code changes – it is all environment configuration, DNS, and manual verification.
Note: The current production deployment is a single Hetzner VPS using Tailscale-only SSH, Caddy TLS, and nightly DB backups with an NAS pull. See
production-single-vps.md for the exact runbook that was implemented.
0. Quick index of “remote‑only” tasks
Use this as a map of everything that must be done outside your local dev environment (i.e., on live servers or in the GitHub UI). Historical Vercel notes are preserved below only so older rollout notes remain interpretable.
- On the backend server (production) – see §2 and §4:
  - Provision a Postgres DB and set HEALTHARCHIVE_DATABASE_URL.
  - Choose and provision storage for HEALTHARCHIVE_ARCHIVE_ROOT.
  - Configure HEALTHARCHIVE_ADMIN_TOKEN and HEALTHARCHIVE_CORS_ORIGINS.
  - Reload/restart the backend service with the new env vars.
  - Verify /api/health, /api/sources, /api/search, and CORS headers over HTTPS.
  - Ensure HTTPS is enforced (HTTP→HTTPS redirect) and HSTS is enabled for api.healtharchive.ca (and api-staging.healtharchive.ca only if you later create a staging API).
  - Configure DNS for api.healtharchive.ca (and optionally api-staging.healtharchive.ca if you later create a staging API) pointing at the backend.
- Historical frontend preview path – see §3 and §5:
  - Keep this section for archival context only; do not recreate the old Vercel production/preview wiring as part of the current VPS deployment.
  - If a future preview environment is ever reintroduced, document it in a new runbook instead of reviving this retired path.
- In GitHub for the monorepo – see §7:
  - Commit and push the root CI workflows: .github/workflows/backend-ci.yml, .github/workflows/frontend-ci.yml, and .github/workflows/production-smoke.yml.
  - Enable Actions in the GitHub UI if prompted.
  - Configure branch protection on main to require the monorepo CI checks before merging.
You can tick off these high‑level items as you go, using the later sections for the exact commands and UI steps.
1. Decide canonical URLs (one‑time decision)
Before configuring env vars, confirm the URLs you want to use:
- Frontend – production
  - https://healtharchive.ca
  - https://www.healtharchive.ca
- Frontend – preview (historical only)
  - The old Vercel preview host was https://healtharchive.vercel.app
  - Current production does not depend on that path
- Backend – production API (current choice: single API for everything)
  - https://api.healtharchive.ca (used by both Preview and Production frontends)
- Backend – staging API (optional future)
  - https://api-staging.healtharchive.ca (only if you later decide you want one)
Once you’re happy with those hostnames, the remaining steps in this document assume that naming. Substitute your actual choices as needed.
2. Backend configuration (CORS + env)
The backend already supports CORS and uses environment variables for its DB and archive root. Production/staging configuration is about setting the right env vars in the host environment and restarting the service.
2.1. Environment variables to set
On each backend deployment (systemd unit, Docker container, or PaaS app), configure the following environment variables on the remote host (not just in your local shell). Typical flow:
- SSH into the server or open your cloud provider’s “environment variables” UI for the backend app.
- Add/update the variables below.
- Restart the backend service (see §2.2).

- HEALTHARCHIVE_DATABASE_URL
  - Points at your production/staging DB (Postgres recommended).
  - Example:
- HEALTHARCHIVE_ENV
  - High‑level environment hint used by admin auth.
  - Recommended values:
    - development (or unset) for local dev.
    - staging for staging hosts.
    - production for production hosts.
  - When HEALTHARCHIVE_ENV is staging or production and HEALTHARCHIVE_ADMIN_TOKEN is unset, admin and metrics endpoints fail closed with HTTP 500 instead of being left open.
- HEALTHARCHIVE_ARCHIVE_ROOT
  - Root directory where crawl jobs and WARCs will be written.
  - Must be on a filesystem with enough space and backups appropriate for your risk tolerance.
- HEALTHARCHIVE_ADMIN_TOKEN
  - Token required for /api/admin/* and /metrics when set.
  - Should be a strong random string, stored only in secure places (not committed to git).
- HEALTHARCHIVE_CORS_ORIGINS
  - Critical for frontend integration.
  - Comma‑separated list of frontend origins allowed to call the public API.
  - When set, overrides the built‑in defaults.
Production example (frontend at healtharchive.ca):
export HEALTHARCHIVE_CORS_ORIGINS="https://healtharchive.ca,https://www.healtharchive.ca"
Staging example (frontend at healtharchive.vercel.app):
export HEALTHARCHIVE_CORS_ORIGINS="https://healtharchive.vercel.app"
Optional local dev access to prod/staging API:
export HEALTHARCHIVE_CORS_ORIGINS="https://healtharchive.ca,https://www.healtharchive.ca,http://localhost:3000"
# or with staging:
export HEALTHARCHIVE_CORS_ORIGINS="https://healtharchive.vercel.app,http://localhost:3000"
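Browsers send the Origin header as scheme://host[:port], with no path and no trailing slash, so an entry such as https://healtharchive.ca/ will not match. The sketch below is a pre-deploy sanity check for the variable. The function name check_cors_origins is hypothetical (it is not part of the backend), and it assumes the backend compares origins as exact strings.

```shell
#!/bin/sh
# Hypothetical helper: validate a comma-separated CORS origin list.
# Prints each bad entry to stderr and returns non-zero if any are found.
check_cors_origins() {
  bad=0
  # Split on commas; disable globbing so a literal "*" is not expanded.
  old_ifs=$IFS
  set -f
  IFS=,
  set -- $1
  IFS=$old_ifs
  set +f
  for origin in "$@"; do
    case $origin in
      http://*/*|https://*/*)
        echo "bad origin (path or trailing slash): $origin" >&2; bad=1 ;;
      *" "*)
        echo "bad origin (contains a space): $origin" >&2; bad=1 ;;
      http://?*|https://?*) ;;  # looks like scheme://host[:port]
      *)
        echo "bad origin (missing http/https scheme): $origin" >&2; bad=1 ;;
    esac
  done
  return "$bad"
}

# Usage:
#   check_cors_origins "$HEALTHARCHIVE_CORS_ORIGINS" && echo "origins look sane"
```

This catches only formatting mistakes; it cannot tell you whether the listed hosts are the ones your frontend is actually served from.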
2.2. Apply config and restart services
How you do this depends on your hosting stack:
- systemd unit:
  - Add env vars to the unit file (Environment= lines) or a drop‑in EnvironmentFile=/etc/default/healtharchive.
  - Reload + restart:
- Docker / Docker Compose:
  - Add env vars under environment: in your compose file or docker run command.
  - Recreate containers:
- PaaS (Render, Fly.io, Heroku, etc.):
  - Use the provider’s UI/CLI to set env vars.
  - Trigger a deployment or restart.
In staging and production you will typically run two backend processes:
- An API process (FastAPI + uvicorn) that serves /api/** and /metrics.
- A worker process (healtharchive start-worker --poll-interval 30) that continuously processes queued jobs.
Both processes must see the same HEALTHARCHIVE_DATABASE_URL, HEALTHARCHIVE_ARCHIVE_ROOT, and related env vars from §2.1 so they share jobs and archive output consistently.
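For the systemd case, the reload and restart step might look like the sketch below. The unit names healtharchive-api.service and healtharchive-worker.service are assumptions for illustration (this document does not name the units), so substitute whatever your host actually defines. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumed unit names; replace with the units defined on your host.
API_UNIT="${API_UNIT:-healtharchive-api.service}"
WORKER_UNIT="${WORKER_UNIT:-healtharchive-worker.service}"

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

restart_backend() {
  # Re-read unit files and drop-ins (e.g. a changed EnvironmentFile= line).
  run sudo systemctl daemon-reload
  # Restart both processes together so they pick up the same env vars.
  run sudo systemctl restart "$API_UNIT" "$WORKER_UNIT"
}

# Usage:
#   restart_backend             # prints the commands only
#   DRY_RUN=0 restart_backend   # actually runs them
```

With Docker Compose, the rough equivalent is docker compose up -d --force-recreate, run from the directory that holds your compose file.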
2.3. Backend smoke checks (staging/prod)
From a machine that can reach the backend host:
- API health
  - Check:
    - HTTP 200.
    - JSON body like:
  - If you want the old summary payload, probe:
- CORS headers
  - Call the API with a fake Origin header matching your frontend:
  - Response should include:
- Basic public routes
  - Verify:
  - Expect HTTP 200, JSON bodies, and CORS headers.
- Security headers
  - Confirm that security-related headers are present on responses:
  - Look for:
    - X-Content-Type-Options: nosniff
    - Referrer-Policy: strict-origin-when-cross-origin
    - X-Frame-Options: SAMEORIGIN
    - Permissions-Policy: geolocation=(), microphone=(), camera=()
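The header checks above can be scripted. The sketch below is one possible approach, not the project's own tooling. The helper names check_security_headers and check_cors are hypothetical. It also assumes that the backend echoes the allowed origin back in Access-Control-Allow-Origin, which is standard CORS behaviour rather than something this document states. Both helpers read raw response headers on stdin.

```shell
#!/bin/sh
# check_security_headers: succeed only if every expected header line is
# present on stdin (matched case-insensitively, CRs stripped).
check_security_headers() {
  headers=$(tr -d '\r')
  missing=0
  while IFS= read -r expected; do
    if ! printf '%s\n' "$headers" | grep -i -q -x -F -e "$expected"; then
      echo "missing: $expected" >&2
      missing=1
    fi
  done <<'EOF'
X-Content-Type-Options: nosniff
Referrer-Policy: strict-origin-when-cross-origin
X-Frame-Options: SAMEORIGIN
Permissions-Policy: geolocation=(), microphone=(), camera=()
EOF
  return "$missing"
}

# check_cors ORIGIN: succeed if the response allows exactly that origin.
check_cors() {
  tr -d '\r' | grep -i -q -x -F -e "Access-Control-Allow-Origin: $1"
}

# Usage (performs real network requests):
#   curl -sS -o /dev/null -D - https://api.healtharchive.ca/api/health \
#     | check_security_headers
#   curl -sS -o /dev/null -D - -H "Origin: https://healtharchive.ca" \
#     https://api.healtharchive.ca/api/health | check_cors "https://healtharchive.ca"
```

Matching whole lines is deliberately strict: a header with extra directives will be reported as missing, which is a prompt to compare it against the list by hand.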
2.4. Archive storage & retention
The HEALTHARCHIVE_ARCHIVE_ROOT directory is where crawl jobs and WARCs live. In staging and production you should treat it as persistent, non‑ephemeral storage and have a basic retention plan.
Checklist for each non‑dev environment:
- Place HEALTHARCHIVE_ARCHIVE_ROOT on a filesystem that:
  - Is not ephemeral (survives VM/container restarts).
  - Has enough capacity for expected WARCs and logs.
  - Has a backup or snapshot policy appropriate for your risk tolerance.
- Decide whether this path is:
  - Backed up regularly (if you want WARCs as part of a disaster‑recovery story), or
  - Treated as “best‑effort cache” (if you rely on ZIMs/exports or other secondary storage).
- Decide when it is safe to delete temporary crawl artifacts:
  - Only once jobs are indexed or index_failed and you have verified any desired ZIMs/exports.
  - Use the healtharchive cleanup-job --id JOB_ID --mode temp command for this cleanup; it removes .tmp* directories and .archive_state.json but leaves the main job directory and any final ZIMs.
  - If you are using replay (pywb) for a job, do not run cleanup-job --mode temp for that job, because replay depends on the WARCs remaining on disk.
  - If replay is enabled globally (HEALTHARCHIVE_REPLAY_BASE_URL is set), cleanup-job --mode temp will refuse unless you pass --force. Treat --force as an emergency override (it can break replay by deleting WARCs).
- For larger deployments, consider:
  - Keeping a simple inventory of jobs (via /api/admin/jobs and metrics) so you know roughly how many indexed jobs you have and how big jobs/ is.
  - Periodically reviewing cleanup_status via /metrics (healtharchive_jobs_cleanup_status_total{cleanup_status="temp_cleaned"}) to ensure temp artifacts are being pruned over time.
For local development it is sufficient to keep HEALTHARCHIVE_ARCHIVE_ROOT inside the repo (e.g. ./.dev-archive-root) and delete it manually when you want a clean slate.
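A minimal sketch of a guarded temp cleanup, built only from the cleanup-job invocation quoted in the checklist above. The wrapper name cleanup_temp_jobs is hypothetical, and the job IDs are whatever you have already confirmed as indexed or index_failed. The wrapper does not query job status itself. It declines to run when HEALTHARCHIVE_REPLAY_BASE_URL is set rather than reaching for --force.

```shell
#!/bin/sh
# cleanup_temp_jobs JOB_ID...: run temp cleanup for each listed job.
# Assumes the caller has already verified the jobs and their ZIMs/exports.
cleanup_temp_jobs() {
  if [ -n "${HEALTHARCHIVE_REPLAY_BASE_URL:-}" ]; then
    echo "replay is enabled; not cleaning temp artifacts" >&2
    return 1
  fi
  for job_id in "$@"; do
    # Stop at the first failure so a problem is noticed, not skipped over.
    healtharchive cleanup-job --id "$job_id" --mode temp || return 1
  done
}

# Usage:
#   cleanup_temp_jobs 12 13
```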
3. Frontend environment shape (historical preview notes + current local dev)
The Next.js app reads NEXT_PUBLIC_API_BASE_URL at build time and uses it for all backend requests. The old Vercel preview model also set the diagnostics flags below per environment, but that production/preview split is retired.
For the current deployment model, set the same variables in the VPS/frontend release environment described in production-single-vps.md and ../../frontend/README.md. Keep any future preview environment documented in a new runbook rather than reviving the retired Vercel workflow captured here.
3.1. Current local development env
In frontend/.env.local (not committed):
NEXT_PUBLIC_API_BASE_URL=http://127.0.0.1:8001
NEXT_PUBLIC_SHOW_API_HEALTH_BANNER=true
NEXT_PUBLIC_LOG_API_HEALTH_FAILURE=true
NEXT_PUBLIC_SHOW_API_BASE_HINT=true
This is the template for local dev. The same variable shape applies in production, usually with diagnostics disabled unless you are actively debugging an integration problem.
4. DNS TODOs
Ensure DNS records are in place for the backend hosts.
4.1. Production API DNS
- In your DNS provider’s UI (e.g., Namecheap, Cloudflare, Route 53), locate the zone for healtharchive.ca.
- Create a record for api.healtharchive.ca:
  - If the backend is on a VM with a fixed IP:
    - Add an A record (and AAAA for IPv6 if applicable) pointing to the backend server IP.
  - If the backend is behind a load balancer or PaaS:
    - Add a CNAME pointing at the provider hostname (e.g., your-app.region.cloudprovider.com).
4.2. Staging API DNS (optional)
- If you want a separate staging backend, create api-staging.healtharchive.ca in the same DNS zone:
  - Use an A/AAAA record (for a separate staging VM) or a CNAME (for a staging app/load balancer) pointing at the staging backend host.
After DNS is configured:
- Verify with:
- Then run the API health curl commands in §2.3 against the HTTPS URLs.
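One way to confirm the records, assuming dig is installed on the machine you are checking from (this document does not say which tool was used). The helper name has_dns_answer is hypothetical.

```shell
#!/bin/sh
# has_dns_answer HOST [TYPE]: print the answer and succeed if the lookup
# returns at least one record; fail silently if it returns nothing.
has_dns_answer() {
  answer=$(dig +short "$1" "${2:-A}")
  [ -n "$answer" ] && printf '%s\n' "$answer"
}

# Usage:
#   has_dns_answer api.healtharchive.ca         # A record (or CNAME chain)
#   has_dns_answer api.healtharchive.ca AAAA    # IPv6, if configured
```

A freshly created record can take a while to become visible, depending on the TTL of any previously cached answer, so an empty result right after the change is not necessarily a misconfiguration.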
4.3. TLS / HTTPS and HSTS
- Terminate TLS (HTTPS) for api.healtharchive.ca (and api-staging.healtharchive.ca if applicable) at your reverse proxy or load balancer:
  - Use Let's Encrypt or a managed certificate.
  - Configure HTTP→HTTPS redirects for all HTTP traffic.
  - Add a Strict-Transport-Security header on HTTPS responses to enforce long-lived HTTPS in browsers. For example, in Nginx:
- After enabling HSTS, verify with:
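A sketch of the HSTS check, reading response headers on stdin. The helper name hsts_max_age is hypothetical, and no particular max-age is assumed because this document does not state which value was configured.

```shell
#!/bin/sh
# hsts_max_age: print the max-age value from a Strict-Transport-Security
# header found on stdin; fail if the header or the directive is absent.
hsts_max_age() {
  age=$(tr -d '\r' \
    | grep -i '^strict-transport-security:' \
    | sed -n 's/.*[Mm][Aa][Xx]-[Aa][Gg][Ee]=\([0-9][0-9]*\).*/\1/p' \
    | head -n 1)
  [ -n "$age" ] && echo "$age"
}

# Usage (performs real network requests):
#   curl -sS -o /dev/null -D - https://api.healtharchive.ca/api/health | hsts_max_age
# The HTTP->HTTPS redirect can be checked alongside it; this prints the
# status code and the redirect target reported by curl:
#   curl -sS -o /dev/null -w '%{http_code} %{redirect_url}\n' \
#     http://api.healtharchive.ca/api/health
```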
5. End‑to‑end smoke checklist (staging/prod)
Once backend env vars, frontend release env vars, and DNS are in place:
5.1. From the frontend domain
On production (https://healtharchive.ca), and on any intentionally introduced non-production preview environment:
- Visit /archive:
  - With backend up:
    - Filters header should show Filters (live API).
    - If the DB has snapshots, you’ll see real data (no demo fallback notice).
  - If the backend is unreachable:
    - Filters header changes to Filters (demo dataset fallback).
    - Demo records appear instead of live data.
    - A small “Backend unreachable” banner may appear when diagnostics are enabled.
- Try filtering:
  - Choose a source, e.g. source=hc.
  - URL updates with ?source=hc.
  - Results list changes accordingly (when live snapshots exist).
- Navigate to /archive/browse-by-source:
  - With backend up:
    - Cards should show real record counts from /api/sources.
  - If the backend is unreachable:
    - “Backend unavailable” callout appears and demo summaries are shown.
- Open a snapshot detail page /snapshot/[id]:
  - For a real backend snapshot ID:
    - Metadata (title, source, date, language, URL) is from /api/snapshot/{id}.
    - “Open raw snapshot” ultimately points at https://api…/api/snapshots/raw/{id} on the API host (the frontend prefixes the rawSnapshotUrl path from the API with NEXT_PUBLIC_API_BASE_URL).
  - For a demo snapshot ID:
    - Metadata comes from the bundled demo dataset, and the iframe points into /demo-archive/**.
5.2. Console diagnostics (non-production when enabled)
On a non-production deployment, with diagnostics enabled:
- Open /archive and check the browser console:
  - You should see something like:
  - If the base URL is wrong or the API is unreachable, the health banner and warning logs will make it obvious.
Production deployments typically keep diagnostics turned off, so you may not see these console logs even when everything is wired correctly.
This document should be revisited and checked off as each environment (local, production, and optional future staging) is brought fully online.
For a more detailed staging rollout, see:
- staging-rollout-checklist.md
For a more detailed production rollout, see:
- production-rollout-checklist.md
For a more detailed verification of CSP, headers, CORS, and the snapshot viewer iframe behavior, see:
- https://github.com/jerdaw/healtharchive/blob/main/frontend/docs/deployment/verification.md
5.3. Monitoring & uptime checks (optional but recommended)
- Configure an external uptime monitor (e.g., UptimeRobot, healthchecks.io, or your cloud provider) to poll:
  - https://api.healtharchive.ca/api/health (backend health).
  - https://healtharchive.ca/archive (frontend & backend integration).
- Configure alerts (email/Slack/etc.) for repeated failures or slow responses.
- If you deploy Prometheus or a similar system, scrape https://api.healtharchive.ca/metrics and build dashboards/alerts for:
  - healtharchive_jobs_total{status="failed"} – job failures.
  - healtharchive_snapshots_total – sudden jumps in snapshot count.
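If you would rather start with a cron-driven probe than a hosted monitor, a sketch along these lines is one option. The function name probe and the 5-second slow-response threshold are assumptions; only the two URLs come from this document.

```shell
#!/bin/sh
# probe URL [MAX_SECONDS]: print "ok", "slow", or "fail" with the URL.
# Returns 0 for ok, 2 for slow, 1 for any non-200 or failed request.
probe() {
  max="${2:-5}"
  # -w prints "<status> <total seconds>"; --max-time bounds a hung request.
  result=$(curl -sS -o /dev/null --max-time 30 \
    -w '%{http_code} %{time_total}' "$1" 2>/dev/null) || result="000 0"
  code=${result% *}
  secs=${result#* }
  if [ "$code" != "200" ]; then
    echo "fail $1 (HTTP $code)"
    return 1
  fi
  # awk handles the fractional comparison that [ ] cannot.
  if awk -v t="$secs" -v m="$max" 'BEGIN { exit !(t > m) }'; then
    echo "slow $1 (${secs}s)"
    return 2
  fi
  echo "ok $1 (${secs}s)"
}

# Usage:
#   probe https://api.healtharchive.ca/api/health
#   probe https://healtharchive.ca/archive 10
```

A single failed probe is a weak signal. Alerting on several consecutive failures, as the checklist suggests, avoids paging on one dropped request.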
6. Admin / operator access TODOs
- Configure HEALTHARCHIVE_ADMIN_TOKEN in every non‑dev environment:
  - Set a long, random value via your hosting platform’s secret manager.
  - Do not commit the token to the repo or to any checked‑in .env file.
- Verify that /api/admin/* and /metrics require the token:
  - Without headers:
    curl -i "https://api.healtharchive.ca/api/admin/jobs"
    curl -i "https://api.healtharchive.ca/metrics"
    Expect 403 Forbidden when the token is configured.
  - With token:
    curl -i \
      -H "Authorization: Bearer $HEALTHARCHIVE_ADMIN_TOKEN" \
      "https://api.healtharchive.ca/api/admin/jobs"
    curl -i \
      -H "Authorization: Bearer $HEALTHARCHIVE_ADMIN_TOKEN" \
      "https://api.healtharchive.ca/metrics"
    Expect 200 OK.
- Decide how operators will call admin APIs:
  - Short‑term: direct curl/CLI usage with the token exported in the shell.
  - Longer‑term (optional): a separate admin console (e.g., https://admin.healtharchive.ca) that runs in a trusted environment and never exposes HEALTHARCHIVE_ADMIN_TOKEN to browser JavaScript.
- If you later add an admin console:
  - Protect it behind SSO, VPN, or other strong authentication.
  - Avoid linking it from the public site navigation.
  - Exclude admin URLs from search indexing (robots.txt and/or <meta> tags).
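One way to produce a suitably long random token using only standard Unix tools. The 32-byte length and the function name new_admin_token are assumptions; any cryptographically random string of comparable length would do.

```shell
#!/bin/sh
# new_admin_token: print 32 random bytes as 64 lowercase hex characters.
new_admin_token() {
  # od renders the bytes as hex; tr strips the spaces and newlines it adds.
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Usage:
#   new_admin_token
```

Paste the output straight into your secret manager or a root-only environment file instead of exporting it in an interactive shell, where it can end up in history.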
7. GitHub Actions & branch protection TODOs
Continuous integration is now wired through the root workflow files in the monorepo and only becomes effective once you commit/push them and (optionally) protect branches.
7.1. Enable and verify GitHub Actions
For the canonical repo (healtharchive):
- Ensure the root workflow files are present and enabled in the GitHub UI:
  - .github/workflows/backend-ci.yml
  - .github/workflows/frontend-ci.yml
  - .github/workflows/production-smoke.yml
- Navigate to the repository on https://github.com.
- Click the Actions tab.
- If GitHub shows a banner like “Workflows are disabled for this fork,” click Enable workflows.
- Push a test commit or re-run the latest workflows to verify that runs are triggered for branch main and for pull requests:
  - Backend CI should run the backend check surface.
  - Frontend CI should install frontend deps, verify generated contracts, and run the frontend check surface.
  - Production smoke should remain available for manual or scheduled end-to-end verification.
7.2. Configure branch protection (optional but recommended)
To prevent merging changes that break tests or linting:
- Open the repository page and go to Settings → Branches.
- Under Branch protection rules, click Add rule (or edit an existing rule) and set:
  - Branch name pattern: main
  - Enable Require a pull request before merging (tune review settings as you prefer).
  - Enable Require status checks to pass before merging and select the CI workflows wired from the root .github/workflows/ directory, at minimum Backend CI and Frontend CI.
  - Optionally enable Include administrators so even admin users must wait for green CI.
- Click Create or Save changes to persist the rule.
After this, any PR targeting main will need green CI checks before it can be merged, ensuring that:
- Backend changes don’t break the backend verification surface.
- Frontend changes don’t break contract sync, linting, or frontend tests.