Hosting & Live Server TODOs (Backend + Frontend)

Status: Historical checklist. This file documents the earlier Vercel-era wiring work. The implemented production model is the direct VPS + host Caddy path in production-single-vps.md. Any references below to a separate frontend repo or Vercel preview wiring are retained only as historical context and should not be treated as the current deployment model.

This document tracks the remaining infrastructure / hosting steps needed to run HealthArchive.ca with a fully wired frontend + backend in production (and optionally add a staging environment later).

Nothing in here requires code changes – it is all environment configuration, DNS, and manual verification.

Note: The current production deployment is a single Hetzner VPS using Tailscale-only SSH, Caddy TLS, and nightly DB backups with a NAS pull. See production-single-vps.md for the exact runbook that was implemented.

0. Quick index of “remote‑only” tasks

Use this as a map of everything that must be done outside your local dev environment (i.e., on live servers or in the GitHub UI). Historical Vercel notes are preserved below only so older rollout notes remain interpretable.

You can tick off these high‑level items as you go, using the later sections for the exact commands and UI steps.

1. Decide canonical URLs (one‑time decision)

Before configuring env vars, confirm the URLs you want to use:

Frontend – production
https://healtharchive.ca
https://www.healtharchive.ca
Frontend – preview (historical only)
The old Vercel preview host was https://healtharchive.vercel.app
Current production does not depend on that path
Backend – production API (current choice: single API for everything)
https://api.healtharchive.ca (serves the production frontend; historically also served the Vercel preview frontend)
Backend – staging API (optional future)
https://api-staging.healtharchive.ca (only if you later decide you want one)

Once you’re happy with those hostnames, the remaining steps in this document assume that naming. Substitute your actual choices as needed.

2. Backend configuration (CORS + env)

The backend already supports CORS and uses environment variables for its DB and archive root. Production/staging configuration is about setting the right env vars in the host environment and restarting the service.

2.1. Environment variables to set

On each backend deployment (systemd unit, Docker container, or PaaS app), configure the following environment variables on the remote host (not just in your local shell). Typical flow:

SSH into the server or open your cloud provider’s “environment variables” UI for the backend app.
Add/update the variables below.
Restart the backend service (see §2.2).

HEALTHARCHIVE_DATABASE_URL
Points at your production/staging DB (Postgres recommended).

Example:

export HEALTHARCHIVE_DATABASE_URL=postgresql+psycopg://user:pass@db-host:5432/healtharchive

HEALTHARCHIVE_ENV
High‑level environment hint used by admin auth.
Recommended values:
- development (or unset) for local dev.
- staging for staging hosts.
- production for production hosts.
When HEALTHARCHIVE_ENV is staging or production and HEALTHARCHIVE_ADMIN_TOKEN is unset, admin and metrics endpoints fail closed with HTTP 500 instead of being left open.
HEALTHARCHIVE_ARCHIVE_ROOT
Root directory where crawl jobs and WARCs will be written.
Must be on a filesystem with enough space and backups appropriate for your risk tolerance.
```
export HEALTHARCHIVE_ARCHIVE_ROOT=/srv/healtharchive/jobs
```
HEALTHARCHIVE_ADMIN_TOKEN
Token required for /api/admin/* and /metrics when set.
Should be a strong random string, stored only in secure places (not committed to git).
```
export HEALTHARCHIVE_ADMIN_TOKEN="some-long-random-string"
```
HEALTHARCHIVE_CORS_ORIGINS
Critical for frontend integration.
Comma‑separated list of frontend origins allowed to call the public API.
When set, overrides the built‑in defaults.

Production example (frontend at healtharchive.ca):

export HEALTHARCHIVE_CORS_ORIGINS="https://healtharchive.ca,https://www.healtharchive.ca"

Staging example (frontend at healtharchive.vercel.app):

export HEALTHARCHIVE_CORS_ORIGINS="https://healtharchive.vercel.app"

Optional local dev access to prod/staging API:

export HEALTHARCHIVE_CORS_ORIGINS="https://healtharchive.ca,https://www.healtharchive.ca,http://localhost:3000"
# or with staging:
export HEALTHARCHIVE_CORS_ORIGINS="https://healtharchive.vercel.app,http://localhost:3000"
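
Before restarting the service, it can help to confirm that the shell or unit environment actually carries the variables above. The following is a minimal sketch, not part of the healtharchive CLI: check_backend_env is a hypothetical helper name, and treating the database URL and archive root as strictly required is an assumption made for illustration. It mirrors the fail-closed rule described for HEALTHARCHIVE_ENV.

```shell
# Hypothetical preflight helper (not part of the healtharchive CLI).
# Prints one line per problem and returns non-zero if any were found.
check_backend_env() (
  problems=0

  # Assumption: these two variables are treated as required on every host.
  for name in HEALTHARCHIVE_DATABASE_URL HEALTHARCHIVE_ARCHIVE_ROOT; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      problems=$((problems + 1))
    fi
  done

  # In staging/production an unset admin token makes admin and metrics
  # endpoints fail closed, so flag it before restarting.
  case "${HEALTHARCHIVE_ENV:-development}" in
    staging|production)
      if [ -z "${HEALTHARCHIVE_ADMIN_TOKEN:-}" ]; then
        echo "missing: HEALTHARCHIVE_ADMIN_TOKEN (required when HEALTHARCHIVE_ENV=$HEALTHARCHIVE_ENV)"
        problems=$((problems + 1))
      fi
      ;;
  esac

  [ "$problems" -eq 0 ]
)
```

Run check_backend_env in the same environment the service will use (for example after sourcing the env file); a non-zero exit status means at least one line was printed.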

2.2. Apply config and restart services

How you do this depends on your hosting stack:

systemd unit:
Add env vars to the unit file (Environment= lines) or a drop‑in EnvironmentFile=/etc/default/healtharchive.

Reload + restart:

sudo systemctl daemon-reload
sudo systemctl restart healtharchive.service

Docker / Docker Compose:
Add env vars under environment: in your compose file or docker run command.

Recreate containers:

docker compose up -d --force-recreate backend

PaaS (Render, Fly.io, Heroku, etc.):
Use the provider’s UI/CLI to set env vars.
Trigger a deployment or restart.

In staging and production you will typically run two backend processes:

An API process (FastAPI + uvicorn) that serves /api/** and /metrics.
A worker process (healtharchive start-worker --poll-interval 30) that continuously processes queued jobs.

Both processes must see the same HEALTHARCHIVE_DATABASE_URL, HEALTHARCHIVE_ARCHIVE_ROOT, and related env vars from §2.1 so they share jobs and archive output consistently.

2.3. Backend smoke checks (staging/prod)

From a machine that can reach the backend host:

API health

curl -i "https://api.healtharchive.ca/api/health"

Check:
- HTTP 200.
- JSON body like:

{"status":"ok","checks":{"db":"ok"}}

- If you want the old summary payload, probe:

curl -i "https://api.healtharchive.ca/api/health?details=1"
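
For scripted checks, the status field can be tested without reading the JSON by eye. This is a rough sketch under stated assumptions: health_is_ok is a hypothetical helper name, and it uses grep rather than a real JSON parser, so it only confirms that a top-level-looking "status":"ok" pair is present in the body.

```shell
# Hypothetical helper: reads the response body on stdin and succeeds
# only if it contains a "status": "ok" pair (whitespace tolerated).
health_is_ok() (
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
)

# Example (network call, shown commented out):
# curl -s "https://api.healtharchive.ca/api/health" | health_is_ok \
#   && echo "backend healthy"
```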

CORS headers

Call the API with a fake Origin header matching your frontend:

curl -i \
  -H "Origin: https://healtharchive.ca" \
  "https://api.healtharchive.ca/api/health"

Response should include:

Access-Control-Allow-Origin: https://healtharchive.ca
Vary: Origin
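
The same check can be scripted. The sketch below assumes a hypothetical helper named has_cors_origin that reads raw response headers on stdin; the header name and the expected origin come from the check above, everything else is illustrative.

```shell
# Hypothetical helper: reads raw response headers on stdin and succeeds
# only if Access-Control-Allow-Origin equals the expected origin.
has_cors_origin() (
  expected=$1
  # Strip carriage returns (HTTP/1.1 header lines end in CRLF), match the
  # header name case-insensitively, and keep only its value.
  actual=$(tr -d '\r' \
    | grep -i '^access-control-allow-origin:' \
    | head -n 1 \
    | sed 's/^[^:]*:[[:space:]]*//')
  [ "$actual" = "$expected" ]
)

# Example (network call, shown commented out):
# curl -s -o /dev/null -D - -H "Origin: https://healtharchive.ca" \
#   "https://api.healtharchive.ca/api/health" \
#   | has_cors_origin "https://healtharchive.ca" && echo "CORS ok"
```

Passing an origin that is not listed in HEALTHARCHIVE_CORS_ORIGINS should make the helper fail, which is a quick way to confirm the allow-list is not wider than intended.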

Basic public routes

Verify:

curl -i "https://api.healtharchive.ca/api/sources"
curl -i "https://api.healtharchive.ca/api/search?page=1&pageSize=10"

Expect HTTP 200 and JSON bodies. CORS headers typically appear only when the request also sends an Origin header, as in the CORS check above.

Security headers

Confirm that security-related headers are present on responses:

curl -i "https://api.healtharchive.ca/api/health" | sed -n '1,20p'

Look for:
- X-Content-Type-Options: nosniff
- Referrer-Policy: strict-origin-when-cross-origin
- X-Frame-Options: SAMEORIGIN
- Permissions-Policy: geolocation=(), microphone=(), camera=()
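
To avoid scanning the output by eye, the header list above can be checked mechanically. A minimal sketch, assuming a hypothetical helper named missing_security_headers; it only checks that each header is present, not that its value matches the list above.

```shell
# Hypothetical helper: reads raw response headers on stdin and prints the
# name of each expected security header that is absent.
missing_security_headers() (
  headers=$(tr -d '\r')
  for name in X-Content-Type-Options Referrer-Policy X-Frame-Options Permissions-Policy; do
    # -i: header names are case-insensitive (HTTP/2 sends them lowercase).
    printf '%s\n' "$headers" | grep -qi "^$name:" || echo "$name"
  done
)

# Example (network call, shown commented out):
# curl -s -o /dev/null -D - "https://api.healtharchive.ca/api/health" \
#   | missing_security_headers
```

Empty output means all four headers were found.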

2.4. Archive storage & retention

The HEALTHARCHIVE_ARCHIVE_ROOT directory is where crawl jobs and WARCs live. In staging and production you should treat it as persistent, non‑ephemeral storage and have a basic retention plan.

Checklist for each non‑dev environment:

Place HEALTHARCHIVE_ARCHIVE_ROOT on a filesystem that:
Is not ephemeral (survives VM/container restarts).
Has enough capacity for expected WARCs and logs.
Has a backup or snapshot policy appropriate for your risk tolerance.
Decide whether this path is:
Backed up regularly (if you want WARCs as part of a disaster‑recovery story), or
Treated as “best‑effort cache” (if you rely on ZIMs/exports or other secondary storage).
Decide when it is safe to delete temporary crawl artifacts:
Only once jobs are indexed or index_failed and you have verified any desired ZIMs/exports.
Use the healtharchive cleanup-job --id JOB_ID --mode temp command for this cleanup; it removes .tmp* directories and .archive_state.json but leaves the main job directory and any final ZIMs.
If you are using replay (pywb) for a job, do not run cleanup-job --mode temp for that job — replay depends on the WARCs remaining on disk.
If replay is enabled globally (HEALTHARCHIVE_REPLAY_BASE_URL is set), cleanup-job --mode temp will refuse unless you pass --force. Treat --force as an emergency override (it can break replay by deleting WARCs).
For larger deployments, consider:
Keeping a simple inventory of jobs (via /api/admin/jobs and metrics) so you know roughly how many indexed jobs you have and how big jobs/ is.
Periodically reviewing cleanup_status via /metrics (healtharchive_jobs_cleanup_status_total{cleanup_status="temp_cleaned"}) to ensure temp artifacts are being pruned over time.
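
The cleanup rules above can be expressed as a dry-run planner. This is a sketch under stated assumptions: plan_temp_cleanup is a hypothetical helper, the job ID and status are supplied by the operator (for example after looking them up via /api/admin/jobs), and the helper only prints the cleanup-job command rather than running it.

```shell
# Hypothetical dry-run helper: prints the cleanup command for a job only
# when the rules above allow it. It never runs the command itself.
plan_temp_cleanup() (
  job_id=$1
  status=$2

  if [ -n "${HEALTHARCHIVE_REPLAY_BASE_URL:-}" ]; then
    # Replay depends on WARCs staying on disk, so plan nothing when
    # replay is enabled globally.
    echo "skip $job_id: replay is enabled"
  else
    case "$status" in
      indexed|index_failed)
        echo "healtharchive cleanup-job --id $job_id --mode temp"
        ;;
      *)
        echo "skip $job_id: status is $status"
        ;;
    esac
  fi
)
```

Review the printed commands, and verify any ZIMs or exports you want to keep, before running them by hand.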

For local development it is sufficient to keep HEALTHARCHIVE_ARCHIVE_ROOT inside the repo (e.g. ./.dev-archive-root) and delete it manually when you want a clean slate.

3. Frontend environment shape (historical preview notes + current local dev)

The Next.js app reads NEXT_PUBLIC_API_BASE_URL at build time and uses it for all backend requests. The old Vercel preview model also set the diagnostics flags below per environment, but that production/preview split is retired.

For the current deployment model, set the same variables in the VPS/frontend release environment described in production-single-vps.md and ../../frontend/README.md. Keep any future preview environment documented in a new runbook rather than reviving the retired Vercel workflow captured here.

3.1. Current local development env

In frontend/.env.local (not committed):

NEXT_PUBLIC_API_BASE_URL=http://127.0.0.1:8001
NEXT_PUBLIC_SHOW_API_HEALTH_BANNER=true
NEXT_PUBLIC_LOG_API_HEALTH_FAILURE=true
NEXT_PUBLIC_SHOW_API_BASE_HINT=true

This is the template for local dev. The same variable shape applies in production, usually with diagnostics disabled unless you are actively debugging an integration problem.
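
Because NEXT_PUBLIC_API_BASE_URL is read at build time, it is worth confirming the value in the env file before building. A small sketch, assuming a hypothetical helper named env_file_value and a plain KEY=value file in which each key appears once and values are unquoted:

```shell
# Hypothetical helper: prints the value of KEY from a KEY=value env file.
# Assumes unquoted values and one assignment per key.
env_file_value() (
  file=$1
  key=$2
  # -n with the p flag prints only lines where the substitution matched.
  sed -n "s/^$key=//p" "$file"
)

# Example:
# env_file_value frontend/.env.local NEXT_PUBLIC_API_BASE_URL
```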

4. DNS TODOs

Ensure DNS records are in place for the backend hosts.

4.1. Production API DNS

In your DNS provider’s UI (e.g., Namecheap, Cloudflare, Route 53), locate the zone for healtharchive.ca.
Create a record for api.healtharchive.ca:
If the backend is on a VM with a fixed IP:
- Add an A record (and AAAA for IPv6 if applicable) pointing to the backend server IP.
If the backend is behind a load balancer or PaaS:
- Add a CNAME pointing at the provider hostname (e.g., your-app.region.cloudprovider.com).

4.2. Staging API DNS (optional)

If you want a separate staging backend, create api-staging.healtharchive.ca in the same DNS zone:
Use an A/AAAA record (for a separate staging VM) or a CNAME (for a staging app/load balancer) pointing at the staging backend host.

After DNS is configured:

Verify with:

dig +short api.healtharchive.ca
dig +short api-staging.healtharchive.ca

Then run the API health curl commands in §2.3 against the HTTPS URLs.
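
If you want the DNS check to fail loudly in a script, the dig output can be wrapped in a small test. A minimal sketch, assuming a hypothetical helper named resolves; it treats any non-empty dig +short answer as success and does not validate the address itself.

```shell
# Hypothetical helper: succeeds if `dig +short` returns at least one
# record for the given hostname.
resolves() (
  answer=$(dig +short "$1")
  [ -n "$answer" ]
)

# Example (network call, shown commented out):
# for host in api.healtharchive.ca api-staging.healtharchive.ca; do
#   resolves "$host" && echo "$host: ok" || echo "$host: no records"
# done
```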

4.3. TLS / HTTPS and HSTS

Terminate TLS (HTTPS) for api.healtharchive.ca (and api-staging.healtharchive.ca if applicable) at your reverse proxy or load balancer:
Use Let's Encrypt or a managed certificate.
Configure HTTP→HTTPS redirects for all HTTP traffic.
Add a Strict-Transport-Security header on HTTPS responses to enforce long-lived HTTPS in browsers. For example, in Nginx:

add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

After enabling HSTS, verify with:

curl -i "https://api.healtharchive.ca/api/health" | grep -i strict-transport-security
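
Beyond checking that the header exists, you may want to confirm the max-age value. The sketch below assumes a hypothetical helper named hsts_max_age that reads raw response headers on stdin; it lowercases the input first because header and directive names are case-insensitive.

```shell
# Hypothetical helper: reads raw response headers on stdin and prints the
# HSTS max-age in seconds, or nothing if the header is absent.
hsts_max_age() (
  tr -d '\r' \
    | tr '[:upper:]' '[:lower:]' \
    | grep '^strict-transport-security:' \
    | head -n 1 \
    | sed -n 's/.*max-age=\([0-9][0-9]*\).*/\1/p'
)

# Example (network call, shown commented out):
# age=$(curl -s -o /dev/null -D - "https://api.healtharchive.ca/api/health" | hsts_max_age)
# [ "${age:-0}" -ge 31536000 ] && echo "HSTS max-age is at least one year"
```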

5. End‑to‑end smoke checklist (staging/prod)

Once backend env vars, frontend release env vars, and DNS are in place:

5.1. From the frontend domain

On production (https://healtharchive.ca), and on any intentionally introduced non-production preview environment:

Visit /archive:
With backend up:
- Filters header should show Filters (live API).
- If the DB has snapshots, you’ll see real data (no demo fallback notice).
If the backend is unreachable:
- Filters header changes to Filters (demo dataset fallback).
- Demo records appear instead of live data.
- A small “Backend unreachable” banner may appear when diagnostics are enabled.
Try filtering:
Choose a source, e.g. source=hc.
URL updates with ?source=hc.
Results list changes accordingly (when live snapshots exist).
Navigate to /archive/browse-by-source:
With backend up:
- Cards should show real record counts from /api/sources.
If the backend is unreachable:
- “Backend unavailable” callout appears and demo summaries are shown.
Open a snapshot detail page /snapshot/[id]:
For a real backend snapshot ID:
- Metadata (title, source, date, language, URL) is from /api/snapshot/{id}.
- “Open raw snapshot” ultimately points at https://api…/api/snapshots/raw/{id} on the API host (the frontend prefixes the rawSnapshotUrl path from the API with NEXT_PUBLIC_API_BASE_URL).
For a demo snapshot ID:
- Metadata comes from the bundled demo dataset, and the iframe points into /demo-archive/**.

5.2. Console diagnostics (non-production when enabled)

On a non-production deployment, with diagnostics enabled:

Open /archive and check the browser console:

You should see something like:

[healtharchive] API base URL (from NEXT_PUBLIC_API_BASE_URL or default): https://api.healtharchive.ca

If the base URL is wrong or the API is unreachable, the health banner and warning logs will make it obvious.

Production deployments typically keep diagnostics turned off, so you may not see these console logs even when everything is wired correctly.

This document should be revisited and checked off as each environment (local, production, and optional future staging) is brought fully online.

For a more detailed staging rollout, see:

staging-rollout-checklist.md

For a more detailed production rollout, see:

production-rollout-checklist.md

For a more detailed verification of CSP, headers, CORS, and the snapshot viewer iframe behavior, see:

https://github.com/jerdaw/healtharchive/blob/main/frontend/docs/deployment/verification.md

5.3. Monitoring & uptime checks (optional but recommended)

Configure an external uptime monitor (e.g., UptimeRobot, healthchecks.io, or your cloud provider) to poll:
https://api.healtharchive.ca/api/health (backend health).
https://healtharchive.ca/archive (frontend & backend integration).
Configure alerts (email/Slack/etc.) for repeated failures or slow responses.
If you deploy Prometheus or a similar system, scrape https://api.healtharchive.ca/metrics and build dashboards/alerts for:
healtharchive_jobs_total{status="failed"} – job failures.
healtharchive_snapshots_total – sudden jumps in snapshot count.
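
Without a full Prometheus setup, a single value can still be pulled out of the /metrics text for a cron-style check. This is a sketch under assumptions: failed_jobs is a hypothetical helper, and it assumes the failed-jobs series carries status as its only label, exactly as written above.

```shell
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# prints the value of the failed-jobs series, or nothing if it is absent.
# Assumes status is the only label on healtharchive_jobs_total.
failed_jobs() (
  awk '$1 == "healtharchive_jobs_total{status=\"failed\"}" { print $2 }'
)

# Example (network call, shown commented out):
# curl -s -H "Authorization: Bearer $HEALTHARCHIVE_ADMIN_TOKEN" \
#   "https://api.healtharchive.ca/metrics" | failed_jobs
```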

6. Admin / operator access TODOs

Configure HEALTHARCHIVE_ADMIN_TOKEN in every non‑dev environment:
Set a long, random value via your hosting platform’s secret manager.
Do not commit the token to the repo or to any checked‑in .env file.
Verify that /api/admin/* and /metrics require the token:

Without headers:

curl -i "https://api.healtharchive.ca/api/admin/jobs"
curl -i "https://api.healtharchive.ca/metrics"

Expect 403 Forbidden when the token is configured.

With token:

curl -i \
  -H "Authorization: Bearer $HEALTHARCHIVE_ADMIN_TOKEN" \
  "https://api.healtharchive.ca/api/admin/jobs"
curl -i \
  -H "Authorization: Bearer $HEALTHARCHIVE_ADMIN_TOKEN" \
  "https://api.healtharchive.ca/metrics"

Expect 200 OK.
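
The two checks above can be combined into one pass/fail test per endpoint. A minimal sketch, assuming hypothetical helpers named http_status and admin_auth_enforced, and assuming HEALTHARCHIVE_ADMIN_TOKEN is exported in the operator's shell; the expected 403 and 200 codes are the ones stated above.

```shell
# Hypothetical helper: prints only the HTTP status code for a GET request.
# Extra arguments (e.g. -H "Authorization: ...") are passed through to curl.
http_status() (
  url=$1
  shift
  curl -s -o /dev/null -w '%{http_code}' "$@" "$url"
)

# Hypothetical check: succeeds only if the URL is rejected without the
# token (403) and accepted with it (200).
admin_auth_enforced() (
  url=$1
  without=$(http_status "$url")
  with=$(http_status "$url" -H "Authorization: Bearer $HEALTHARCHIVE_ADMIN_TOKEN")
  echo "$url without token: $without, with token: $with"
  [ "$without" = "403" ] && [ "$with" = "200" ]
)

# Example (network calls, shown commented out):
# admin_auth_enforced "https://api.healtharchive.ca/api/admin/jobs"
# admin_auth_enforced "https://api.healtharchive.ca/metrics"
```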

Decide how operators will call admin APIs:
Short‑term: direct curl/CLI usage with the token exported in the shell.
Longer‑term (optional): a separate admin console (e.g., https://admin.healtharchive.ca) that runs in a trusted environment and never exposes HEALTHARCHIVE_ADMIN_TOKEN to browser JavaScript.
If you later add an admin console:
Protect it behind SSO, VPN, or other strong authentication.
Avoid linking it from the public site navigation.
Exclude admin URLs from search indexing (robots.txt and/or <meta> tags).

7. GitHub Actions & branch protection TODOs

Continuous integration is wired through the root workflow files in the monorepo. It only takes effect once those files are committed and pushed; protecting branches (optional) then makes the checks mandatory for merges.

7.1. Enable and verify GitHub Actions

For the canonical repo (healtharchive):

Ensure the root workflow files are present and enabled in the GitHub UI:
.github/workflows/backend-ci.yml
.github/workflows/frontend-ci.yml
.github/workflows/production-smoke.yml
Navigate to the repository on https://github.com.
Click the Actions tab.
If GitHub shows a banner like “Workflows are disabled for this fork,” click Enable workflows.
Push a test commit or re-run the latest workflows to verify that runs are triggered for branch main and for pull requests:
Backend CI should run the backend check surface.
Frontend CI should install frontend deps, verify generated contracts, and run the frontend check surface.
Production smoke should remain available for manual or scheduled end-to-end verification.

7.2. Configure branch protection (optional but recommended)

To prevent merging changes that break tests or linting:

Open the repository page and go to Settings → Branches.
Under Branch protection rules, click Add rule (or edit an existing rule) and set:
Branch name pattern: main
Enable Require a pull request before merging (tune review settings as you prefer).
Enable Require status checks to pass before merging and select the CI workflows wired from the root .github/workflows/ directory, at minimum Backend CI and Frontend CI.
Optionally enable Include administrators so even admin users must wait for green CI.
Click Create or Save changes to persist the rule.

After this, any PR targeting main will need green CI checks before it can be merged, ensuring that:

Backend changes don’t break the backend verification surface.
Frontend changes don’t break contract sync, linting, or frontend tests.