Replay service playbook (operators)

Goal: keep replay (replay.healtharchive.ca) available when the project relies on it.

Canonical references:

  • Replay runbook: ../../../deployment/replay-service-pywb.md
  • Production runbook: ../../../deployment/production-single-vps.md
  • Replay automation design: ../../replay-and-preview-automation-plan.md

Setup / recovery (if replay is missing)

Follow ../../../deployment/replay-service-pywb.md.

Verify replay is working

  1. Check the base URL is up:
       curl -I https://replay.healtharchive.ca/ | head
  2. Verify the public surface script can resolve a replay browseUrl for a known snapshot:
       cd /opt/healtharchive && ./scripts/verify_public_surface.py
  3. Verify the replay banner works on a direct replay page:
       • Open a known browseUrl on https://replay.healtharchive.ca/ and confirm the banner loads quickly, shows the page title + meta line (capture date + original URL) + disclaimer, and that the action links (View diff, Details, All snapshots, Raw HTML, Metadata JSON, Cite, Report issue, Hide) behave as expected.
       • From HealthArchive search results, click View and confirm “← HealthArchive.ca” returns to the same search results page.
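The base URL check above can be scripted for repeated use. The following is a minimal sketch and not part of the project's tooling: the function names probe_status and classify_status are hypothetical, and the grouping of status codes into labels is an assumption about what an operator would want to tell apart.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the HealthArchive repository.

# Print the HTTP status code returned for a HEAD request to a URL.
# curl prints 000 when it receives no HTTP response at all
# (DNS failure, refused connection, timeout).
probe_status() {
  curl -s -I -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Map a status code to a coarse label for operator triage.
# ASSUMPTION: these groupings are illustrative, not a project convention.
classify_status() {
  case "$1" in
    2??|3??) echo "up" ;;
    502|503|504) echo "proxy-or-upstream-error" ;;
    000) echo "unreachable" ;;
    *) echo "unexpected" ;;
  esac
}

# Usage (makes a network request):
#   classify_status "$(probe_status https://replay.healtharchive.ca/)"
```

This covers only the first check. The public surface script and the banner checks still need to be run as described.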

If public replay 502s but localhost pywb is 200

Use this when replay.healtharchive.ca/... returns 502 but a direct localhost probe against pywb returns 200.

  1. Confirm the exact replay path locally (curl writes its verbose output to stderr, so redirect it into the pipe):
       curl -sv --http1.1 "http://127.0.0.1:8090<PATH>" -o /dev/null 2>&1 | sed -n '1,40p'
  2. Check Caddy for upstream parser errors:
       sudo journalctl -u caddy -n 20 --no-pager
  3. If Caddy reports a malformed header line (for example an archived bare AWSALBCORS=... cookie continuation), treat it as a replay-header sanitization problem, not a replay indexing problem.
  4. Confirm the replay service is loading /srv/healtharchive/replay/sitecustomize.py via PYTHONPATH=/webarchive in healtharchive-replay.service.
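The triage above amounts to comparing two status codes and scanning the Caddy journal. A rough sketch under stated assumptions: the helper names triage_replay and filter_header_errors are hypothetical, and the search pattern "malformed" is only a guess at how the parser error is worded in the Caddy logs, so adjust it to the message you actually see.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the HealthArchive repository.

# Compare the public status code with the localhost pywb status code
# and print which layer to look at first.
triage_replay() {
  local public_code="$1" local_code="$2"
  if [ "$public_code" = "502" ] && [ "$local_code" = "200" ]; then
    # pywb answers locally, so the failure is between Caddy and pywb.
    echo "proxy-layer"
  elif [ "$local_code" != "200" ]; then
    # pywb itself is not healthy; this section does not apply.
    echo "pywb-layer"
  else
    echo "other"
  fi
}

# Read journal lines on stdin and keep those mentioning a malformed header.
# ASSUMPTION: the error text contains the word "malformed".
filter_header_errors() {
  grep -i 'malformed' || true
}

# Usage on the host:
#   sudo journalctl -u caddy -n 200 --no-pager | filter_header_errors
#   systemctl cat healtharchive-replay.service | grep PYTHONPATH
```

The last usage line prints the unit file and any drop-ins, which is one way to see whether PYTHONPATH=/webarchive is set for the service.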

Retention warning

Replay depends on WARCs staying on disk. Do not delete WARCs for jobs you expect to replay.
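
Before pruning a job directory, it can help to see what replay would lose. A minimal sketch, assuming WARCs are stored with the conventional .warc or .warc.gz suffix somewhere under a job directory. The function name count_warcs and the example path are hypothetical, since this playbook does not state where WARCs live.

```shell
#!/usr/bin/env bash
# Hypothetical helper; not part of the HealthArchive repository.

# Count WARC files under a directory, recursively.
# ASSUMPTION: WARCs use the conventional .warc or .warc.gz suffix.
count_warcs() {
  find "$1" -type f \( -name '*.warc' -o -name '*.warc.gz' \) | wc -l | tr -d ' '
}

# Usage (the path is a placeholder, not a real project location):
#   count_warcs /path/to/job-output
```

A non-zero count for a job you still expect to replay is a signal to leave that directory alone.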

What “done” means

  • https://replay.healtharchive.ca/ responds successfully.
  • ./scripts/verify_public_surface.py reports a working replay browseUrl where expected.