HealthArchive Documentation
This documentation portal covers the HealthArchive app monorepo and links to the separate datasets documentation.
Shared VPS inventory, ingress ownership, canonical public hosts, and cross-project operations state live in /home/jer/repos/vps/platform-ops. Use /home/jer/repos/vps/platform-ops/docs/standards/PLAT-009-shared-vps-documentation-boundary.md as the default rule for what belongs in this repo versus shared ops documentation.
Quick Start by Role
Choose your path:
- 👤 Operators: Start with Operations Overview → Operator Responsibilities
- 💻 Developers: Start with Development Guide → Live Testing
- 🔧 Deploying: Start with Production Runbook
- 📊 API consumers: Start with API Documentation
- 📚 Researchers: Start with Project Overview → Datasets
Key Resources
| Need | Documentation |
|---|---|
| Architecture overview | Architecture |
| Production deployment | Production Runbook |
| Local development setup | Dev Setup |
| Incident response | Incident Response |
| Search API | API Documentation |
| Monitoring setup | Monitoring |
Documentation Structure
This docs portal is built from docs/ in the app monorepo. Frontend-specific docs remain canonical under frontend/docs/ and are surfaced here through the docs/frontend/ bridge; datasets docs remain canonical in the separate datasets repo:
- Frontend bridge: frontend/README.md
- Datasets pointers: datasets-external/README.md
Shared VPS facts that are not specific to the backend are canonical in:
- /home/jer/repos/vps/platform-ops
- /home/jer/repos/vps/platform-ops/docs/standards/PLAT-009-shared-vps-documentation-boundary.md
Recommended reading order
- Project docs portal (monorepo + datasets navigation)
  - project.md
- Architecture & implementation (how the code works)
  - architecture.md
- Documentation guidelines (how docs stay sane)
  - documentation-guidelines.md
  - documentation-process-audit.md (audit of doc processes; 2026-01-09)
  - decisions/README.md (decision records for high-stakes choices)
- Local development / live testing (how to run it locally)
  - development/live-testing.md
  - development/dev-environment-setup.md (local setup + local vs VPS guidance)
  - development/testing-guidelines.md (backend test expectations)
- Deployment (how to run it on a server)
  - deployment/production-single-vps.md (current production runbook)
  - deployment/systemd/README.md (systemd units: annual scheduler, crawl monitoring + auto-recovery, baseline drift, replay reconcile + smoke tests, change tracking, annual search verify, coverage guardrails, cleanup automation, worker priority)
  - deployment/replay-service-pywb.md (pywb replay service for full-fidelity browsing)
  - deployment/search-rollout.md (enable v2 search + rollback)
  - deployment/pages-table-rollout.md (pages table backfill + browse fast path)
  - deployment/hosting-and-live-server-to-dos.md (historical hosting notes + optional future staging ideas)
  - deployment/environments-and-configuration.md (frontend/backend env vars + host matrix)
  - deployment/production-rollout-checklist.md (generic production checklist)
  - deployment/staging-rollout-checklist.md (optional future staging)
- Operations (how to keep it healthy)
  - operations/README.md (index of ops docs)
- Roadmaps and implementation plans
  - planning/README.md
  - roadmap-process.md (short pointer)
Notes
- No secrets live in this repo. Any token/password values shown in docs must be placeholders.
- The archive_tool crawler has its own internal documentation at src/archive_tool/docs/documentation.md.