Database Architecture¶
Overview¶
- Engine: PostgreSQL (Neon)
- Migration source:
backend/migrations/ - Core strategy: strict ontology metadata + verifiable provenance + 30-day raw retention + permanent aggregate summaries
Schema Components¶
Ontology enums¶
Created in 001_create_enums.sql:
metric_family_enumstart_event_enumend_event_enumstatistic_type_enumpatient_scope_enumscraper_status_enum
Core operational tables¶
From 002_create_tables.sql:
sources: provincial source metadata + telehealth routing metadatahospitals: facility metadata with verification/visibility gatesmeasurements: raw wait measurements + ontology tags + payload hash/snippetscraper_status: heartbeat and scraper execution statusscraper_alert_state: current per-source incident state for stale/error deduplication and recovery tracking
Analytics and quality extensions¶
measurement_aggregates(006_create_measurement_aggregates.sql)data_quality_snapshots(007_create_data_quality_snapshots.sql)- anomaly flags on
measurements(008_add_anomaly_columns.sql) methodology_change_events(009_create_methodology_change_events.sql)regionsandhospital_regions(010_create_regions_tables.sql)- scraper observability metadata on
scraper_status(013_add_scraper_observability_columns.sql) - public-health-hub tables (
018_create_public_health_hub_tables.sql) - public-health-hub alert-state tracking (
019_add_public_health_source_alert_state.sql)
Scraper observability fields (M30)¶
scraper_status now persists both last-known-good and structured failure state:
- last success:
last_success_run,last_success_measurements_count - last error:
last_error_run,last_error_category,last_error_stage,error_message - reliability state:
consecutive_failures,last_run_duration_ms
These fields support deterministic triage in check_heartbeat --verbose, workflow alerts, and /api/health.
Alert state tracking¶
scraper_alert_state stores at most one active operational incident per source:
active_incident_kind(staleorerror)active_incident_fingerprintopened_atlast_notified_atlast_resolved_at
This lets the heartbeat monitor send one incident alert when a source becomes unhealthy and one recovery notice when it returns to healthy, instead of repeating the same alert every run.
Public health hub tables¶
The Ontario-first /resources module now uses five dedicated database tables:
public_data_sources: source metadata, provenance, legal posture, and last refresh timestampsresource_locations: normalizedfacilityandaedrows for public search/map usepublic_health_alerts: normalized Health Canada recall and safety alert itemspublic_health_system_metrics: analytics-only Ontario EMS system-context rows for/resources/system-contextpublic_health_source_alert_state: incident deduplication state for hard-fail public-health ingest sources
These tables are operationally separate from the wait-time observatory schema. AQHI and the ISC drinking-water advisory layer remain live-proxied and are intentionally not persisted in the database. The source catalog metadata exposed by /resources, /api/resources/alerts, and /api/resources/system-context is DB-backed; the live-proxied /api/resources/aqhi and /api/resources/water-advisories routes ship the equivalent source metadata contract directly from the route layer.
Data Model Rules¶
- Do not normalize incompatible methodologies into a single synthetic metric.
- Use ontology tags to drive comparability logic.
- Keep verification gate: new hospitals are not auto-published.
- Store payload hash/snippet, not full source HTML.
- Delete raw measurements older than 30 days via bounded cleanup runs; trigger aggregate refresh separately when needed.
Migration and Bootstrap¶
Primary commands:
bootstrap_analytics applies migrations (idempotent), seeds regional mappings, and backfills aggregates.
Query Patterns¶
- Map and list views pull latest measurement per hospital.
- Trends and benchmarks prefer aggregate tables (
measurement_aggregates) for daily/weekly/monthly views. - Intra-day patterns and bounded hourly exports derive from raw
measurementsfor fidelity. - Data quality and anomalies rely on quality snapshot + anomaly flags.
- Region analytics depend on
regions+hospital_regionsmapping coverage.
References¶
- API architecture:
docs/architecture/api.md - Public API contracts:
docs/API.md - Roadmap status:
docs/planning/roadmap.md