Scraper Scheduling & Operations¶
Last Updated: 2026-03-21 · Status: ✅ All 4 provincial scrapers operational

Live scheduler status (updated March 13, 2026): scraper cadence is hourly on GitHub Actions, with a heartbeat stale threshold of 120 minutes. The VPS backend path remains deferred because the Ontario source times out from that host.

Migration note (March 13, 2026): this document still describes the current live GitHub Actions scheduler path. A same-host VPS worker attempt was paused after the Ontario source timed out from that host; see docs/operations/direct-vps-backend.md.

Reliability addendum (March 21, 2026): the live GitHub Actions Ontario scraper path now retries a read timeout once with an extended HTTP read timeout before surfacing a fetch failure. This hardened repeated upstream_unavailable/fetch incidents without changing the VPS backend deferment.
Overview¶
Wait Time Canada operates 4 provincial emergency department wait time scrapers running on GitHub Actions. This document describes the scheduling, monitoring, and operational procedures.
Active Scrapers¶
| Province | Source ID | Status | Schedule | Last Verified |
|---|---|---|---|---|
| Quebec | quebec-msss | ✅ Active | Hourly | 2026-03-13 |
| Ontario | ontario-health | ✅ Active | Hourly | 2026-03-13 |
| Alberta | alberta-ahs | ✅ Active | Hourly | 2026-03-13 |
| British Columbia | bc-phsa | ✅ Active | Hourly | 2026-03-13 |
Total Coverage: 390+ hospitals across 4 provinces
GitHub Actions Workflows¶
1. Scraper Cron (scraper-cron.yml)¶
Purpose: Run all scrapers on schedule
Schedule: 0 * * * * (hourly)
Runtime: ~8-12 minutes (all 4 scrapers)
Timeout: 20 minutes
Execution:
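The workflow's run step isn't reproduced on this page; below is a minimal sketch of its shape, assuming the CLI entry point from the References section and a hypothetical run subcommand:

# Hedged sketch: subcommand name and invocation style are assumptions,
# not the verbatim workflow step. The module path comes from the
# References section (backend/src/waittime/cli/scraper.py).
export DATABASE_URL="${DATABASE_URL:?set via GitHub Actions secret}"

# Install the browser needed by the Alberta (Playwright) scraper.
playwright install chromium

# Run every registered scraper; the job succeeds if ANY data is collected.
python -m waittime.cli.scraper run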
Features:
- ✅ Runs all registered scrapers automatically
- ✅ Playwright browsers installed for Alberta runtime requirements
- ✅ Failure alerting via Pushover
- ✅ Tolerates individual scraper failures (succeeds if ANY data collected)
- ✅ Database connection via DATABASE_URL secret
- ✅ Sentry error tracking configured
Manual Trigger: Available via GitHub Actions UI (workflow_dispatch)
2. Heartbeat Monitor (heartbeat-monitor.yml)¶
Purpose: Dead man's switch - verify scrapers are running
Schedule: */30 * * * * (every 30 minutes)
Max Heartbeat Age: 120 minutes
Execution:
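Again a sketch rather than the verbatim step, assuming the check_heartbeat CLI from the References section (--max-age is attested later in this document):

# Hedged sketch: the module invocation style is an assumption;
# --max-age is the staleness threshold flag referenced elsewhere here.
python -m waittime.cli.check_heartbeat --max-age 120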
Features:
- ✅ Dynamically discovers all sources from database
- ✅ Checks scraper_status table for last run timestamp
- ✅ Alerts via Pushover if heartbeat > 120 minutes old
- ✅ Alerts include failure classification (category/stage) for error states
- ✅ Alerts if no heartbeat ever recorded for a source
- ✅ Sends alerts only on incident state changes, with a single recovery notice when healthy again

Alert Conditions:
- ⚠️ Heartbeat older than 120 minutes
- 🚨 No heartbeat found for source
Scraper Details¶
Quebec (MSSS)¶
- Methodology: REGISTRATION → PHYSICIAN (ROLLING_AVG)
- Technology: BeautifulSoup (HTML parsing)
- Coverage: 120+ hospitals
- Update Frequency: Hourly
- Special Features: ✅ Stretcher occupancy data (M17/M18)
- Data Quality: 86% test coverage
Ontario (Health Quality Ontario)¶
- Methodology: TRIAGE → PHYSICIAN (MEAN)
- Technology: Direct HTTP fetch + HTML table parsing
- Coverage: 220+ hospitals
- Update Frequency: Hourly
- Reliability Hardening: a read timeout is retried once with an extended HTTP read timeout before the run reports a fetch failure
Alberta (AHS)¶
- Methodology: TRIAGE → PHYSICIAN (POINT_ESTIMATE)
- Technology: Playwright (JavaScript rendering required)
- Coverage: 26 hospitals
- Update Frequency: Hourly
- Browser: Chromium (installed in GitHub Actions)
British Columbia (PHSA)¶
- Methodology: TRIAGE → PHYSICIAN (P90)
- Technology: BeautifulSoup + JSON extraction (__NEXT_DATA__)
- Coverage: 25 hospitals
- Update Frequency: Hourly
- URL: https://edwaittimes.ca/legacy
Database Schema¶
Sources Table¶
All scrapers reference entries in the sources table:
SELECT id, name, province FROM sources WHERE id IN (
'quebec-msss', 'ontario-health', 'alberta-ahs', 'bc-phsa'
);
Seeded via: migrations/004_seed_sources.sql, then corrected to the current canonical source definitions by migrations/020_sync_active_source_definitions.sql
Scraper Status Table¶
Heartbeat tracking in scraper_status:
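A hedged example of inspecting it (the column names are assumptions, inferred from the status and failure-classification fields described elsewhere in this document):

-- Hedged sketch: column names are assumptions inferred from the
-- alerting fields mentioned in this document (status, failure
-- classification, consecutive failures).
SELECT source_id, status, last_run_utc,
       failure_category, failure_stage, consecutive_failures
FROM scraper_status
ORDER BY source_id;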
Updated by: Each scraper run (success or failure)
Monitored by: heartbeat-monitor.yml workflow
Scraper Alert State Table¶
Incident deduplication state is tracked separately in scraper_alert_state:
SELECT source_id, active_incident_kind, opened_at, last_resolved_at
FROM scraper_alert_state
ORDER BY source_id;
Updated by: check_heartbeat when incident state changes
Purpose: Suppress duplicate stale/error notifications until the incident actually changes or resolves
CLI Commands¶
Run All Scrapers¶
Run Single Scraper¶
List Available Scrapers¶
Check Heartbeat Health¶
Check Detailed Operational Status (last-known-good + last-error)¶
Dry Run (No Database Writes)¶
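The command listings for the headings above were not captured on this page; below is a minimal consolidated sketch, assuming the waittime.cli module paths from the References section. Only --dry-run and --max-age are attested elsewhere in this document; the subcommand names and other flags are hypothetical.

# Hedged sketches only: subcommand names are assumptions, not the
# verbatim CLI. --dry-run and --max-age appear elsewhere in this
# document; everything else is hypothetical.

# Run all scrapers
python -m waittime.cli.scraper run

# Run a single scraper (hypothetical --source flag)
python -m waittime.cli.scraper run --source quebec-msss

# List available scrapers (hypothetical subcommand)
python -m waittime.cli.scraper list

# Check heartbeat health
python -m waittime.cli.check_heartbeat --max-age 120

# Detailed operational status: last-known-good + last-error (hypothetical flag)
python -m waittime.cli.check_heartbeat --detailed

# Dry run (no database writes)
python -m waittime.cli.scraper run --dry-run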
Alerting¶
Pushover Configuration¶
Secrets Required:
- PUSHOVER_USER_KEY - Your Pushover user key
- PUSHOVER_API_TOKEN - Your Pushover API token
Alert Types:

1. Scraper Failure (scraper-cron.yml)
   - Title: 🚨 Scraper Error: <source-id>
   - Trigger: Source has status error and consecutive failures >= threshold
   - Payload: Includes failure_category/failure_stage classification
   - Priority: 1 (High)

2. Stale Heartbeat (heartbeat-monitor.yml)
   - Title: ⚠️ Scraper Stale
   - Trigger: No heartbeat in last 120 minutes
   - Priority: 1 (High)

3. Recovery (heartbeat-monitor.yml)
   - Title: ✅ Scraper Recovered: <source-id>
   - Trigger: Source returns to healthy after an active stale/error incident
   - Priority: 0 (Normal)
Deduplication behavior:
- One incident alert when a source first becomes stale or error
- No repeated alerts while the same incident fingerprint remains active
- One recovery alert when the source returns to healthy
Manual Alert Test:
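The original test command isn't shown here; one hedged way to verify the Pushover secrets end-to-end is to post directly to Pushover's public message API (this may differ from the project's own test path):

# Hedged sketch: sends a test notification straight to Pushover's
# public API; not necessarily the project's own test command.
curl -s \
  --form-string "token=$PUSHOVER_API_TOKEN" \
  --form-string "user=$PUSHOVER_USER_KEY" \
  --form-string "message=Manual alert test from scraper ops runbook" \
  https://api.pushover.net/1/messages.json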
Monitoring Dashboard¶
GitHub Actions¶
View workflow runs in the repository's Actions tab.
Key Metrics¶
- Scraper Success Rate: Check scraper-cron workflow runs
- Data Freshness: Query MAX(timestamp_utc) from measurements per source
- Error Rate: Count failures in scraper_status table
SQL Queries¶
Data Freshness per Province:
SELECT
s.id,
s.province,
MAX(m.timestamp_utc) AS last_measurement,
EXTRACT(EPOCH FROM (NOW() - MAX(m.timestamp_utc)))/60 AS minutes_ago
FROM sources s
LEFT JOIN measurements m ON m.source_id = s.id
WHERE s.id IN ('quebec-msss', 'ontario-health', 'alberta-ahs', 'bc-phsa')
GROUP BY s.id, s.province
ORDER BY minutes_ago ASC;
Measurements per Source (Last 24h):
SELECT
source_id,
COUNT(*) as measurement_count,
COUNT(DISTINCT hospital_id) as hospital_count
FROM measurements
WHERE timestamp_utc > NOW() - INTERVAL '24 hours'
GROUP BY source_id
ORDER BY source_id;
Troubleshooting¶
Scraper Failing in GitHub Actions¶
1. Check Workflow Logs:
   - Go to Actions → Scraper Cron Job → Latest Run
   - Review step-by-step output

2. Common Issues:
   - Database Connection: Verify DATABASE_URL secret is set
   - Playwright Timeout: Alberta may time out if the page renders slowly
   - HTTP Read Timeout: Ontario may still fail if the upstream remains slow even after the extended fallback timeout
   - HTML Structure Changed: Provincial websites may update their HTML

3. Test Locally: see the hedged sketch below.
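A sketch of a local test run, reusing the assumed CLI shape from the CLI Commands section (the subcommand and --source flag are assumptions; --dry-run is attested in the deployment checklist):

# Hedged sketch: reuses the assumed CLI shape from the CLI Commands
# section; only --dry-run is attested elsewhere in this document.
export DATABASE_URL="postgres://localhost/waittime_dev"  # hypothetical dev DSN
python -m waittime.cli.scraper run --source ontario-health --dry-run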
No Heartbeat Alert¶
1. Check scraper_status Table: see the hedged query below.

2. Verify Source Exists: confirm the source id appears in the sources query under Database Schema (also covered below).

3. Check GitHub Actions Runs:
   - Ensure scraper-cron is running on the current hourly cadence
   - Check for workflow errors
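Hedged example queries for steps 1 and 2 (the scraper_status column names are assumptions; the sources query mirrors the one under Database Schema):

-- Step 1 (hedged sketch: scraper_status column names are assumptions)
SELECT source_id, status, last_run_utc
FROM scraper_status
WHERE source_id = 'ontario-health';

-- Step 2: verify the source exists (mirrors the Database Schema query)
SELECT id, name, province FROM sources WHERE id = 'ontario-health';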
Low Measurement Count¶
1. Verify Hospital Visibility: see the hedged query below.

2. Check Recent Errors: review scraper_status (and the scraper-cron workflow logs) for recent failure classifications.
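A hedged example for step 1, using only the measurements columns that appear elsewhere in this document:

-- Hedged sketch: per-source count of hospitals that reported in the
-- last 2 hours, using columns shown in the queries above.
SELECT source_id, COUNT(DISTINCT hospital_id) AS hospitals_reporting
FROM measurements
WHERE timestamp_utc > NOW() - INTERVAL '2 hours'
GROUP BY source_id
ORDER BY source_id;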
Deployment Checklist¶
When adding a new scraper:
- Implement scraper class extending BaseScraper
- Add to SCRAPERS registry in scraper.py
- Create source factory function
- Add source to migrations/004_seed_sources.sql
- Run migration or insert source manually
- Write unit tests (minimum 10 tests)
- Document methodology in docs/methodologies/
- Seed hospital data in backend/seed_data/hospitals/<province>.json
- Test locally with --dry-run
- Verify in GitHub Actions (manual trigger)
- Monitor heartbeat for 24 hours
Performance Targets¶
| Metric | Target | Current |
|---|---|---|
| Scraper Run Frequency | Hourly | ✅ Configured |
| Max Scraper Runtime | < 15 min | ✅ ~8-12 min |
| Heartbeat Check Frequency | Every 30 min | ✅ Configured |
| Max Heartbeat Age | < 120 min | ✅ Monitored |
| Scraper Success Rate | > 95% | ✅ Tolerant design |
| Data Freshness | < 120 min | ✅ Hourly scheduler path |
Cost Analysis¶
GitHub Actions Minutes¶
- Scraper Cron: 12 min × 24 runs/day = 288 min/day = ~8,640 min/month
- Heartbeat Monitor: 2 min × 48 runs/day = 96 min/day = ~2,880 min/month
- Total: ~11,520 min/month

Free Tier: 2,000 minutes/month
Status: ⚠️ Exceeds free tier by ~9,520 minutes/month

Cost Estimate: $0.008/min × 9,520 min ≈ $76.16/month

Optimization Options:
1. Reduce scraper frequency further (e.g., every 2 hours roughly halves scraper-cron minutes)
2. Use self-hosted runner (free, but requires infrastructure)
3. Optimize scraper runtime (currently ~10 min average)
Neon Public Transfer Guardrails¶
If Neon sends a public transfer warning (for example 80% usage), apply this runbook immediately:
- Confirm write volume is within expected range (see the query after this list).
- If cost pressure returns, adjust cadence and threshold together:
  - scraper-cron.yml: 0 * * * * -> slower cadence
  - heartbeat-monitor.yml: increase --max-age to preserve sane alerting
- Confirm connection reuse is active in DatabaseService (constructor accepts conn).
- Keep read-heavy API routes cached at a 5-10 minute shared TTL, and no-store only for user-specific/export routes.
- Confirm the live VPS frontend is on a release that includes frontend/utils/server-cache.ts guardrails before changing scraper cadence or database policy.
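A hedged example for the write-volume check, using only the measurements columns shown elsewhere in this document:

-- Hedged sketch: daily write volume per source over the last 7 days.
SELECT source_id,
       DATE(timestamp_utc) AS day,
       COUNT(*) AS rows_written
FROM measurements
WHERE timestamp_utc > NOW() - INTERVAL '7 days'
GROUP BY source_id, DATE(timestamp_utc)
ORDER BY day DESC, source_id;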
Notes:
- The scraper anomaly pipeline now computes baseline stats in SQL (count/mean/stddev/quartiles) to reduce transfer from Neon to scraper workers.
- The shared VPS frontend now serves repeated anonymous reads for key public routes from a short-lived in-process cache before re-querying Neon.
- Production is now on Neon Launch; use docs/operations/neon-production-upgrade.md for the recorded billing posture and post-upgrade monitoring guidance, and do not treat the old free tier as a viable steady-state production target just because these guardrails exist.
Future Enhancements¶
Planned¶
- Add Nova Scotia scraper
- Add New Brunswick scraper
- Implement smart scheduling (skip night hours for some provinces)
- Add Prometheus/Grafana monitoring
- Implement scraper performance metrics dashboard
Deferred¶
- Manitoba scraper (data source unclear)
- Saskatchewan scraper (no public data available)
References¶
- Scraper CLI: backend/src/waittime/cli/scraper.py
- Heartbeat monitor CLI: backend/src/waittime/cli/check_heartbeat.py
- Workflow catalog: .github/workflows/README.md
- Provincial methodologies: backend/docs/methodologies/