2. Implement Strict Metric Ontology System

Date: 2024-12-26

Status: Accepted

Deciders: Project Team

Technical Story: Core architectural decision for data model

Context and Problem Statement

Canadian provinces report emergency room wait times using different methodologies:

  • Alberta: Triage-to-Physician (90th percentile)
  • Quebec: Registration-to-Physician (rolling average)
  • Manitoba: Algorithmic estimate (proprietary calculation)

Problem: Users will naturally assume these values are directly comparable and make inappropriate comparisons between provinces, leading to misinterpretation and potentially poor healthcare decisions.

How do we store heterogeneous wait time data in a way that enables transparency about comparability without attempting to normalize incomparable metrics?

Decision Drivers

  • Data Integrity - Never misrepresent or artificially normalize provincial data
  • Scientific Rigor - Enable researchers to understand exact methodology
  • User Safety - Prevent misinterpretation that could lead to poor healthcare decisions
  • Portfolio Value - Demonstrate understanding of research methodology ("Scholar" narrative)
  • Maintainability - Clear schema that prevents future drift

Considered Options

  1. Strict Metric Ontology - Tag every measurement with 4-field ontology (metric_family, start_event, end_event, statistic_type)
  2. Normalization Approach - Convert all measurements to standardized metric
  3. Free-Text Methodology - Store methodology as unstructured text
  4. Province-Specific Tables - Separate tables per province

Decision Outcome

Chosen option: "Strict Metric Ontology", because:

  • Only option that maintains data integrity while enabling comparability analysis
  • Provides the scientific rigor required for research use
  • Enables automatic generation of "divergence briefs" when data is incomparable
  • Demonstrates deep understanding of research methodology (key portfolio value)
  • Scales to new provinces without schema changes

Positive Consequences

  • Automatic Comparability Detection - Can programmatically determine if two measurements are comparable
  • Divergence Brief Generation - System automatically explains why data is incomparable
  • Database-Level Enforcement - PostgreSQL enums prevent invalid ontology values
  • Research-Grade Data - Researchers can query by exact methodology
  • Future-Proof - Adding new provinces/metrics doesn't require schema changes
  • Portfolio Differentiation - Showcases "Physician-Innovator" competency

Negative Consequences

  • Initial Complexity - Requires understanding of each province's methodology
  • Cannot Average Across Provinces - No "national average" wait time possible
  • Scraper Complexity - Each scraper must correctly tag ontology fields
  • UI Complexity - Must show methodology warnings prominently

Pros and Cons of the Options

Strict Metric Ontology

Example schema:

CREATE TYPE metric_family AS ENUM ('TIME_TO_PROVIDER', 'TOTAL_LOS', 'STRETCHER_OCCUPANCY');
CREATE TYPE start_event AS ENUM ('TRIAGE', 'REGISTRATION', 'DOOR', 'UNKNOWN');
CREATE TYPE end_event AS ENUM ('PHYSICIAN', 'PROVIDER', 'DISCHARGE', 'FIRST_ASSESSMENT');
CREATE TYPE statistic_type AS ENUM ('P90', 'MEDIAN', 'MEAN', 'ROLLING_AVG', 'ALGORITHMIC', 'POINT_ESTIMATE');

CREATE TABLE measurements (
    metric_family  metric_family  NOT NULL,
    start_event    start_event    NOT NULL,
    end_event      end_event      NOT NULL,
    statistic_type statistic_type NOT NULL
);

Comparability logic:

def is_comparable(a, b):
    """Two measurements are comparable only if every ontology field matches."""
    return (
        a.metric_family == b.metric_family
        and a.start_event == b.start_event
        and a.end_event == b.end_event
        and a.statistic_type == b.statistic_type
    )

  • Good, because maintains exact provincial methodology
  • Good, because enables automatic comparability detection
  • Good, because database-enforced via enums (no drift)
  • Good, because demonstrates research methodology understanding
  • Bad, because cannot provide "average Canadian wait time"
  • Bad, because each scraper must tag correctly

Normalization Approach

Convert all measurements to a standardized metric (e.g., "Triage-to-Physician P90")

  • Good, because enables direct cross-province comparison
  • Good, because simpler UI (just show wait time)
  • Good, because can calculate national averages
  • Bad, because scientifically invalid (cannot convert Registration→Triage accurately)
  • Bad, because loses original methodology information
  • Bad, because creates false precision
  • Bad, because misrepresents provincial data

Free-Text Methodology

Store methodology as an unstructured text field

  • Good, because flexible (any methodology)
  • Good, because preserves original description
  • Bad, because cannot programmatically determine comparability
  • Bad, because no enforcement (can drift over time)
  • Bad, because difficult to query/filter
  • Bad, because requires manual interpretation

Province-Specific Tables

Separate tables: measurements_ab, measurements_qc, etc.

  • Good, because each province can have custom schema
  • Good, because no cross-province confusion
  • Bad, because cannot query across provinces
  • Bad, because schema changes required for each new province
  • Bad, because violates DRY principle
  • Bad, because complex cross-province features

Links

  • [Related to] Architecture Database Guide - PostgreSQL enums chosen for ontology enforcement
  • [Refines] Strategic plan "Comparability Boolean" section

Additional Information

Ontology Field Definitions

metric_family - What is being measured?

  • TIME_TO_PROVIDER: Wait to see doctor/nurse
  • TOTAL_LOS: Total length of stay in ED
  • STRETCHER_OCCUPANCY: Current ED occupancy rate

start_event - When does the clock start?

  • TRIAGE: After initial triage assessment
  • REGISTRATION: After administrative check-in
  • DOOR: Upon physical arrival at ED
  • UNKNOWN: Source doesn't specify

end_event - When does the clock stop?

  • PHYSICIAN: Physician initial assessment
  • PROVIDER: Any provider (doctor, NP, PA)
  • DISCHARGE: Patient leaves ED
  • FIRST_ASSESSMENT: First clinical contact (any role)

statistic_type - How is the value calculated?

  • P90: 90th percentile (CIHI standard)
  • MEDIAN: 50th percentile
  • MEAN: Average
  • ROLLING_AVG: Moving average (window unspecified)
  • ALGORITHMIC: Proprietary calculation
  • POINT_ESTIMATE: Current real-time value
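
The gap between statistic types is easy to see on sample data. A minimal illustration using Python's standard library, with hypothetical wait times in minutes (the dataset is invented for demonstration; `statistics.quantiles` with `n=10` returns the nine decile cut points, the last of which is the 90th percentile):

```python
import statistics

# Hypothetical ED wait times in minutes for one reporting period.
waits = [40, 55, 60, 70, 80, 95, 110, 150, 200, 380]

# P90: the time within which 90% of patients were seen.
p90 = statistics.quantiles(waits, n=10)[-1]

# MEAN: the arithmetic average, pulled upward by the long tail.
mean = statistics.mean(waits)

print(p90)   # 362.0
print(mean)  # 124
```

The same underlying visits produce very different headline numbers, which is why statistic_type must match before two measurements are treated as comparable.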

Implementation Notes

  1. Use PostgreSQL enums for strict enforcement
  2. Create CHECK constraints as fallback
  3. Pydantic models validate on insert
  4. Frontend shows methodology warnings when comparing incomparable data
  5. Auto-researcher generates "divergence briefs" for incomparable pairs

Real-World Example

Alberta (AHS):

{
    "metric_family": "TIME_TO_PROVIDER",
    "start_event": "TRIAGE",
    "end_event": "PHYSICIAN",
    "statistic_type": "P90"
}

Quebec (MSSS):

{
    "metric_family": "TIME_TO_PROVIDER",
    "start_event": "REGISTRATION",
    "end_event": "PHYSICIAN",
    "statistic_type": "MEAN"
}

Comparability Result: False (different start_event AND statistic_type)
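
Applying the field-by-field rule to these two records makes the result mechanical. A minimal sketch using plain dicts (the records are copied from the JSON above; the mismatched fields are exactly the input a divergence brief needs):

```python
alberta = {
    "metric_family": "TIME_TO_PROVIDER",
    "start_event": "TRIAGE",
    "end_event": "PHYSICIAN",
    "statistic_type": "P90",
}

quebec = {
    "metric_family": "TIME_TO_PROVIDER",
    "start_event": "REGISTRATION",
    "end_event": "PHYSICIAN",
    "statistic_type": "MEAN",
}

# Comparable only if all four ontology fields match.
divergent = sorted(k for k in alberta if alberta[k] != quebec[k])
comparable = not divergent

print(comparable)  # False
print(divergent)   # ['start_event', 'statistic_type']
```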

Divergence Brief:

⚠️ Methodology Divergence: Alberta reports 90th percentile Triage-to-Physician time, meaning 90% of patients are seen within the reported time. Quebec reports average Registration-to-Physician time, which starts the clock later and uses a different statistical measure. Direct comparison is scientifically invalid.

References

  • CIHI Emergency Department Performance Measures: https://www.cihi.ca/en/indicators/
  • Alberta AHS Methodology: https://www.albertahealthservices.ca/waittimes/waittimes.aspx
  • Quebec MSSS Methodology: https://www.quebec.ca/sante/urgences