Vertiqa Help

Calibration dashboard

How well a goal's confidence scores actually match its hit rate — reliability diagram, ECE, and the cold-start gate.

Calibration measures whether a goal's confidence scores can be trusted. A well-calibrated goal that says "85% confident" should be right about 85% of the time. The calibration page gives admins the data to know where each goal stands.

What you'll see

  • Cold-start gate — every goal moves through three states, and calibration metrics are only meaningful in the last:
    • Collecting — not enough decisions yet.
    • Warming — some signal, but unstable.
    • Calibrated — enough decisions for the metrics to be reliable.
  • Reliability diagram — confidence buckets (10 of them) plotted against actual hit rate. A perfectly calibrated goal traces the diagonal reference line.
  • Expected Calibration Error (ECE) — single-number summary of how far the curve sits from the diagonal. Lower is better.
  • Precision / recall table — classic accuracy metrics, broken down by routing zone.
  • ECE trend — sparkline over 7d / 14d / 30d so you can see if the goal is getting better or drifting.
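
As a rough illustration of how the reliability diagram and ECE relate, here is a minimal Python sketch. It is not Vertiqa's implementation: it assumes 10 equal-width confidence buckets and the standard ECE definition (the average gap between mean confidence and hit rate, weighted by bucket size). The function names and the (confidence, was_correct) input shape are hypothetical.

```python
from typing import List, Tuple

# Each decision is (confidence in [0, 1], whether the decision was a hit).
# This input shape is an assumption made for the example.
Decision = Tuple[float, bool]


def reliability_buckets(decisions: List[Decision], n_buckets: int = 10):
    """Group decisions into equal-width confidence buckets.

    Returns one (mean_confidence, hit_rate, count) tuple per non-empty bucket.
    These are the points a reliability diagram plots against the diagonal.
    """
    conf_sum = [0.0] * n_buckets
    hits = [0] * n_buckets
    counts = [0] * n_buckets
    for confidence, correct in decisions:
        # A confidence of exactly 1.0 goes into the top bucket.
        idx = min(int(confidence * n_buckets), n_buckets - 1)
        conf_sum[idx] += confidence
        hits[idx] += int(correct)
        counts[idx] += 1
    return [
        (conf_sum[i] / counts[i], hits[i] / counts[i], counts[i])
        for i in range(n_buckets)
        if counts[i] > 0
    ]


def expected_calibration_error(decisions: List[Decision], n_buckets: int = 10) -> float:
    """Bucket-size-weighted average of |hit rate - mean confidence|."""
    total = len(decisions)
    if total == 0:
        raise ValueError("need at least one decision")
    return sum(
        (count / total) * abs(hit_rate - mean_conf)
        for mean_conf, hit_rate, count in reliability_buckets(decisions, n_buckets)
    )


# 10 decisions at 0.95 confidence (9 hits) and 10 at 0.55 confidence (5 hits).
sample = [(0.95, True)] * 9 + [(0.95, False)] + [(0.55, True)] * 5 + [(0.55, False)] * 5
print(round(expected_calibration_error(sample), 3))  # 0.05
```

In this sample both buckets sit 0.05 below the diagonal, so the ECE is 0.05. An ECE of 0 would mean every bucket lands exactly on the diagonal.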

What you can do here

  • Filter to view org-wide calibration or per-goal calibration.

How to read it

  • A calibrated goal sitting tight to the diagonal: trust the routing thresholds. You can safely loosen Autonomous if precision is high.
  • Confidence over-stated (curve below the diagonal): the goal reports higher confidence than its hit rate supports. Tighten Advisory.
  • Confidence under-stated (curve above the diagonal): the goal is right more often than its scores suggest, so you're over-reviewing. Loosen Advisory.
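
The three readings above can be approximated in code by looking at the signed gap between hit rate and mean confidence. The sketch below is an illustration only: the function name, the bucket tuple shape, and the 0.02 tolerance are assumptions, not product settings. It also collapses the curve into a single weighted average, so a curve that crosses the diagonal can cancel out; the diagram itself remains the better guide.

```python
from typing import List, Tuple

# (mean_confidence, hit_rate, count) for each bucket of the reliability diagram.
Bucket = Tuple[float, float, int]


def read_curve(buckets: List[Bucket], tolerance: float = 0.02) -> str:
    """Classify a reliability curve by its weighted signed gap.

    tolerance is a hypothetical cutoff for "tight to the diagonal".
    """
    total = sum(count for _, _, count in buckets)
    if total == 0:
        raise ValueError("need at least one decision")
    # Negative gap: hit rate is below confidence, so the curve sits under the diagonal.
    gap = sum((count / total) * (hit_rate - mean_conf)
              for mean_conf, hit_rate, count in buckets)
    if gap < -tolerance:
        return "over-stated: tighten Advisory"
    if gap > tolerance:
        return "under-stated: loosen Advisory"
    return "on the diagonal: trust the routing thresholds"


print(read_curve([(0.9, 0.7, 50), (0.6, 0.5, 50)]))  # over-stated: tighten Advisory
```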

Tip

Don't tune thresholds for a goal that is still in Collecting or Warming. The numbers are noisy until the gate flips to Calibrated.
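
If you script threshold changes, the same rule can be enforced with a guard like the one below. This is a sketch under stated assumptions: the decision counts that move a goal between states are not documented on this page, so the two thresholds here are placeholders, and the real gate may also consider how stable the signal is rather than volume alone.

```python
# Placeholder thresholds for illustration only; the product's actual
# cutoffs are not stated in this article.
WARMING_MIN_DECISIONS = 50
CALIBRATED_MIN_DECISIONS = 500


def gate_state(n_decisions: int) -> str:
    """Map a goal's decision count to a cold-start gate state."""
    if n_decisions >= CALIBRATED_MIN_DECISIONS:
        return "calibrated"
    if n_decisions >= WARMING_MIN_DECISIONS:
        return "warming"
    return "collecting"


def safe_to_tune(n_decisions: int) -> bool:
    """Only allow threshold tuning once the gate reports calibrated."""
    return gate_state(n_decisions) == "calibrated"


for n in (10, 120, 900):
    print(n, gate_state(n), safe_to_tune(n))
```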
