On-Call Reporting

On-Call provides operational metrics that help service managers understand incident volume, response performance, escalation patterns, and on-call burden distribution.

Dashboard Metrics

The Dashboard (/) shows real-time metrics:

Metric	Description
Active Incidents	Count of incidents in `triggered` or `acknowledged` state right now
Triggered	Incidents not yet acknowledged — actively escalating
Acknowledged	Incidents being actively worked
MTTA	Mean Time To Acknowledge — average minutes from incident creation to first acknowledgment (rolling window)
MTTR	Mean Time To Resolution — average minutes from incident creation to resolution (rolling window)
On-Call Now	Current on-call users across all active schedules with shift end times
Recent Incidents	Last 10 incidents with status, severity, and service

The dashboard updates automatically; no page reload is needed.

Reports Page

The Reports page (/reports) shows 30-day aggregate analytics:

Summary Metrics

Metric	Description
Total Incidents	All incidents created in the last 30 days
Resolution Rate	Percentage of incidents that reached `resolved` status
MTTA	30-day mean time to acknowledge (minutes)
MTTR	30-day mean time to resolution (minutes)

Incidents by Severity

Bar chart showing incident count per severity level over the 30-day window:

Critical (red)
High (orange)
Medium (yellow)
Low (blue)
Info (gray)

Use this to understand your alert mix and whether your escalation policies are calibrated appropriately for the volume at each severity level.

Incidents by Service

Summary table showing the incident count per service. Use it to identify which services generate the most pages, which helps when planning routing changes, alert threshold tuning, or staffing adjustments.

Reports API

For custom reporting and data export, use the Reports API:

Summary Report

GET /api/reports?report=summary&days=30

Response includes:

{
  "total_incidents": 47,
  "resolved_count": 42,
  "resolution_rate": 89.36,
  "mtta_minutes": 8.3,
  "mttr_minutes": 34.7,
  "by_severity": {
    "critical": 5,
    "high": 18,
    "medium": 20,
    "low": 4
  },
  "by_service": {
    "Tier 1 Alerts": 28,
    "Defend Critical": 12,
    "Backup Failures": 7
  }
}
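As a rough sketch of how the summary endpoint might be consumed from a script, the Python below builds the documented query string and recomputes resolution_rate from resolved_count and total_incidents. The base URL and the way authentication headers are supplied are assumptions, not part of the documented API. The sample values are taken from the example response above.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical host; replace with your own On-Call URL.
BASE_URL = "https://oncall.example.com"


def summary_url(days: int = 30) -> str:
    """Build the summary report URL from the documented query parameters."""
    return f"{BASE_URL}/api/reports?" + urlencode({"report": "summary", "days": days})


def fetch_summary(days: int = 30, headers: Optional[dict] = None) -> dict:
    """Fetch the summary report.

    Authentication is not described in this section, so any required
    headers are passed in by the caller (an assumption).
    """
    request = Request(summary_url(days), headers=headers or {})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def resolution_rate(summary: dict) -> float:
    """Recompute the resolution rate as a percentage, rounded to 2 places."""
    total = summary["total_incidents"]
    if total == 0:
        return 0.0
    return round(100 * summary["resolved_count"] / total, 2)


# Values copied from the example response above.
sample = {"total_incidents": 47, "resolved_count": 42, "resolution_rate": 89.36}
print(resolution_rate(sample))  # 89.36
```

Recomputing the rate locally is a cheap sanity check when the numbers feed an external report.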

Timeline Report

GET /api/reports?report=timeline&days=30

Returns daily incident counts for the requested window — useful for building trend charts in external dashboards.
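For trend charts, daily counts are often smoothed before plotting. The sketch below computes a trailing moving average. The response shape (a list of objects with date and count keys) is an assumption made for illustration, as are the sample dates; check the actual payload before relying on these names.

```python
from collections import deque


def moving_average(daily_counts, window=7):
    """Trailing moving average; early days average over the days seen so far."""
    recent = deque(maxlen=window)
    averages = []
    for count in daily_counts:
        recent.append(count)
        averages.append(sum(recent) / len(recent))
    return averages


# Assumed response shape, not confirmed by the API description above.
timeline = [
    {"date": "2024-01-01", "count": 2},
    {"date": "2024-01-02", "count": 4},
    {"date": "2024-01-03", "count": 0},
    {"date": "2024-01-04", "count": 6},
]

counts = [day["count"] for day in timeline]
print(moving_average(counts, window=2))  # [2.0, 3.0, 2.0, 3.0]
```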

Incident List (Default)

GET /api/reports

Returns the full incident list for the last 30 days with all fields, suitable for export to CSV or BI tools.
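A minimal sketch of the CSV export step, assuming the endpoint returns a JSON array of flat incident objects. The id field in the sample is hypothetical; severity and acknowledged_by are mentioned elsewhere on this page. Nested fields, if any, would need flattening first.

```python
import csv
import io


def incidents_to_csv(incidents):
    """Serialize incidents to CSV text, using the union of all keys as columns."""
    columns = []
    for incident in incidents:
        for key in incident:
            if key not in columns:
                columns.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(incidents)  # missing keys are written as empty cells
    return buffer.getvalue()


# Illustrative records; real incidents carry more fields.
sample = [
    {"id": 1, "severity": "high"},
    {"id": 2, "severity": "low", "acknowledged_by": "alice"},
]
print(incidents_to_csv(sample))
```

Collecting the column names from every record, rather than only the first, keeps optional fields such as acknowledged_by from being dropped.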

Escalation Rate

To calculate your escalation rate (the share of incidents that escalated past the first step, assuming current_escalation_step starts at 0):

1. Export the incident list via GET /api/reports
2. Filter for incidents where current_escalation_step > 0 at resolution time
3. Divide the number of escalated incidents by the total number of incidents

A high escalation rate (>20%) suggests your on-call technicians are not responding within your defined delay windows. Consider shortening delays, adding backup notification methods, or reviewing coverage.
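The calculation above can be sketched in a few lines of Python. This assumes the exported list has already been loaded as a list of dictionaries and that a missing current_escalation_step means the incident never escalated. The sample records are illustrative.

```python
def escalation_rate(incidents):
    """Percentage of incidents whose current_escalation_step is greater than 0."""
    if not incidents:
        return 0.0
    escalated = sum(
        1 for incident in incidents
        if incident.get("current_escalation_step", 0) > 0
    )
    return 100 * escalated / len(incidents)


# Illustrative export: two of five incidents escalated past the first step.
sample = [
    {"current_escalation_step": 0},
    {"current_escalation_step": 0},
    {"current_escalation_step": 1},
    {"current_escalation_step": 2},
    {"current_escalation_step": 0},
]

rate = escalation_rate(sample)
print(rate)         # 40.0
print(rate > 20.0)  # True: above the 20% threshold mentioned above
```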

On-Call Burden Analysis

To understand how on-call load is distributed across your team:

1. Export the incident list for a reporting period
2. Group by acknowledged_by to see incident counts per technician
3. Compare to on-call hours covered by each technician (from the Schedules API)

Uneven burden distribution (for example, one technician handling 70% of incidents) is a signal to rebalance the rotation or add team members.
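A sketch of the grouping and comparison steps, assuming incidents are loaded as dictionaries and that on-call hours per technician have already been summed from schedule data into a plain mapping. The technician names and hour totals are made up for illustration.

```python
from collections import Counter


def incident_counts(incidents):
    """Count incidents per technician, skipping unacknowledged incidents."""
    return Counter(
        incident["acknowledged_by"]
        for incident in incidents
        if incident.get("acknowledged_by")
    )


def burden_share(incidents):
    """Percentage of acknowledged incidents handled by each technician."""
    counts = incident_counts(incidents)
    total = sum(counts.values())
    return {tech: round(100 * n / total, 1) for tech, n in counts.most_common()}


def incidents_per_hour(incidents, on_call_hours):
    """Incidents per on-call hour; technicians with no recorded hours are skipped."""
    counts = incident_counts(incidents)
    return {
        tech: counts[tech] / hours
        for tech, hours in on_call_hours.items()
        if hours > 0
    }


# Illustrative data: alice acknowledged 7 of 10 incidents.
sample = (
    [{"acknowledged_by": "alice"}] * 7
    + [{"acknowledged_by": "bob"}] * 2
    + [{"acknowledged_by": "carol"}]
)
hours = {"alice": 70, "bob": 40, "carol": 40}  # hypothetical on-call hours

print(burden_share(sample))  # {'alice': 70.0, 'bob': 20.0, 'carol': 10.0}
print(incidents_per_hour(sample, hours))
```

Normalizing by on-call hours matters because a technician who covers more shifts will naturally acknowledge more incidents.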

Bridge API for External Reporting

For integration with external analytics tools (Power BI, Grafana, Tableau):

GET /api/bridge/incidents
X-Integration-Key: <key>
X-Tenant-Id: <tenant_id>

Returns incidents in a normalized format suitable for BI ingestion.
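As an illustration, the request above could be issued from Python's standard library roughly as follows. The host name is hypothetical, and the response is assumed to be JSON; only the path and the two header names come from this page.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical host; replace with your own On-Call URL.
BASE_URL = "https://oncall.example.com"


def bridge_request(integration_key, tenant_id):
    """Build the Bridge API request with the two documented headers."""
    return Request(
        f"{BASE_URL}/api/bridge/incidents",
        headers={
            "X-Integration-Key": integration_key,
            "X-Tenant-Id": tenant_id,
        },
        method="GET",
    )


def fetch_bridge_incidents(integration_key, tenant_id):
    """Send the request and decode the body, assuming a JSON response."""
    request = bridge_request(integration_key, tenant_id)
    with urlopen(request, timeout=30) as response:
        return json.load(response)
```

A scheduled job could call fetch_bridge_incidents and write the result to whatever store your BI tool reads from.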

Note: Advanced reporting dashboards (on-call burden per technician, escalation rate over time, alert source volume trends) are on the roadmap for a future release. The Bridge API provides raw data for building these views externally today.

Interpreting MTTA and MTTR

MTTA (Mean Time To Acknowledge)

MTTA measures how quickly your on-call team responds to pages. Industry benchmarks for MSPs:

MTTA	Assessment
< 5 minutes	Excellent
5–15 minutes	Good
15–30 minutes	Needs improvement
> 30 minutes	Critical gap

High MTTA often indicates that technicians are not receiving notifications (a notification configuration issue), that escalation delays are set too long, or that coverage windows are understaffed.

MTTR (Mean Time To Resolution)

MTTR measures how long it takes to fully resolve incidents after they are created. This includes both response time and fix time.

MTTR	Assessment
< 30 minutes	Excellent for most alert types
30–60 minutes	Good
1–4 hours	Acceptable for complex issues
> 4 hours	Review resolution workflows

Very low MTTR may indicate issues are being resolved without full investigation (premature resolution). Very high MTTR may indicate incidents are being acknowledged but not actively worked.
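The two assessment tables can be applied to reported values with a small helper, sketched below. The tables share boundary values (15 minutes appears in two MTTA rows, for instance), so this sketch assumes a boundary value falls in the better band; adjust if your team reads the bands differently.

```python
def assess_mtta(minutes):
    """Map an MTTA value in minutes to the assessment bands above."""
    if minutes < 5:
        return "Excellent"
    if minutes <= 15:  # assumption: boundary values fall in the better band
        return "Good"
    if minutes <= 30:
        return "Needs improvement"
    return "Critical gap"


def assess_mttr(minutes):
    """Map an MTTR value in minutes to the assessment bands above."""
    if minutes < 30:
        return "Excellent for most alert types"
    if minutes <= 60:
        return "Good"
    if minutes <= 240:  # 4 hours
        return "Acceptable for complex issues"
    return "Review resolution workflows"


# Values from the example summary response earlier on this page.
print(assess_mtta(8.3))   # Good
print(assess_mttr(34.7))  # Good
```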
