Backup Health Monitoring
The backup health dashboard gives you a real-time view of protection status across every device and every client. You can see at a glance which devices are healthy, which have recent failures, and which are overdue for a backup.
Health Dashboard
The main dashboard has two views:
Per-Device View — Every protected device with its current health status, last successful backup time, policy assigned, and storage used.
Per-Client View — Rolled-up health status per company: how many devices are healthy vs. warning vs. critical, and the overall compliance percentage for the client.
Navigate between views using the Devices and Clients tabs.
Health Score
Every device has a health score from 0–100:
| Score Range | Status | Meaning |
|---|---|---|
| 90–100 | Healthy | Last backup successful within the policy schedule; recent DR test passed |
| 70–89 | Warning | 1–2 recent job failures, or backup is slightly overdue |
| 40–69 | Critical | 3+ consecutive failures, backup is more than 48 hours overdue, or last DR test failed |
| 0 | Unknown | No backup jobs have run yet, or agent has not checked in |
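The bands in the table can be read as a simple lookup. The following is a minimal sketch, not the product's implementation; the function name `status_for_score` is hypothetical. The table does not say how scores from 1 to 39 are labeled, so the sketch flags that range rather than guessing.

```python
def status_for_score(score: int) -> str:
    """Map a 0-100 health score to the status label in the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "Healthy"
    if score >= 70:
        return "Warning"
    if score >= 40:
        return "Critical"
    if score == 0:
        return "Unknown"
    # Scores 1-39 are not covered by the table; this sketch flags them
    # instead of assuming which status the product assigns.
    return "Unspecified"
```

For example, `status_for_score(72)` falls in the 70–89 band and returns `"Warning"`.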
The health score is calculated from:
- Recency — How recently was the last successful backup relative to the policy schedule?
- Success rate — What percentage of jobs in the past 7 days succeeded?
- DR test result — Did the most recent restore test pass? (weighted heavily — a failed DR test drops the score significantly regardless of job success rate)
- Storage health — Is the device approaching storage quota limits?
Alert Types
The following alert types are generated automatically:
| Alert | Trigger | Default Severity |
|---|---|---|
| Backup job failed | A backup job ends with status failed | Warning (1st), Critical (3rd consecutive) |
| Backup missed schedule | No successful backup within 1.5× the policy interval | Warning |
| Backup overdue | No successful backup for 24+ hours (workstation) or 12+ hours (server) | Critical |
| Storage quota exceeded | Device backup storage > 95% of allocated quota | Warning |
| SaaS backup overdue | No successful SaaS job within 36 hours of scheduled time | Warning |
| SaaS consent expired | OAuth token refresh failed — connection needs re-authorization | Critical |
| DR test failed | Restore verification test returned a failure result | Critical |
Alerts appear in the Alerts tab of the Backups console. Each alert shows the device, alert type, trigger time, and recommended action.
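To make the two schedule-based triggers concrete, here is a rough sketch of how they could be evaluated for one device. The function name, its parameters, and the choice to report only the more severe alert when both conditions hold are assumptions for illustration; the console may raise both alerts.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Overdue limits from the alert table: 24+ hours for workstations, 12+ for servers.
OVERDUE_LIMITS = {
    "workstation": timedelta(hours=24),
    "server": timedelta(hours=12),
}

def schedule_alert(
    last_success: datetime,
    now: datetime,
    policy_interval: timedelta,
    device_type: str,
) -> Optional[Tuple[str, str]]:
    """Return (alert name, severity) for a device, or None if it is on schedule."""
    elapsed = now - last_success
    # Assumption: when both triggers apply, report only the Critical one.
    if elapsed >= OVERDUE_LIMITS[device_type]:
        return ("Backup overdue", "Critical")
    # Missed schedule: no successful backup within 1.5x the policy interval.
    if elapsed > 1.5 * policy_interval:
        return ("Backup missed schedule", "Warning")
    return None
```

With a 4-hour policy interval, a workstation whose last successful backup was 7 hours ago is past the 6-hour mark (1.5 × 4) and gets a Warning; a server at 13 hours gets a Critical.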
SLA-Based Compliance
Define backup SLAs at the tenant level or per client:
- Go to Settings → SLA Configuration
- Set the compliance rule:
  - "All workstations must have a successful backup within 24 hours"
  - "All servers must have a successful backup within 12 hours"
- The Compliance view shows which devices are meeting the SLA and which are in breach
The compliance report is available as a PDF or CSV export for customer QBRs and audit documentation.
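As an illustration of how the two example rules translate into a compliance percentage, consider the sketch below. The device record shape (`name`, `type`, `last_success`) and the function name are hypothetical, and treating a device with no successful backup as in breach is an assumption.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# SLA windows taken from the two example rules above.
SLA_WINDOWS = {
    "workstation": timedelta(hours=24),
    "server": timedelta(hours=12),
}

def compliance_report(devices: List[Dict], now: datetime) -> Tuple[float, List[str]]:
    """Return (compliance percentage, names of devices in breach)."""
    breaches = []
    for device in devices:
        last = device["last_success"]
        # Assumption: a device that has never backed up counts as a breach.
        if last is None or now - last > SLA_WINDOWS[device["type"]]:
            breaches.append(device["name"])
    total = len(devices)
    percentage = 100.0 if total == 0 else 100.0 * (total - len(breaches)) / total
    return percentage, breaches
```

A client with four devices, one of which is a server last backed up 13 hours ago, would show 75% compliance with that server listed as in breach.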
Backup Monitoring Reports
A weekly summary report is emailed automatically to the MSP admin. It includes:
- Total devices protected
- Devices healthy / warning / critical / unknown
- Job success rate for the week
- Any new critical alerts
- The five devices with the lowest health scores
- Storage usage trend
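Several of the items listed above are simple aggregates. The sketch below shows one way they could be derived from per-device data; the record fields (`name`, `status`, `score`), the function name, and the output keys are all hypothetical and do not describe the actual report format.

```python
from collections import Counter
from typing import Dict, List

def weekly_summary(devices: List[Dict], job_results: List[bool]) -> Dict:
    """Aggregate device statuses, job success rate, and the five lowest scores."""
    counts = Counter(device["status"] for device in devices)
    lowest = sorted(devices, key=lambda device: device["score"])[:5]
    # job_results holds True for each succeeded job and False for each failed one.
    success_rate = 100.0 * sum(job_results) / len(job_results) if job_results else None
    return {
        "total_devices": len(devices),
        "by_status": {
            status: counts.get(status, 0)
            for status in ("Healthy", "Warning", "Critical", "Unknown")
        },
        "job_success_rate": success_rate,
        "lowest_scores": [device["name"] for device in lowest],
    }
```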
To configure report recipients and delivery day:
- Go to Settings → Reports
- Enable Weekly Health Summary
- Add recipient email addresses
- Choose the delivery day (default: Monday morning)
PSA Integration
Backup failures automatically create tickets in The One PSA when the integration is active.
| Alert Type | PSA Ticket Behavior |
|---|---|
| Backup job failed (3rd consecutive) | Creates a Critical ticket — type: Backup Failure |
| Backup overdue (>24h) | Creates a High ticket — type: Backup Overdue |
| SaaS consent expired | Creates a Critical ticket — type: SaaS Connection Error |
| DR test failed | Creates a High ticket — type: DR Test Failure |
Tickets include the device name, client, last successful backup time, and a link back to the backup job log in the Backups console.
To configure PSA integration thresholds:
- Go to Settings → Integrations → PSA
- Enable Auto-create tickets on failure
- Set the failure threshold (default: 3 consecutive failures before ticket creation)
- Map backup alert types to PSA ticket types and priorities
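The default mapping and failure threshold described in this section can be pictured as a lookup table plus one guard. This is only a sketch of the documented behavior; the names `TICKET_RULES` and `ticket_for_alert` are hypothetical, and the real mapping is whatever you configure under Settings → Integrations → PSA.

```python
from typing import Dict, Optional

# Alert type -> (ticket priority, ticket type), copied from the table above.
TICKET_RULES = {
    "Backup job failed": ("Critical", "Backup Failure"),
    "Backup overdue": ("High", "Backup Overdue"),
    "SaaS consent expired": ("Critical", "SaaS Connection Error"),
    "DR test failed": ("High", "DR Test Failure"),
}

def ticket_for_alert(
    alert_type: str,
    consecutive_failures: int = 0,
    failure_threshold: int = 3,
) -> Optional[Dict[str, str]]:
    """Return the ticket to create for an alert, or None if no ticket applies."""
    rule = TICKET_RULES.get(alert_type)
    if rule is None:
        return None
    # Job failures open a ticket only once the consecutive-failure threshold is met.
    if alert_type == "Backup job failed" and consecutive_failures < failure_threshold:
        return None
    priority, ticket_type = rule
    return {"priority": priority, "type": ticket_type}
```

With the default threshold of 3, a second consecutive job failure produces no ticket, while the third produces a Critical ticket of type Backup Failure.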
The PSA integration communicates over the X-Integration-Key service-to-service channel. The integration key must be set in both the Backups and PSA environments. Contact your platform admin if tickets are not being created.
On-Call Escalation
Critical backup SLA breaches can escalate to On-Call:
- Go to Settings → Integrations → On-Call
- Enable SLA breach escalation
- Define the threshold: "If a server backup is overdue by more than 6 hours outside business hours, page on-call"
- Set the escalation policy (maps to an On-Call schedule)
On-Call escalation is separate from PSA ticket creation — both can be active simultaneously.
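The example threshold in this section ("overdue by more than 6 hours outside business hours") could be evaluated along the following lines. This is a sketch under stated assumptions: business hours are taken to be Monday to Friday, 09:00 to 17:00 local time, which the source does not define, and the function name is hypothetical.

```python
from datetime import datetime, time, timedelta

def should_page_on_call(
    overdue_by: timedelta,
    now: datetime,
    business_start: time = time(9, 0),   # assumed start of business hours
    business_end: time = time(17, 0),    # assumed end of business hours
    threshold: timedelta = timedelta(hours=6),
) -> bool:
    """Page on-call if a server backup is overdue past the threshold outside business hours."""
    # Assumption: weekends count as outside business hours.
    is_weekday = now.weekday() < 5
    in_business_hours = is_weekday and business_start <= now.time() < business_end
    return overdue_by > threshold and not in_business_hours
```

Under these assumptions, a server that is 7 hours overdue at 22:00 on a Wednesday pages on-call, while the same breach at 10:00 that day does not.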
Related Pages
- DR Testing — Automated restore verification
- Integrations — PSA, RMM, Defend, and CMDB integration details
- Troubleshooting — Resolving backup failures and health issues