
Backup Health Monitoring

The backup health dashboard gives you a real-time view of protection status across every device and every client. You can see at a glance which devices are healthy, which have recent failures, and which are overdue for a backup.

Health Dashboard

The main dashboard has two views:

Per-Device View — Every protected device with its current health status, last successful backup time, policy assigned, and storage used.

Per-Client View — Rolled-up health status per company: how many devices are healthy vs. warning vs. critical, and the overall compliance percentage for the client.

Navigate between views using the Devices and Clients tabs.

Health Score

Every device has a health score from 0 to 100:

| Score Range | Status | Meaning |
|---|---|---|
| 90–100 | Healthy | Last backup successful within the policy schedule; recent DR test passed |
| 70–89 | Warning | 1–2 recent job failures, or backup is slightly overdue |
| 40–69 | Critical | 3+ consecutive failures, backup is more than 48 hours overdue, or last DR test failed |
| 0 | Unknown | No backup jobs have run yet, or agent has not checked in |
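
The bands in the table above can be expressed as a small lookup. This is a minimal sketch, not the product's implementation: the function name `health_status` is hypothetical, and because the table does not document scores from 1 to 39, the sketch assumes they fall into Critical.

```python
def health_status(score: int) -> str:
    """Map a 0-100 health score to the status bands in the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score == 0:
        return "Unknown"   # no jobs have run, or the agent has not checked in
    if score >= 90:
        return "Healthy"
    if score >= 70:
        return "Warning"
    # The table lists 40-69 as Critical and leaves 1-39 undocumented;
    # treating 1-39 as Critical is an assumption of this sketch.
    return "Critical"


print(health_status(93))  # Healthy
print(health_status(70))  # Warning
print(health_status(0))   # Unknown
```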

The health score is calculated from:

  • Recency — How recently was the last successful backup relative to the policy schedule?
  • Success rate — What percentage of jobs in the past 7 days succeeded?
  • DR test result — Did the most recent restore test pass? (weighted heavily — a failed DR test drops the score significantly regardless of job success rate)
  • Storage health — Is the device approaching storage quota limits?
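
To make the interaction of these four factors concrete, here is an illustrative sketch. The actual formula is not documented here, so every weight, decay rule, and name below (`DeviceSignals`, `health_score`, the 40/40/20 split, the cap of 69 for a failed DR test) is an assumption chosen only to match the described behavior: a failed DR test pulls the score into the Critical band regardless of job success rate.

```python
from dataclasses import dataclass

# Hypothetical weights; the real weighting is not published in this document.
RECENCY_WEIGHT = 40
SUCCESS_WEIGHT = 40
STORAGE_WEIGHT = 20
DR_FAILURE_CAP = 69  # assumed: a failed DR test caps the score at the top of Critical


@dataclass
class DeviceSignals:
    hours_since_last_success: float
    policy_interval_hours: float
    jobs_succeeded_7d: int
    jobs_total_7d: int
    dr_test_passed: bool
    storage_used_fraction: float  # 0.0-1.0 of the allocated quota


def health_score(s: DeviceSignals) -> int:
    if s.jobs_total_7d == 0:
        return 0  # no jobs in the window: reported as Unknown

    # Recency: full credit inside the policy interval, then a linear decay
    # that reaches zero at 3x the interval (assumed decay shape).
    lateness = s.hours_since_last_success / s.policy_interval_hours
    recency = 1.0 if lateness <= 1 else max(0.0, 1 - (lateness - 1) / 2)

    # Success rate over the past 7 days.
    success = s.jobs_succeeded_7d / s.jobs_total_7d

    # Storage: full credit up to 80% of quota, decaying to zero at 100% (assumed).
    used = s.storage_used_fraction
    storage = 1.0 if used <= 0.8 else max(0.0, (1 - used) / 0.2)

    score = round(RECENCY_WEIGHT * recency + SUCCESS_WEIGHT * success + STORAGE_WEIGHT * storage)

    # The DR test result dominates the other factors.
    if not s.dr_test_passed:
        score = min(score, DR_FAILURE_CAP)
    return score
```

With these assumed weights, a device with perfect recency, job history, and storage headroom scores 100 when its DR test passed and 69 when it failed.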

Alert Types

The following alert types are generated automatically:

| Alert | Trigger | Default Severity |
|---|---|---|
| Backup job failed | A backup job ends with status failed | Warning (1st), Critical (3rd consecutive) |
| Backup missed schedule | No successful backup within 1.5× the policy interval | Warning |
| Backup overdue | No successful backup for 24+ hours (workstation) or 12+ hours (server) | Critical |
| Storage quota exceeded | Device backup storage > 95% of allocated quota | Warning |
| SaaS backup overdue | No successful SaaS job within 36 hours of scheduled time | Warning |
| SaaS consent expired | OAuth token refresh failed; connection needs re-authorization | Critical |
| DR test failed | Restore verification test returned a failure result | Critical |
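
The device-level triggers in the table can be sketched as a single evaluation function. This is a hedged illustration: `DeviceState` and `evaluate_alerts` are hypothetical names, the SaaS and DR test alerts are left out for brevity, and two behaviors are assumptions (a second consecutive failure stays at Warning, and a Backup overdue alert replaces Backup missed schedule rather than firing alongside it).

```python
from dataclasses import dataclass
from typing import List, Tuple

# Overdue thresholds from the table, in hours.
OVERDUE_HOURS = {"workstation": 24, "server": 12}


@dataclass
class DeviceState:
    kind: str                        # "workstation" or "server"
    hours_since_last_success: float
    policy_interval_hours: float
    consecutive_failures: int
    storage_used_fraction: float     # 0.0-1.0 of the allocated quota


def evaluate_alerts(d: DeviceState) -> List[Tuple[str, str]]:
    """Return (alert, severity) pairs for the device-level triggers."""
    alerts = []

    # Warning on the 1st failure, Critical from the 3rd consecutive failure.
    if d.consecutive_failures >= 3:
        alerts.append(("Backup job failed", "Critical"))
    elif d.consecutive_failures >= 1:
        alerts.append(("Backup job failed", "Warning"))

    # Overdue is checked first; missed schedule only applies below that limit (assumed).
    if d.hours_since_last_success >= OVERDUE_HOURS[d.kind]:
        alerts.append(("Backup overdue", "Critical"))
    elif d.hours_since_last_success > 1.5 * d.policy_interval_hours:
        alerts.append(("Backup missed schedule", "Warning"))

    if d.storage_used_fraction > 0.95:
        alerts.append(("Storage quota exceeded", "Warning"))

    return alerts


server = DeviceState("server", 13, 4, 3, 0.96)
print(evaluate_alerts(server))
# [('Backup job failed', 'Critical'), ('Backup overdue', 'Critical'), ('Storage quota exceeded', 'Warning')]
```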

Alerts appear in the Alerts tab of the Backups console. Each alert shows the device, alert type, trigger time, and recommended action.

SLA-Based Compliance

Define backup SLAs at the tenant level or per client:

  1. Go to Settings > SLA Configuration
  2. Set the compliance rule:
    • "All workstations must have a successful backup within 24 hours"
    • "All servers must have a successful backup within 12 hours"
  3. The Compliance view shows which devices are meeting the SLA and which are in breach
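
As a rough illustration of how the two example rules translate into a breach check and a compliance percentage, consider the sketch below. The names (`Device`, `in_breach`, `compliance_percentage`) are hypothetical, and two choices are assumptions: a device with no successful backup counts as in breach, and an empty device list reports 100%.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# SLA windows taken from the two example rules.
SLA_WINDOW = {
    "workstation": timedelta(hours=24),
    "server": timedelta(hours=12),
}


@dataclass
class Device:
    name: str
    kind: str                          # "workstation" or "server"
    last_success: Optional[datetime]   # None if no backup has ever succeeded


def in_breach(device: Device, now: datetime) -> bool:
    if device.last_success is None:
        return True  # assumed: never backed up counts as a breach
    return now - device.last_success > SLA_WINDOW[device.kind]


def compliance_percentage(devices: List[Device], now: datetime) -> float:
    if not devices:
        return 100.0  # assumed: nothing to protect means nothing in breach
    compliant = sum(1 for d in devices if not in_breach(d, now))
    return round(100 * compliant / len(devices), 1)


now = datetime(2024, 1, 2, 12, 0)
fleet = [
    Device("ws-01", "workstation", datetime(2024, 1, 1, 13, 0)),  # 23 h ago: compliant
    Device("srv-01", "server", datetime(2024, 1, 1, 23, 0)),      # 13 h ago: breach
]
print(compliance_percentage(fleet, now))  # 50.0
```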

The compliance report is available as a PDF or CSV export for customer QBRs and audit documentation.

Backup Monitoring Reports

A weekly summary report is automatically emailed to the MSP admin. It includes:

  • Total devices protected
  • Devices healthy / warning / critical / unknown
  • Job success rate for the week
  • Any new critical alerts
  • Devices with the lowest health scores (top 5)
  • Storage usage trend
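
Most of these figures are simple aggregates over per-device data. The sketch below shows one way to derive them. The function name, the tuple layout `(name, score, status)`, and the output keys are all hypothetical, and the new critical alerts and storage usage trend items are omitted because they need historical data that this example does not model.

```python
import heapq
from collections import Counter
from typing import Dict, List, Tuple


def weekly_summary(
    devices: List[Tuple[str, int, str]],  # (name, health score, status)
    jobs_succeeded: int,
    jobs_total: int,
) -> Dict[str, object]:
    status_counts = Counter(status for _, _, status in devices)
    # The five devices with the lowest health scores, lowest first.
    lowest = heapq.nsmallest(5, devices, key=lambda d: d[1])
    success_rate = round(100 * jobs_succeeded / jobs_total, 1) if jobs_total else None
    return {
        "total_devices": len(devices),
        "status_counts": dict(status_counts),
        "job_success_rate_pct": success_rate,
        "lowest_scores": [(name, score) for name, score, _ in lowest],
    }
```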

To configure report recipients and delivery day:

  1. Go to Settings > Reports
  2. Enable Weekly Health Summary
  3. Add recipient email addresses
  4. Choose the delivery day (default: Monday morning)

PSA Integration

Backup failures automatically create tickets in The One PSA when the integration is active.

| Alert Type | PSA Ticket Behavior |
|---|---|
| Backup job failed (3rd consecutive) | Creates a Critical ticket (type: Backup Failure) |
| Backup overdue (>24h) | Creates a High ticket (type: Backup Overdue) |
| SaaS consent expired | Creates a Critical ticket (type: SaaS Connection Error) |
| DR test failed | Creates a High ticket (type: DR Test Failure) |

Tickets include the device name, client, last successful backup time, and a link back to the backup job log in the Backups console.
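
The mapping from alert to ticket can be pictured as a lookup plus a failure-count gate. The sketch below only assembles a ticket as a plain dictionary and does not call any PSA API. The function name `build_ticket` and all field names are hypothetical. The priorities and ticket types come from the table, and the threshold of 3 mirrors its "3rd consecutive" rule.

```python
from typing import Dict, Optional

# Alert type -> (ticket priority, ticket type), as listed in the table.
TICKET_RULES = {
    "Backup job failed": ("Critical", "Backup Failure"),
    "Backup overdue": ("High", "Backup Overdue"),
    "SaaS consent expired": ("Critical", "SaaS Connection Error"),
    "DR test failed": ("High", "DR Test Failure"),
}


def build_ticket(
    alert_type: str,
    device: str,
    client: str,
    last_success: str,
    job_log_url: str,
    consecutive_failures: int = 0,
    failure_threshold: int = 3,
) -> Optional[Dict[str, str]]:
    """Return a ticket dict, or None if this alert should not open a ticket."""
    rule = TICKET_RULES.get(alert_type)
    if rule is None:
        return None
    # Job failures only open a ticket once the consecutive-failure threshold is reached.
    if alert_type == "Backup job failed" and consecutive_failures < failure_threshold:
        return None
    priority, ticket_type = rule
    return {
        "priority": priority,
        "type": ticket_type,
        "device": device,
        "client": client,
        "last_successful_backup": last_success,
        "job_log_url": job_log_url,  # link back to the job log in the Backups console
    }
```

In this sketch, a first or second job failure returns `None`, while the third consecutive failure yields a Critical ticket of type Backup Failure.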

To configure PSA integration thresholds:

  1. Go to Settings > Integrations > PSA
  2. Enable Auto-create tickets on failure
  3. Set the failure threshold (default: 3 consecutive failures before ticket creation)
  4. Map backup alert types to PSA ticket types and priorities
Note: The PSA integration uses the X-Integration-Key service-to-service channel. The same integration key must be set in both the Backups and PSA environments. If tickets are not being created, contact your platform admin.

On-Call Escalation

Critical backup SLA breaches can escalate to On-Call:

  1. Go to Settings > Integrations > On-Call
  2. Enable SLA breach escalation
  3. Define the threshold: "If a server backup is overdue by more than 6 hours outside business hours, page on-call"
  4. Set the escalation policy (maps to an On-Call schedule)

On-Call escalation is separate from PSA ticket creation — both can be active simultaneously.
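
The example threshold in step 3 can be read as a predicate over the device type, the time overdue, and the current time. The sketch below is an interpretation, not the product's logic. It assumes "overdue by more than 6 hours" is measured from the end of the 12-hour server window, and that business hours are 09:00 to 17:00, Monday to Friday. All names are hypothetical.

```python
from datetime import datetime, time, timedelta

# Assumed business hours; the real values would come from the On-Call schedule.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(17, 0)

SERVER_WINDOW = timedelta(hours=12)     # server backup window from the alert and SLA rules
ESCALATION_MARGIN = timedelta(hours=6)  # "overdue by more than 6 hours"


def is_business_hours(now: datetime) -> bool:
    # weekday() is 0 for Monday through 6 for Sunday.
    return now.weekday() < 5 and BUSINESS_START <= now.time() < BUSINESS_END


def should_page(kind: str, last_success: datetime, now: datetime) -> bool:
    """True if a server backup is more than 6 hours overdue outside business hours."""
    if kind != "server":
        return False
    overdue_by = (now - last_success) - SERVER_WINDOW
    return overdue_by > ESCALATION_MARGIN and not is_business_hours(now)


saturday_2am = datetime(2024, 1, 6, 2, 0)
print(should_page("server", saturday_2am - timedelta(hours=19), saturday_2am))  # True
```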