
Alerting & notifications

Hora’s alerting philosophy is that flapping never wakes you up: probes are retried before anything is recorded, alerts fire only after N consecutive failures, a cascade folds into one notification, and recoveries of alerts that were never sent stay silent too.

Channels are named, so you can have several of the same type and route each monitor to specific ones. Ten backends are built in:

| type | Notes |
|---|---|
| telegram | token + chat_id |
| discord | webhook_url |
| slack | webhook_url |
| matrix | homeserver + access token + room_id |
| ntfy | topic url, optional token for private servers |
| gotify | server url + application token |
| pushover | application token + user key |
| email | SMTP: host, port (587 STARTTLS default, implicit_tls for 465), from, to |
| freemobile | Free Mobile SMS: user + pass |
| webhook | POSTs { event, monitor, message?, days_left? } as JSON to url |
[[channels]]
name = "ops-telegram"
type = "telegram"
token = "${HORA_TELEGRAM_TOKEN}"
chat_id = "123456"
[[channels]]
name = "alerts-discord"
type = "discord"
webhook_url = "${DISCORD_WEBHOOK}"

An empty secret (an unset ${VAR}) simply disables that channel. Delivery retries transient failures, and down alerts include a snippet of the failing response body.
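On the receiving end of a webhook channel, the payload fields listed in the table can be handled roughly as follows. This is a minimal sketch: `describe_alert` is a hypothetical helper, and since the possible values of `event` are not enumerated here, the code treats it as an opaque string.

```python
import json


def describe_alert(body: bytes) -> str:
    """Turn a webhook body into a one-line summary.

    Only `event` and `monitor` are treated as required; `message` and
    `days_left` are optional, matching the `?` markers in the payload shape.
    """
    payload = json.loads(body)
    line = f"[{payload['event']}] {payload['monitor']}"

    message = payload.get("message")
    if message:
        line += f": {message}"

    days_left = payload.get("days_left")
    if days_left is not None:
        line += f" ({days_left} days left)"

    return line
```

Using `.get()` for the optional fields keeps the receiver from failing when they are absent.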

Routing: a monitor (or a peer) selects channels with notify = ["ops-telegram"]; without it, every configured channel is used.
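The routing rule can be pictured as a small selection function. This is an illustrative sketch, not Hora's code: `resolve_channels` is a hypothetical name, and the handling of unknown channel names and of an empty list are assumptions.

```python
def resolve_channels(notify, configured):
    """Pick the channel names an alert is delivered to.

    notify     -- the monitor's notify list, or None when the key is absent
    configured -- names of all enabled channels, in config order
    """
    if notify is None:
        # No notify key: every configured channel is used.
        return list(configured)

    # Assumption: names matching no configured channel are dropped here;
    # the real tool may reject them when loading the config instead.
    # Assumption: an explicit empty list selects no channels.
    wanted = set(notify)
    return [name for name in configured if name in wanted]
```

For example, `resolve_channels(["ops-telegram"], ["ops-telegram", "alerts-discord"])` selects only the Telegram channel, while passing `None` selects both.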

Test the chain before you need it:

hora test-alert # a labelled test down + recovered through every channel
hora test-alert website # ... through exactly the channels routed for "website"

Any channel that fails logs a warning saying why (“chat not found”, HTTP 403, …).

[alerts]
fail_threshold = 3 # consecutive failures before a monitor is alerted down
alert_on_degraded = true # optional: also alert on degraded (same threshold)

A single failure shows the monitor as degraded on the page; only fail_threshold consecutive failures confirm down and fire the alert. Degraded alerts (up, but slower than the monitor’s degraded_over_ms) are opt-in and use the same anti-flap threshold.
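The threshold behaviour amounts to a small per-monitor state machine. The sketch below is one way to model it; `FlapGuard` and the returned event strings are hypothetical names, and degraded-by-latency alerts are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class FlapGuard:
    fail_threshold: int = 3
    failures: int = 0      # current run of consecutive failures
    alerted: bool = False  # whether a down alert was actually sent

    @property
    def status(self) -> str:
        """Status as shown on the page."""
        if self.failures == 0:
            return "up"
        return "down" if self.failures >= self.fail_threshold else "degraded"

    def observe(self, ok: bool):
        """Feed one check result; return "down", "recovered", or None."""
        if ok:
            was_alerted = self.alerted
            self.failures = 0
            self.alerted = False
            # A recovery is announced only if the down alert was sent.
            return "recovered" if was_alerted else None

        self.failures += 1
        if self.failures >= self.fail_threshold and not self.alerted:
            self.alerted = True
            return "down"
        return None
```

Two failures followed by a success produce no notification at all, which is the anti-flap property described above.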

When a database takes ten services down with it, you get one notification - the root cause, with its blast radius - not eleven. Dependent monitors (via depends_on) confirmed down within the grouping window fold into their upstream’s alert, and their recoveries stay silent too. A monitor that flaps entirely inside the window sends nothing at all.

[alerts]
group_window_secs = 30 # 0 restores one-alert-per-monitor
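To make the folding concrete, here is a rough model of the grouping. Several assumptions are baked in and are not confirmed by the text: each monitor has at most one upstream, the window is measured between a dependent's confirmation time and its upstream's, and `group_alerts` is a hypothetical name.

```python
def group_alerts(confirmed_down, depends_on, window_secs=30):
    """Fold dependents into their upstream's alert.

    confirmed_down -- dict: monitor -> time (seconds) it was confirmed down
    depends_on     -- dict: monitor -> its upstream monitor
    Returns a dict: alerting monitor -> sorted list of folded dependents.
    """

    def root_of(monitor):
        seen = {monitor}  # guards against dependency cycles
        current = monitor
        while window_secs > 0:  # 0 means one alert per monitor
            upstream = depends_on.get(current)
            if upstream is None or upstream in seen:
                break
            if upstream not in confirmed_down:
                break
            gap = abs(confirmed_down[upstream] - confirmed_down[current])
            if gap > window_secs:
                break
            seen.add(upstream)
            current = upstream
        return current

    groups = {}
    for monitor in confirmed_down:
        root = root_of(monitor)
        groups.setdefault(root, [])
        if root != monitor:
            groups[root].append(monitor)
    return {root: sorted(deps) for root, deps in groups.items()}
```

Each key in the result corresponds to one notification, and its list is the blast radius reported with it.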

Scheduled maintenance windows mute alerts for the affected monitors; checks keep being recorded and the card shows a “maintenance” badge:

[[maintenance]]
title = "DB upgrade"
start = "2026-06-08T00:00:00Z" # RFC 3339
end = "2026-06-08T02:00:00Z"
monitors = ["database"] # empty = all monitors

Silences are the scriptable counterpart of a maintenance window, made for “mute while deploying”:

hora silence api,web 10m "deploying" # CLI, straight into the database
hora silence list
hora silence clear

or from CI over HTTP:

curl -fsS -X POST -H "Authorization: Bearer $HORA_TOKEN" \
"https://status.example.com/api/silence?monitors=api,web&duration=10m&reason=deploy"

Durations look like 10m, 90s, 1h30m (max 7 days - anything longer belongs in a visible maintenance window). Checks keep recording; only alert transitions are muted, picked up on the next tick. The HTTP endpoint strictly requires server.auth_token; unknown monitor ids are rejected so a typo’d hook fails loudly instead of silencing nothing. Expired silences are swept automatically.
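The duration syntax is simple enough to validate in a deploy script before calling the CLI or the endpoint. The parser below is a sketch that accepts only the units shown in the examples (`h`, `m`, `s`); whether Hora accepts other units is not stated here, and `parse_duration` is a hypothetical helper.

```python
import re

MAX_SILENCE_SECS = 7 * 24 * 3600  # the 7-day cap
_UNIT_SECS = {"h": 3600, "m": 60, "s": 1}
_PART = re.compile(r"(\d+)([hms])")


def parse_duration(text: str) -> int:
    """Parse strings like '10m', '90s' or '1h30m' into seconds."""
    pos = 0
    total = 0
    for match in _PART.finditer(text):
        if match.start() != pos:
            # Something unparseable sits between two parts.
            raise ValueError(f"invalid duration: {text!r}")
        total += int(match.group(1)) * _UNIT_SECS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    if total > MAX_SILENCE_SECS:
        raise ValueError("silences are capped at 7 days")
    return total
```

Rejecting malformed or over-long input with an exception mirrors the fail-loudly behaviour of the endpoint.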

The digest is the one notification that never signals a problem: a recap of the last seven days, sent on a cron schedule through your channels:

[digest]
schedule = "0 8 * * 1" # five-field cron, UTC (default: Monday 08:00)
notify = ["ops-telegram"] # optional; default: every configured channel

Sample digest:

99.97% overall, 2 incidents
- API: 99.99%, 1 incident, budget 41m of 43m left (30d)
- Web: 100.00%

One line per monitor: uptime, incidents in the window, and the error budget left when an SLO is configured. The last-sent timestamp persists in the database, so a restart neither double-sends nor forgets - and a send missed while the daemon was down catches up once. Preview the exact text anytime with hora digest (a dry run; it notifies no one).
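The "budget 41m of 43m left (30d)" figure is consistent with a 99.9% SLO over 30 days, although that SLO value is an inference from the numbers and not stated in the sample. The arithmetic, with hypothetical function names:

```python
def error_budget_minutes(slo_percent: float, window_days: int) -> float:
    """Total downtime allowed by an SLO over the window, in minutes."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_left_minutes(slo_percent: float, window_days: int,
                        downtime_minutes: float) -> float:
    """Allowed downtime not yet spent (negative once the SLO is blown)."""
    return error_budget_minutes(slo_percent, window_days) - downtime_minutes
```

With a 99.9% target, 30 days contain 43,200 minutes, so the budget is about 43.2 minutes; roughly two minutes of downtime leaves about 41.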

Announcements are banners pinned on the status page (and the per-group pages), independent of any monitor: the mini-Statuspage half of self-hosted monitoring. There are two ways to pin one:

Ad hoc, from the CLI or a remote API call - made for “during the incident”:

hora announce "Fibre incident" "ETA 6pm" --severity warning --until 4h
hora announce list
hora announce clear
curl -X POST -H "Authorization: Bearer $TOK" \
"https://status.example.com/api/announce?title=Fibre+incident&severity=warning&until=4h"
curl -X DELETE -H "Authorization: Bearer $TOK" "https://status.example.com/api/announce"

--until (a duration like 4h, or 18:00 UTC) auto-expires the banner, so the classic stale “incident ongoing” banner three days later cannot happen by default. The API requires server.auth_token and the banner shows immediately (the summary cache is busted on write).

Declared in the config - for planned, longer-lived notices, or a GitOps workflow where announcements go through git:

[[incidents]]
title = "Investigating elevated latency"
body = "We are looking into reports of slow responses."
severity = "warning" # info | warning | critical | resolved
at = "2026-06-07T12:00:00Z"

For https:// monitors, a warning is sent alerts.cert_expiry_days days before the certificate expires (default 14), through the same channels and routing.
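The expiry check boils down to comparing the days remaining with the configured threshold. The sketch below assumes whole days are counted and that the warning applies at or below the threshold; how often Hora repeats the warning is not covered here, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def cert_days_left(not_after: datetime, now: datetime) -> int:
    """Whole days until the certificate's notAfter (negative once expired)."""
    return (not_after - now).days


def should_warn(not_after: datetime, now: datetime,
                cert_expiry_days: int = 14) -> bool:
    """True once the certificate is within the warning threshold."""
    return cert_days_left(not_after, now) <= cert_expiry_days


# Example: a certificate expiring on 20 June, checked at midday on 7 June.
expiry = datetime(2026, 6, 20, tzinfo=timezone.utc)
checked = datetime(2026, 6, 7, 12, tzinfo=timezone.utc)
print(cert_days_left(expiry, checked), should_warn(expiry, checked))
```

Both datetimes should be timezone-aware (or both naive) so the subtraction is well defined.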