System Health
Live self-check of services, DB, keys, cert, disk, memory, integrity chain. Repair actions audited.
Checks
Each row is ok / warn / fail. Warn and fail rows show a Repair button (when a remediation exists) and an Open → link to the admin page that owns the category.
Categories
- service: systemd is-active. Units that do not exist on this host are hidden, not failed.
- db: simple SELECT 1; also flags when the row-hash chain of audit_events is broken.
- resource: disk %, memory %. Warns at 85 %, fails at 95 %.
- cert: portal TLS cert presence + expiry.
- key: master key + row-HMAC key file perms (0400 mandatory).
- integrity: last db-integrity-scan job result and age.
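As a rough illustration of the resource and key categories above, here is a minimal Python sketch. It is not the portal's actual code. The function names are hypothetical, and two behaviours are assumptions: that the 85 % and 95 % thresholds are inclusive, and that a key file with any mode other than 0400 is reported as fail rather than warn.

```python
import os
import stat

WARN_PCT = 85.0  # from the text: warns at 85 %
FAIL_PCT = 95.0  # from the text: fails at 95 %


def classify_percent(used_pct: float) -> str:
    """Map a disk or memory usage percentage to ok / warn / fail.

    Assumption: thresholds are inclusive (85.0 already warns).
    """
    if used_pct >= FAIL_PCT:
        return "fail"
    if used_pct >= WARN_PCT:
        return "warn"
    return "ok"


def key_perms_status(path: str) -> str:
    """Return "ok" only when the file's permission bits are exactly 0400.

    Assumption: any other mode is a hard fail, since 0400 is mandatory.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return "ok" if mode == 0o400 else "fail"
```

A key file left at 0600 after a manual edit would show up as fail under this sketch until its mode is set back to 0400.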
Repair actions
One-click repairs include restarting a stopped service, running a one-off integrity rescan, running a retention pass, attempting rndc reload, and certbot renew. Each is audit-logged with action + outcome + detail.
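The audit record for a repair might look something like the sketch below. Only the three fields action, outcome and detail come from the text. The RepairAudit class, the run_repair helper, the timestamp field and the "success" / "failure" outcome values are hypothetical and shown for illustration only.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass
class RepairAudit:
    action: str   # repair identifier, e.g. "cert:renew"
    outcome: str  # assumed values: "success" or "failure"
    detail: str   # human-readable result or error text
    at: str       # UTC timestamp (assumed field, not named in the text)


def run_repair(action: str, repair: Callable[[], str]) -> RepairAudit:
    """Run a repair callable and always return an audit record for it."""
    try:
        detail = repair()
        outcome = "success"
    except Exception as exc:  # a failed repair is still audited
        detail = f"{type(exc).__name__}: {exc}"
        outcome = "failure"
    now = datetime.now(timezone.utc).isoformat()
    return RepairAudit(action, outcome, detail, now)


if __name__ == "__main__":
    record = run_repair("cert:renew", lambda: "renewal attempted")
    print(json.dumps(asdict(record)))
```

Wrapping the repair in a single helper means a failed attempt is recorded the same way as a successful one, which matches the "action + outcome + detail" shape described above.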
Gotchas
- Destructive repairs (disabling a service) require two-person approval.
- If cert:renew errors with "no Let's Encrypt certs registered", this install is using self-signed / Cloudflare origin certs, so the repair is a no-op.
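The two-person rule for destructive repairs could be checked along these lines. This is a sketch under one stated assumption: "two-person" is read here as the requester plus at least one different approver. The function name and arguments are hypothetical, and the real approval flow may differ.

```python
from typing import Iterable


def has_two_person_approval(requested_by: str, approved_by: Iterable[str]) -> bool:
    """True when at least one approver other than the requester has signed off.

    Assumption: the requester counts as the first person, so a
    self-approval alone is never sufficient.
    """
    others = {name for name in approved_by if name != requested_by}
    return len(others) >= 1
```

Under this reading, an admin who requests that a service be disabled cannot also be the only approver.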