you@laptop ~ $ git log --oneline tellr

Changelog

Releases of github.com/tellr/tellr. The diagrams below are wireframes, not screenshots. We ship them in the repo as docs/adr/*.md so anyone can see why we made a change, not just what changed.

42 releases · 3 breaking since 0.1 · 214 contributors · apache-2.0
a3f21c

0.9.2 patch Alert composer retries, quieter nights

released 2 days ago · 14 commits · +312 −104 LoC · by @amr, @mo, 3 others

Most of this release is about what happens when things almost go wrong, rather than when they do. The composer now retries on LLM 429s with jitter, and quiet hours learned about holidays.

  • add Composer retries on LLM 429 with exponential backoff (max 3 tries, jittered). If all fail, fall back to plain alert. #482
  • add silence.holidays: ["DE", "EG"] reads ical feeds and silences severity=info on those days. #461
  • fix Slack code blocks larger than 3 KB were truncated. Now split across messages with a continuation marker. #479
  • fix Postgres replica-lag check crashed when pg_stat_wal_receiver returned NULL. The NULL is now handled. #477
  • perf SQLite WAL checkpoints now run every 5m instead of every 30s. ~40% less disk churn on small hosts.
  • chg Default LLM model bumped to gpt-4o-mini. Same quality on our eval set, ~70% cheaper.
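
For the Slack fix (#479), here is a minimal sketch of how an oversized code block could be split with a continuation marker. It is not the code in the repo: splitCodeBlock, toMessages, and the marker text are hypothetical names, and the 3 KB limit is simply the figure from the note above.

```go
package main

import (
	"fmt"
	"strings"
)

// continuationMarker is a hypothetical marker; the text tellr actually uses may differ.
const continuationMarker = "(continued)"

// splitCodeBlock splits body into chunks of at most limit bytes, breaking on
// line boundaries where possible. A single line longer than limit is cut at a
// byte offset, which may split a multi-byte character.
func splitCodeBlock(body string, limit int) []string {
	if limit <= 0 || len(body) <= limit {
		return []string{body}
	}
	var chunks []string
	var cur strings.Builder
	flush := func() {
		if cur.Len() > 0 {
			chunks = append(chunks, cur.String())
			cur.Reset()
		}
	}
	for _, line := range strings.SplitAfter(body, "\n") {
		for len(line) > limit {
			flush()
			chunks = append(chunks, line[:limit])
			line = line[limit:]
		}
		if cur.Len()+len(line) > limit {
			flush()
		}
		cur.WriteString(line)
	}
	flush()
	return chunks
}

// toMessages prefixes every chunk after the first with the continuation marker.
func toMessages(chunks []string) []string {
	msgs := make([]string, len(chunks))
	for i, c := range chunks {
		if i > 0 {
			c = continuationMarker + "\n" + c
		}
		msgs[i] = c
	}
	return msgs
}

func main() {
	body := strings.Repeat("line of log output\n", 400)
	msgs := toMessages(splitCodeBlock(body, 3*1024))
	fmt.Println(len(msgs), "messages")
}
```

Splitting on line boundaries keeps each message readable on its own; the hard cut only applies to a single line that exceeds the limit by itself.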
fig 01 alert composer retry path
happy path: check fails (api.prod · 500) → compose prompt (metrics + 50 log lines) → openai gpt-4o-mini → dispatch (slack / tg / …) → sent ✓ ack · store: 82ms
retry path: 429 → wait · jitter · retry (×3) · fallback: plain alert (no llm) after 3 fails
dashed = retry/fallback (new) · solid = happy path · adr: docs/adr/0042-retry-composer.md
--- a/internal/composer/client.go
+++ b/internal/composer/client.go
- if resp.StatusCode != 200 { return nil, ErrUpstream }
+ switch resp.StatusCode {
+ case 200:        return parse(resp), nil
+ case 429, 503:   return nil, ErrRetry   // new: retry path
+ default:         return nil, ErrUpstream
+ }
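
The diff above only classifies the status code. A rough sketch of the loop that could sit around it, assuming a caller that retries on ErrRetry and gives up on anything else: composeWithRetry, the base delay, and the jitter range are illustrative choices, not the values in internal/composer.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// Sentinel errors named after the ones in the diff; redeclared here so the
// sketch compiles on its own.
var (
	ErrRetry    = errors.New("retryable upstream error")
	ErrUpstream = errors.New("upstream error")
)

// composeWithRetry calls compose up to maxTries times. It waits base, 2*base,
// 4*base, ... plus up to 50% random jitter between attempts, and stops early
// on any error other than ErrRetry. ok is false when the caller should fall
// back to a plain alert.
func composeWithRetry(compose func() (string, error), maxTries int, base time.Duration) (body string, ok bool) {
	for attempt := 0; attempt < maxTries; attempt++ {
		body, err := compose()
		if err == nil {
			return body, true
		}
		if !errors.Is(err, ErrRetry) {
			break // not retryable
		}
		if attempt < maxTries-1 {
			backoff := base << uint(attempt)
			jitter := time.Duration(rand.Int63n(int64(backoff)/2 + 1))
			time.Sleep(backoff + jitter)
		}
	}
	return "", false
}

func main() {
	calls := 0
	compose := func() (string, error) {
		calls++
		if calls < 3 {
			return "", ErrRetry // simulate two 429s in a row
		}
		return "api.prod is returning 500s", nil
	}
	body, ok := composeWithRetry(compose, 3, 10*time.Millisecond)
	if !ok {
		body = "plain alert: api.prod check failed"
	}
	fmt.Println(body)
}
```

The jitter keeps many instances that hit the same rate limit from retrying in lockstep.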
7e0d1b

0.9.0 minor LLM checks: ask the model, not the metric

released 11 days ago · 58 commits · +2,104 −318 LoC · by @amr, @mo

New check type: type: llm. Run a query, hand the result to a model, and ask a yes/no question. This is what we use to notice that our signup funnel is sad before any per-metric threshold fires.

  • add type: llm check. Give it a SQL query and a question; the model decides if it's a problem.
  • add tellr explain <name> writes a dry-run alert for any check without sending.
  • add Config hot-reload. Save tellr.yaml, changes apply on the next tick. No restart.
  • chg Alert payload renamed check_id → check.name for consistency. Old field still emitted, will be removed in 1.0.
  • sec Signed releases with cosign. Checksums on every release page.
fig 02 llm check, end to end
steps: 1. read · 2. ship rows · 3. ask · 4. judge · 5. dispatch
config (type: llm · query: SELECT … · ask: "is today weird?") → your database (postgres · mysql, read-only creds) → result rows → model (gpt-4o-mini, your key) → verdict (yes, below 7-day median by 61%) → alert written → #ops-alerts
result rows (date · count): 2026-04-24 · 112, 2026-04-23 · 284, +5 more
model sees only: query rows + your question · model never sees: prompt templates, other checks, secrets
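
The five steps in fig 02 can be sketched in a few lines. This is an illustration under assumptions, not the shipped code: Model, Row, buildPrompt, and judge are hypothetical names, the prompt wording is made up, and a stub stands in for the real model so the sketch runs without an API key.

```go
package main

import (
	"fmt"
	"strings"
)

// Row is one result row from the check's query (hypothetical shape).
type Row struct {
	Date  string
	Count int
}

// Model is a hypothetical stand-in for the LLM client.
type Model interface {
	Ask(prompt string) (string, error)
}

// buildPrompt includes only the query rows and the user's question, which is
// the boundary described under fig 02.
func buildPrompt(rows []Row, question string) string {
	var b strings.Builder
	b.WriteString("date,count\n")
	for _, r := range rows {
		fmt.Fprintf(&b, "%s,%d\n", r.Date, r.Count)
	}
	b.WriteString("Question: " + question + "\n")
	b.WriteString("Answer yes or no, then give one short reason.")
	return b.String()
}

// judge asks the model and treats any answer starting with "yes" as a problem.
func judge(m Model, rows []Row, question string) (alert bool, verdict string, err error) {
	verdict, err = m.Ask(buildPrompt(rows, question))
	if err != nil {
		return false, "", err
	}
	alert = strings.HasPrefix(strings.ToLower(strings.TrimSpace(verdict)), "yes")
	return alert, verdict, nil
}

// stubModel returns a canned answer.
type stubModel struct{ answer string }

func (s stubModel) Ask(string) (string, error) { return s.answer, nil }

func main() {
	rows := []Row{{"2026-04-24", 112}, {"2026-04-23", 284}}
	alert, verdict, err := judge(stubModel{"yes, below 7-day median by 61%"}, rows, "is today weird?")
	if err != nil {
		panic(err)
	}
	fmt.Println(alert, verdict)
}
```

Keeping prompt construction in one small function makes it easy to audit exactly what leaves the host.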
f02b8c

0.8.5 patch Silence rules, quiet hours, routes

released 3 weeks ago · 22 commits · +612 −91 LoC · by @amr

The first release where tellr actually shuts up on purpose. Silence rules, weekly quiet windows, and per-severity routes.

  • add silence: block in config. between/and for one-offs, weekly: for recurring.
  • add routes: block. Match severity or tags, dispatch only to matching channels.
  • add tellr silence <name> --for 2h from the CLI, no config edit.
  • fix Webhook headers were dropped if the value contained : (e.g. bearer tokens with Bearer <jwt>).
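
The header bug is the classic split-on-every-colon mistake. A small sketch of the safer approach, assuming headers are configured as "Name: value" strings (the real config shape may differ) and using a hypothetical parseHeader helper:

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeader splits "Name: value" on the first colon only, so a value that
// itself contains ':' survives intact. ok is false for lines without a colon
// or with an empty name.
func parseHeader(line string) (name, value string, ok bool) {
	name, value, ok = strings.Cut(line, ":")
	if !ok {
		return "", "", false
	}
	name = strings.TrimSpace(name)
	if name == "" {
		return "", "", false
	}
	return name, strings.TrimSpace(value), true
}

func main() {
	name, value, ok := parseHeader("X-Trace: id:1234:abcd")
	fmt.Println(name, value, ok)
}
```

strings.Cut needs Go 1.18 or newer; strings.SplitN(line, ":", 2) does the same job on older toolchains.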
fig 03 a week of alerts, routed
routes: severity=error → slack + telegram · severity=warn → slack · severity=info → email (quiet)
chart: rows error, warn, info · columns mon tue wed thu fri sat sun · night silenced
shaded = silenced window (mon-fri 22:00-07:00) · dot color = channel route
c9814f

0.8.0 minor Live TUI: tellr status --live

released 6 weeks ago · 41 commits · +1,842 −112 LoC · by @mo, @amr

A terminal dashboard for when you want to watch it breathe. Redraws in place, no dashboards.

  • add tellr status --live terminal UI. Keyboard nav, q to quit.
  • add Built-in web status page at /status (the same thing you see on status.tellr.dev).
  • chg Check scheduler rewritten. Worst-case drift under load went from ~2s to ~40ms.
fig 04 tui layout
title bar: tellr · 3 targets · 11 checks
targets: api.prod ok 82ms · db.prod ok 43% · workers ok q3
status: uptime 3d 14h 22m · 127 rps · silent 3d 14h · events timeline (· quiet, ▮ hot) · q to quit
amber = filled bar (load) · red ▮ = hot event window · source: internal/tui/dash.go
5d1f4a

0.7.5 patch Disk, process, queue

released 2 months ago · 18 commits · +428 −12 LoC
  • add type: disk, type: process, type: queue (sidekiq, bullmq, sqs).
  • fix Alert bodies rendered as plain text in Telegram when they contained a lone _. Escape them.
  • perf Cut cold-start binary size from 24 MB to 8.4 MB by dropping embedded docs from the runtime (they're on docs.tellr.dev now).
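
For the Telegram fix, a sketch of escaping dynamic values (check names, metric names) before they go into a Markdown-formatted message. It assumes Telegram's legacy Markdown parse mode, where `_`, `*`, `[` and the backtick are the characters to escape; MarkdownV2 has a longer list. escapeTelegram is a hypothetical helper, not the function in the repo.

```go
package main

import (
	"fmt"
	"strings"
)

// legacyMarkdownEscaper prefixes the four characters that legacy Markdown
// treats as markup with a backslash.
var legacyMarkdownEscaper = strings.NewReplacer(
	"_", "\\_",
	"*", "\\*",
	"[", "\\[",
	"`", "\\`",
)

// escapeTelegram escapes s so it renders literally inside a legacy Markdown message.
func escapeTelegram(s string) string {
	return legacyMarkdownEscaper.Replace(s)
}

func main() {
	fmt.Println("check " + escapeTelegram("replica_lag") + " failed")
}
```

Escape only the interpolated values, not the whole body, or intentional formatting gets escaped too.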
2b1e7d

0.7.0 minor First public release

released 4 months ago · 1 commit (squashed)

0 → 1 The first time tellr left our own laptops. If you're reading this from the future, everything below this line is archaeology.

  • add HTTP, TCP, Postgres, MySQL, Redis checks.
  • add Slack, Telegram, Discord, email, webhook dispatchers.
  • add LLM-composed alert bodies (OpenAI + Anthropic + Ollama).
  • add Single-binary install, apache-2.0, no config required to start.
fig 05 architecture, v0.7
components: scheduler (ticks + jitter) · http worker · db worker · shell worker · event bus · composer · dispatcher · sqlite store · llm · sinks
solid = in-process channel · dashed = process boundary (the bus survives restarts via wal)
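
The in-process half of fig 05 maps naturally onto Go channels. A toy sketch of workers → event bus → composer → dispatcher, with hypothetical names throughout; it leaves out the scheduler, the SQLite store, the LLM call, and the process boundary.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is one check result published on the bus (hypothetical shape).
type Event struct {
	Target string
	OK     bool
	Detail string
}

// worker runs one check per target and publishes each result on the bus.
func worker(targets []string, check func(string) Event, bus chan<- Event) {
	for _, t := range targets {
		bus <- check(t)
	}
}

// compose turns failing events into alert bodies and drops passing ones.
// It closes alerts once the bus is drained.
func compose(bus <-chan Event, alerts chan<- string) {
	for ev := range bus {
		if !ev.OK {
			alerts <- fmt.Sprintf("%s: %s", ev.Target, ev.Detail)
		}
	}
	close(alerts)
}

// run wires one worker per target group to the bus, the bus to the composer,
// and collects what the dispatcher would hand to the sinks.
func run(groups [][]string, check func(string) Event) []string {
	bus := make(chan Event)
	alerts := make(chan string)
	var wg sync.WaitGroup
	for _, g := range groups {
		wg.Add(1)
		go func(g []string) {
			defer wg.Done()
			worker(g, check, bus)
		}(g)
	}
	go func() {
		wg.Wait()
		close(bus) // no more events once every worker is done
	}()
	go compose(bus, alerts)

	var sent []string
	for a := range alerts { // dispatcher loop
		sent = append(sent, a)
	}
	return sent
}

func main() {
	check := func(t string) Event {
		if t == "api.prod" {
			return Event{t, false, "500"}
		}
		return Event{t, true, ""}
	}
	fmt.Println(run([][]string{{"api.prod"}, {"db.prod"}, {"workers"}}, check))
}
```

Closing each channel from its single producer side is what lets the whole pipeline shut down cleanly without extra signalling.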
before

before 0.7 · private repo, ~80 prototype commits, nothing shipped