RELIABILITY.md

This file defines how the system proves it is healthy and restartable.

Standard Paths

  • Bootstrap: [command]
  • Verification: [command]
  • Start app or service: [command]
  • Debug or inspect runtime: [command]

Required Runtime Signals

  • structured logs for startup and critical flows
  • health checks for key services
  • trace or timing data for slow paths when available
  • user-visible error states for recoverable failures
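As an illustration of the first three signals, here is a minimal Python sketch. It assumes nothing about the actual stack: `JsonFormatter`, `health_check`, and the `fields` attribute are hypothetical names chosen for this example, and the real system may emit these signals quite differently.

```python
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (hypothetical helper)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": round(record.created, 3),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Extra context passed as `extra={"fields": {...}}` is merged in.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def health_check(checks: dict) -> dict:
    """Run each named check and report its status, error, and duration."""
    results = {}
    for name, check in checks.items():
        started = time.perf_counter()
        try:
            ok, error = bool(check()), None
        except Exception as exc:  # a failing check must not crash the probe
            ok, error = False, str(exc)
        results[name] = {
            "ok": ok,
            "error": error,
            "ms": round((time.perf_counter() - started) * 1000, 2),
        }
    healthy = all(r["ok"] for r in results.values())
    return {"healthy": healthy, "checks": results}
```

To use the formatter, attach it to a handler with `handler.setFormatter(JsonFormatter())` and log with `logger.info("startup", extra={"fields": {"service": "api"}})`. The per-check timing in `health_check` doubles as cheap timing data for slow paths.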

Golden Journeys

  • [journey 1]
  • [journey 2]
  • [journey 3]

Each golden journey should have a repeatable verification path and clear failure signals.
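One way to make a journey repeatable is to express it as an ordered list of steps and report the first step that fails. The sketch below is only an assumption about how this could look; `Journey` and `verify` are hypothetical names, and each step is assumed to raise an exception on failure.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Journey:
    name: str
    steps: List[Callable[[], None]]  # each step raises on failure


def verify(journey: Journey) -> dict:
    """Run the steps in order and stop at the first failure."""
    for index, step in enumerate(journey.steps, start=1):
        try:
            step()
        except Exception as exc:
            # The failure signal names the step and the error, so the
            # report can be read without re-running the journey.
            return {
                "journey": journey.name,
                "passed": False,
                "failed_step": index,
                "signal": f"{type(exc).__name__}: {exc}",
            }
    return {
        "journey": journey.name,
        "passed": True,
        "failed_step": None,
        "signal": None,
    }
```

Running `verify` on the same journey twice should give the same result; if it does not, the journey depends on hidden state and is not yet repeatable.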

Reliability Rules

  • No feature is complete if the system cannot restart cleanly afterward.
  • Runtime failures should be diagnosable from repo-local signals (logs, health checks, traces) rather than from external tooling alone.
  • If a repeated failure mode appears, add a benchmark or guardrail for it.
  • Cleanup is part of reliability, not a separate concern.
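The first and last rules can be turned into a small guardrail. The sketch below is a hypothetical example, not this project's actual lifecycle: it models a service that claims a lock file in a state directory, and it treats any state left behind after shutdown as a restart failure. The names `running_service`, `restarts_cleanly`, and `service.lock` are assumptions made for illustration.

```python
import os
from contextlib import contextmanager


@contextmanager
def running_service(state_dir: str):
    """Hypothetical lifecycle: claim a lock file, release it on exit."""
    lock = os.path.join(state_dir, "service.lock")
    # O_EXCL makes startup fail loudly if a previous run left the lock behind.
    fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode())
        yield lock
    finally:
        os.close(fd)
        os.remove(lock)  # cleanup runs as part of shutdown


def restarts_cleanly(state_dir: str, cycles: int = 2) -> bool:
    """Start and stop repeatedly; a stale lock or leftover file is a failure."""
    for _ in range(cycles):
        try:
            with running_service(state_dir):
                pass
        except FileExistsError:
            return False
        if os.listdir(state_dir):
            return False
    return True
```

A check like this can run as part of the verification path, pointed at an empty temporary directory, so that a feature which leaks state is caught before it is considered complete.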