
The Resilience Doctrine

A cinematic colorful sci-fi cross-section of a fortified home-server bunker in Quebec bedrock — a glowing teal Mac Mini core wired to an amber EcoFlow battery tower, ringed by defensive halos.

Every prior doctrine here quietly assumed the lights stay on, a cloud seat is one handshake away, and a human will notice a red dashboard. On a bad enough day — grid down, uplink dead, HuggingFace gone, operator unreachable, Montreal a no-go — each of those is a single point of failure nobody was watching.

The Resilience Doctrine removes those assumptions: seven small, fail-safe, mostly-dormant pieces, each giving the haus a sense it lacked — of its power, its uplink, its mortality, and its seed. None acts irreversibly unless two independent gates agree, and most ship disarmed by design, logging exactly what they would do so the first real action is never a surprise.

| Piece | Daemon / tool | Cadence | What it gives the haus |
| --- | --- | --- | --- |
| Power-sense + graceful shutdown | com.sanctum.power | 20s loop | a sense of grid-vs-battery; a clean halt before the cell dies |
| Operator dead-man’s-switch | com.sanctum.deadman | every 6h | a sense of its own mortality: a bell, never a gavel |
| WAN sentinel | com.sanctum.wan-sentinel | every 60s | a sense of its uplink; auto-flip to local seats |
| Offline-mode failover | sanctum-offline | manual | re-point every cloud seat onto the cathedral, reversibly |
| R2 break-glass backup | sanctum-r2-breakglass | daily | a geo-redundant rebuild seed on free-tier Cloudflare R2 |
| Cold model-weights archive | sanctum-cold-archive | manual | a cold, integrity-verified copy of the kyber |
| Continuity paper layer | ~/.sanctum/continuity/ | paper | a floor that needs no power |

A minimalist graphite pencil sketch on dark charcoal: seven nested protective rings around a glowing teal server cube, ringed by hand-drawn battery, hourglass, strongbox, and folded-letter icons.

The EcoFlow Delta 3 Max Plus backing the Mini passes mains through, so macOS only ever sees “AC Power” — blind to grid-vs-battery. sanctum-power.py (com.sanctum.power, KeepAlive, 20s POLL_INTERVAL) gives the haus that missing sense, reading state-of-charge over the EcoFlow IoT Open API (GET https://api-e.ecoflow.com/iot-open/sign/device/quota/all, HMAC-SHA256 signed).
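A minimal sketch of the request signing, for orientation only. The canonical string (sorted query parameters followed by accessKey, nonce, timestamp) and the header names reflect my understanding of EcoFlow's published scheme and are assumptions here; check them against the official Open API documentation before relying on them. The function name `sign_request` is hypothetical and is not taken from sanctum-power.py.

```python
import hashlib
import hmac
import secrets
import time
from typing import Dict, Optional


def sign_request(access_key: str, secret_key: str, params: Dict[str, str],
                 nonce: Optional[str] = None,
                 timestamp_ms: Optional[int] = None) -> Dict[str, str]:
    """Return signed headers for one request (header names are assumed)."""
    nonce = nonce or str(secrets.randbelow(900000) + 100000)  # 6-digit nonce
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    # Assumed canonical string: params sorted by key, then the three auth fields.
    parts = [f"{key}={params[key]}" for key in sorted(params)]
    parts += [f"accessKey={access_key}", f"nonce={nonce}",
              f"timestamp={timestamp_ms}"]
    message = "&".join(parts)
    digest = hmac.new(secret_key.encode(), message.encode(),
                      hashlib.sha256).hexdigest()
    return {"accessKey": access_key, "nonce": nonce,
            "timestamp": str(timestamp_ms), "sign": digest}
```

The resulting headers would accompany the GET to the quota endpoint named above, with the device serial passed as a query parameter (assumed here to be `sn`).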

The state machine is GRID → BATTERY → CRITICAL (thresholds WARN_SOC=40, CRITICAL_SOC=25, SHUTDOWN_SOC=15, env-overridable). The destructive halt — the most-gated action in the campaign — needs four conditions at once:

| Halt gate | Condition |
| --- | --- |
| State | CRITICAL (SoC ≤ SHUTDOWN_SOC) |
| Below-shutdown debounce | sustained past SHUTDOWN_DEBOUNCE (60s) |
| Off-mains debounce | sustained past ON_BATTERY_DEBOUNCE (45s) |
| Armed | POWER_ARMED is true |
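The four gates can be sketched as a small debounce tracker. This is an illustration under stated assumptions, not the code in sanctum-power.py: the class name `HaltGate`, the `observe` method, and the choice to treat "sustained past" as "held for at least the debounce window" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SHUTDOWN_SOC = 15            # percent, from the thresholds above
SHUTDOWN_DEBOUNCE = 60.0     # seconds at or below SHUTDOWN_SOC
ON_BATTERY_DEBOUNCE = 45.0   # seconds off mains


@dataclass
class HaltGate:
    armed: bool = False                    # mirrors POWER_ARMED
    below_since: Optional[float] = None    # when SoC first hit SHUTDOWN_SOC
    battery_since: Optional[float] = None  # when the unit first left mains

    def observe(self, now: float, soc: float, on_battery: bool) -> bool:
        """Feed one reading; return True only when all four gates agree."""
        # Start each debounce timer on the first bad reading, clear it on a good one.
        if soc <= SHUTDOWN_SOC:
            if self.below_since is None:
                self.below_since = now
        else:
            self.below_since = None
        if on_battery:
            if self.battery_since is None:
                self.battery_since = now
        else:
            self.battery_since = None

        critical = self.below_since is not None
        below_held = critical and (now - self.below_since >= SHUTDOWN_DEBOUNCE)
        battery_held = (self.battery_since is not None
                        and now - self.battery_since >= ON_BATTERY_DEBOUNCE)
        return critical and below_held and battery_held and self.armed
```

A disarmed gate can run the same logic and log the result, which matches the "log what it would do" posture: the decision is computed either way, and only the final action depends on the flag.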

The shutdown is a bounded ordered sequence — quiesce the cathedral, stop the proxyd writers (signal-cli), limactl stop the VM, sync, shutdown -h now — each step timeout-bounded, so a hung step is skipped and the run always reaches the protective sync and halt. pmset is corroboration only — under pass-through it always reads “AC Power,” so it never vetoes the EcoFlow sensor.
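One way to get "a hung step is skipped" is to give every step its own timeout and carry on regardless of the outcome. The sketch below assumes that approach; `run_bounded` and the demo steps are hypothetical stand-ins, and the commands are deliberately harmless rather than the real quiesce, limactl, sync, and shutdown calls.

```python
from __future__ import annotations

import subprocess
import sys


def run_bounded(steps: list[tuple[str, list[str], float]]) -> list[tuple[str, str]]:
    """Run each (name, argv, timeout_seconds) in order; one bad step never blocks the rest."""
    results = []
    for name, argv, timeout in steps:
        try:
            proc = subprocess.run(argv, timeout=timeout, check=False)
            outcome = "ok" if proc.returncode == 0 else f"exit {proc.returncode}"
        except subprocess.TimeoutExpired:
            # subprocess.run kills and reaps the child when the timeout expires.
            outcome = "timed out, skipped"
        except OSError as exc:
            outcome = f"could not start: {exc}"
        results.append((name, outcome))
    return results


# Harmless stand-ins: the middle step hangs on purpose to show it gets skipped.
demo_steps = [
    ("quiesce", [sys.executable, "-c", "pass"], 5.0),
    ("hung-step", [sys.executable, "-c", "import time; time.sleep(30)"], 0.5),
    ("sync", [sys.executable, "-c", "pass"], 5.0),
]
```

Calling `run_bounded(demo_steps)` returns one outcome per step, and the final step still runs after the hung one is abandoned.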

Nothing watched the WAN, so a dead uplink used to be noticed only when a cloud council call hung mid-answer. sanctum-wan-sentinel.py (com.sanctum.wan-sentinel.plist, one tick every 60s) closes that. It TCP-probes three anchors on three operators — Cloudflare 1.1.1.1:443, Google 8.8.8.8:53, Quad9 9.9.9.9:443 — plus a real recursive resolve of one.one.one.one. Verdict is UP if any signal succeeds, so no one provider’s bad day reads as a WAN loss (a RST counts as UP — the packet still round-tripped).
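The any-signal-wins verdict, including the rule that a RST counts as UP, might look like the sketch below. The function names are hypothetical, and `socket.getaddrinfo` is only a stand-in for the sentinel's real recursive resolve, since it goes through the system resolver and may be answered from a cache.

```python
import socket

ANCHORS = [("1.1.1.1", 443), ("8.8.8.8", 53), ("9.9.9.9", 443)]


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if the path to host:port is alive, whether or not the port accepts."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except ConnectionRefusedError:
        return True    # a RST means the packet round-tripped
    except OSError:
        return False   # timeout, unreachable network, and similar


def dns_probe(name: str = "one.one.one.one") -> bool:
    """Stand-in for a real resolve; any successful answer counts."""
    try:
        socket.getaddrinfo(name, 443)
        return True
    except OSError:
        return False


def wan_verdict(probe=tcp_probe, resolve=dns_probe) -> str:
    """UP if any single signal succeeds, otherwise BLACKOUT."""
    signals = [probe(host, port) for host, port in ANCHORS]
    signals.append(resolve())
    return "UP" if any(signals) else "BLACKOUT"
```

Passing the probes in as arguments keeps the verdict logic testable without touching the network.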

The clever part is the WAN-vs-LAN distinction, made via the default gateway. A blackout (every probe failing) with a reachable gateway means WAN_DOWN (uplink dead, LAN fine); an unreachable gateway means LAN_DOWN; an unreadable gateway means UNKNOWN, which holds the prior state. A down verdict needs a 2-cycle debounce; recovery is trusted immediately. When armed (SANCTUM_WAN_AUTO_OFFLINE in {1,true,yes}) and WAN_DOWN, it calls sanctum-offline on; on LAN_DOWN it pages only, because local seats cannot help an islanded box.
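The classification and debounce rules above can be sketched as a small state holder. The class `LinkState`, the helper `next_action`, and the decision to leave the debounce counter untouched on an UNKNOWN tick are assumptions for illustration, not a description of sanctum-wan-sentinel.py.

```python
from typing import Optional

DEBOUNCE_CYCLES = 2


class LinkState:
    def __init__(self) -> None:
        self.state = "UP"
        self._pending: Optional[str] = None
        self._count = 0

    def tick(self, any_probe_ok: bool, gateway_ok: Optional[bool]) -> str:
        """gateway_ok: True reachable, False unreachable, None unreadable."""
        if any_probe_ok:
            # Recovery is trusted immediately.
            self.state, self._pending, self._count = "UP", None, 0
            return self.state
        if gateway_ok is None:
            return self.state  # UNKNOWN holds the prior state
        candidate = "WAN_DOWN" if gateway_ok else "LAN_DOWN"
        if candidate == self._pending:
            self._count += 1
        else:
            self._pending, self._count = candidate, 1
        if self._count >= DEBOUNCE_CYCLES:
            self.state = candidate
        return self.state


def next_action(state: str, armed: bool) -> str:
    """Map a debounced state to what the sentinel would do."""
    if state == "WAN_DOWN" and armed:
        return "sanctum-offline on"
    if state in ("WAN_DOWN", "LAN_DOWN"):
        return "page-only"
    return "none"
```

With a 60s tick and a 2-cycle debounce, a down verdict takes two consecutive failing ticks to commit, while a single good probe clears it.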

A sense of where to think when the cloud is gone


sanctum-offline is the hands the WAN sentinel reaches for: a manual, reversible toggle, not a daemon. It snapshots the live proxyd config (~/.sanctum/sanctum-proxy/config.yaml) to config.online-snapshot.yaml, then atomically rewrites every cloud seat and the fallback graph onto the cathedral (qwen3.6-35b-a3b-4bit at https://127.0.0.1:1337, mtls). proxyd hot-reloads on the change, so no restart is needed. The code seat (Codestral on :3301) is already local and is left untouched. Running off restores the snapshot bytes verbatim, then deletes the snapshot.
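A common way to get the snapshot-then-atomic-rewrite behaviour is to write to a temporary file in the same directory and rename it over the target. The sketch below assumes that technique; `atomic_write`, `go_offline`, and `go_online` are hypothetical names, and the guard against an existing snapshot is a simplification of the tool's real refusal rules.

```python
import os
import shutil
import tempfile
from pathlib import Path


def atomic_write(path: Path, data: bytes) -> None:
    """Replace path in one rename so readers never see a half-written file."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def go_offline(config: Path, snapshot: Path, offline_bytes: bytes) -> None:
    # Simplified guard: never overwrite an existing snapshot.
    if snapshot.exists():
        raise FileExistsError(f"snapshot already present: {snapshot}")
    shutil.copy2(config, snapshot)
    atomic_write(config, offline_bytes)


def go_online(config: Path, snapshot: Path) -> None:
    # Restore the snapshot bytes verbatim, then delete the snapshot.
    atomic_write(config, snapshot.read_bytes())
    snapshot.unlink()
```

Because the rename either happens or does not, a watcher that hot-reloads on change only ever sees the old config or the complete new one.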

A “cloud” seat is one whose provider is openrouter or whose api_base carries :3456 (claude) or :6543 (gemini). Rather than half-migrate, it refuses:

| Refusal | Why |
| --- | --- |
| exit 5 | generated config still references a cloud backend |
| exit 4 | a snapshot already exists with live cloud refs; won’t clobber the true online seed |
| exit 2 | live config missing or unparseable (off also refuses a bad snapshot) |

It does not health-probe the local backends — it audits config text, not silicon — so a down cathedral still toggles green, then answers nothing.
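The cloud-seat rule and the exit 5 audit are simple enough to sketch as text checks. The seat shape used here (a mapping of seat name to a dict with `provider` and `api_base`) is an assumption about the config layout, and the function names are hypothetical.

```python
from typing import Dict

CLOUD_PROVIDERS = {"openrouter"}
CLOUD_PORT_MARKERS = (":3456", ":6543")  # claude, gemini


def is_cloud_seat(seat: dict) -> bool:
    """A seat is cloud if its provider or its api_base port marks it so."""
    api_base = str(seat.get("api_base", ""))
    return (seat.get("provider") in CLOUD_PROVIDERS
            or any(marker in api_base for marker in CLOUD_PORT_MARKERS))


def audit_generated(seats: Dict[str, dict]) -> int:
    """Return 5 if any generated seat still points at a cloud backend, else 0."""
    leftovers = [name for name, seat in seats.items() if is_cloud_seat(seat)]
    return 5 if leftovers else 0
```

As the text notes, this kind of check inspects configuration only. A seat pointing at a local backend that happens to be down passes the audit.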

Every key in the haus gates on Bert; if he becomes unavailable, the Sanctum is a black box the family cannot open. sanctum-deadman.py (com.sanctum.deadman, StartInterval 21600s ≈ 6h) closes that gap. Bert proves liveness with sanctum-alive (weekly is plenty); any beat resets the ladder to LIVE, silence climbs it:

| Tier | Days silent | What fires |
| --- | --- | --- |
| LIVE | beat fresh | nothing |
| REMIND | 3d | self-reminder to Bert (warn) |
| CHALLENGE | 6d | out-of-band liveness challenge (critical) |
| ALERT | 9d | louder; executor heads-up is staged, not sent (critical) |
| FLAG | 12d | raise deadman-PRESUMED-ABSENT.flag + audit bundle (critical) |

It is a bell, not a gavel: notify-only, never auto-contacting family — the flag itself says “This is NOT proof. Follow the Letter of Instruction.” Liveness is not gated on power or WAN; a missed beat with everything else up still climbs. The audit at ~/.sanctum/state/deadman-audit.jsonl is append-only (UF_APPEND). An alternate --cancel-code channel resets the ladder when the CLI is out of reach; that code lives only on a gitignored paper card in the fire-safe — never quoted here.
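The ladder reduces to a lookup from days of silence to a tier. In this sketch the thresholds come from the table above, while the function names and the choice to treat each threshold as inclusive are assumptions.

```python
# Highest tier first, so the first match wins.
LADDER = [(12, "FLAG"), (9, "ALERT"), (6, "CHALLENGE"), (3, "REMIND")]
SECONDS_PER_DAY = 86400


def tier_for(days_silent: float) -> str:
    """Map days since the last beat to a ladder tier."""
    for threshold, tier in LADDER:
        if days_silent >= threshold:
            return tier
    return "LIVE"


def tier_at(last_beat_epoch: float, now_epoch: float) -> str:
    """Same mapping, starting from two Unix timestamps."""
    return tier_for((now_epoch - last_beat_epoch) / SECONDS_PER_DAY)
```

Because the tier depends only on the time since the last beat, a fresh beat returns the ladder to LIVE on the next 6h run without any extra reset logic.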

Two cold layers survive Montreal itself. sanctum-r2-breakglass ships the minimal rebuild seed — secrets, certs, runbooks, scripts, sentinels, SOPS keys, memory — via restic to a free-tier Cloudflare R2 bucket (sanctum-breakglass), client-side encrypted with the same keychain passphrase (sanctum-backup-key) as every other sanctum repo. Sized to fit R2’s free 10 GB ceiling (~135 MB raw, well under 1 GB after dedup), it alerts past FREE_TIER_WARN_GB=8; a companion sanctum-r2-mint provisions and SigV4-verifies the S3 creds. The full 25 GB restic repo deliberately stays off R2 — it lives on the T9 SSD and Google Drive.

sanctum-cold-archive copies the kyber itself. sanctum-backup.sh excludes the model weights as “re-downloadable” — but in a HuggingFace-down world they are not, and the council goes mute. This tool tars each cached model (~/.cache/huggingface/hub, ~/.cache/lm-studio/models) through zstd -19, optionally age-encrypts it, and records a sha256 in MANIFEST.json so --verify can prove every archive later. It is resumable and never mutates its sources.
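The record-then-verify idea can be sketched with a streaming sha256 and a JSON manifest. The manifest layout (archive file name mapped to hex digest) and the function names are assumptions; the real MANIFEST.json may carry more fields.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict


def sha256_file(path: Path, chunk: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large archives never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def record(archive: Path, manifest: Path) -> None:
    """Add or update one archive's digest in the manifest."""
    entries = json.loads(manifest.read_text()) if manifest.exists() else {}
    entries[archive.name] = sha256_file(archive)
    manifest.write_text(json.dumps(entries, indent=2, sort_keys=True))


def verify(manifest: Path) -> Dict[str, bool]:
    """Re-hash every listed archive next to the manifest; True means intact."""
    entries = json.loads(manifest.read_text())
    root = manifest.parent
    return {name: (root / name).exists() and sha256_file(root / name) == digest
            for name, digest in entries.items()}
```

Verification reads only the archives and the manifest, so it can run against a cold drive without touching the original model caches.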

Under all of it is ~/.sanctum/continuity/ — a printed, secret-free kit in the fire-safe: a Letter of Instruction (whose Section 4 is the data-recovery procedure), a family one-pager, a service-restart cheatsheet, blank ICE medical cards, and a wartime doctrine. Every credential is a pointer (a 1Password item name, a keychain service name), never a value. The one secret-bearing document, the dead-man cancel card, is paper-only and gitignored — never quoted, photographed, or committed.

The wartime doctrine codifies the rails the whole campaign obeys (Council ruling, 2026-06-19): there is no automated wipe, and if one were ever built it must require two hands, a cooldown to abort, default to safe on uncertainty, and never fire on ambiguity.

Dormant by design — what arms each piece


Most pieces ship inert. The gate is always a credential or a flag only Bert should set, so nothing acts until he says so.

| Piece | Dormant state | What arms it |
| --- | --- | --- |
| sanctum-power | no EcoFlow keys → UNKNOWN, silent | ecoflow-{access-key,secret-key,sn} in ~/.sanctum/secrets/ |
| sanctum-power halt | logs the plan, never halts | POWER_ARMED in {1,true,yes} |
| sanctum-wan-sentinel | logs disarmed:page-only | SANCTUM_WAN_AUTO_OFFLINE in {1,true,yes} + sanctum-offline present |
| sanctum-r2-breakglass | logs “DORMANT”, exits 0 | R2 creds in keychain via sanctum-r2-mint |
| sanctum-cold-archive | dry-run only | COLD_ARCHIVE_DEST on a mounted, writable drive |
| continuity kit | [FILL-IN] placeholders | Bert hand-fills the paper templates |
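The flag-style gates share one shape: an environment variable that arms only on an explicit truthy value. A minimal sketch, assuming case-insensitive matching and surrounding-whitespace tolerance (neither is stated above); `is_armed` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional

TRUTHY = {"1", "true", "yes"}


def is_armed(flag: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """True only for an explicit truthy value; unset or anything else stays disarmed."""
    env = os.environ if env is None else env
    return env.get(flag, "").strip().lower() in TRUTHY
```

Defaulting to disarmed on any unrecognized value keeps a typo from arming a destructive path: `is_armed("POWER_ARMED", {"POWER_ARMED": "on"})` is False under this sketch.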

The honest version: the haus is ready to be made ready. The reflexes are built, exercised by zero-network self-test fixtures, and wired to Force Flow (http://127.0.0.1:4077/notify) — but most ship disarmed, and several stay dormant until Bert provisions a credential or a drive only he should hold. That is the doctrine, not a gap: a reflex you cannot trust to fail safe is worse than none, and family contact stays human-gated by council non-negotiable.

What changed is the floor. Before this campaign every one of those failure modes was silent; now each has a sense, a sentinel, and a printed page in a fire-safe that survives even the loss of the box holding everything else.