2026-05-06

Wired a cron-driven roster ingest on caissaresearch.com, the chess analytics product I run. The setup was a players_seed.yaml with curated entries plus an apply_roster_v0(conn) function that overlays per-handle config (auto_ingest_enabled = 1, auto_ingest_max_per_day = 5) for the 10 canonical handles, five players times two platforms. The seed runs on every Railway container start via scripts/run_pending_migrations.py. Idempotent on already-applied state.

The first live tick exposed the gap. 7 of 10 handles fetched cleanly. Three of the lichess handles 404’d on /api/games/user/. Manual check confirmed four of the five streamer lichess accounts return 404 on the games endpoint. The accounts exist as profiles; they have zero public games. There’s nothing for the cron to pull.

Trim the canonical list from 10 to 7. The four dead lichess handles get dropped. Re-run the seed.

The seed was a no-op on the dropped handles. They stayed auto_ingest_enabled = 1 from the original apply, because removing an entry from the canonical list is structurally invisible to a function that only writes when an entry exists. The cron kept hitting them every hour.
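A minimal sketch of that failure mode, assuming SQLite and a single hypothetical handles table (the real schema, the handle names, and the body of apply_roster_v0 are not shown here, so everything below is illustrative). The apply step only loops over entries still in the list, so a dropped entry is never visited and its flag never changes.

```python
import sqlite3

# Hypothetical schema: the real table and column layout is an assumption.
SCHEMA = """
CREATE TABLE handles (
    platform TEXT NOT NULL,
    handle TEXT NOT NULL,
    auto_ingest_enabled INTEGER NOT NULL DEFAULT 0,
    auto_ingest_max_per_day INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (platform, handle)
);
"""

def apply_additive(conn, canonical):
    """Upsert-only apply: touches a row only if its entry is in the list."""
    for platform, handle in canonical:
        conn.execute(
            """
            INSERT INTO handles
                (platform, handle, auto_ingest_enabled, auto_ingest_max_per_day)
            VALUES (?, ?, 1, 5)
            ON CONFLICT (platform, handle) DO UPDATE SET
                auto_ingest_enabled = 1,
                auto_ingest_max_per_day = 5
            """,
            (platform, handle),
        )
    conn.commit()

def enabled_handles(conn):
    """Return the (platform, handle) pairs the cron would still pick up."""
    rows = conn.execute(
        "SELECT platform, handle FROM handles "
        "WHERE auto_ingest_enabled = 1 ORDER BY platform, handle"
    )
    return [tuple(row) for row in rows]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder handles, not the real roster.
before_trim = [("chesscom", "player_a"), ("lichess", "player_a")]
after_trim = [("chesscom", "player_a")]  # the lichess entry was dropped

apply_additive(conn, before_trim)
apply_additive(conn, after_trim)  # idempotent, and blind to the removal
zombie_enabled = enabled_handles(conn)
```

After the second apply, zombie_enabled still contains the dropped lichess pair, which is the state the cron kept acting on.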

The fix was to make the apply step authoritative in both directions. After upserting the canonical handles, scan every handle owned by a roster player; any handle not in the canonical list gets auto_ingest_enabled = 0. Scope the disable to roster players only, so unrelated handles owned by non-roster players (admin sync, mining candidates) aren’t touched.
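A sketch of the two-direction apply, again assuming SQLite. The players.is_roster column, the table layout, the function name apply_roster_both_ways, and the handle names are all hypothetical stand-ins for however roster ownership is actually modeled. For brevity the enable step overlays config onto rows that already exist rather than doing a full upsert.

```python
import sqlite3

# Hypothetical schema: the real tables and columns are an assumption.
SCHEMA = """
CREATE TABLE players (
    id INTEGER PRIMARY KEY,
    is_roster INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE handles (
    player_id INTEGER NOT NULL REFERENCES players(id),
    platform TEXT NOT NULL,
    handle TEXT NOT NULL,
    auto_ingest_enabled INTEGER NOT NULL DEFAULT 0,
    auto_ingest_max_per_day INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (platform, handle)
);
"""

def apply_roster_both_ways(conn, canonical):
    """Enable canonical handles, then disable roster-owned handles not listed."""
    canonical = set(canonical)
    with conn:  # one transaction: commit on success, roll back on error
        # Direction 1: overlay config onto every canonical handle.
        for platform, handle in sorted(canonical):
            conn.execute(
                "UPDATE handles SET auto_ingest_enabled = 1, "
                "auto_ingest_max_per_day = 5 "
                "WHERE platform = ? AND handle = ?",
                (platform, handle),
            )
        # Direction 2: find enabled handles owned by roster players only.
        owned = conn.execute(
            "SELECT h.platform, h.handle FROM handles h "
            "JOIN players p ON p.id = h.player_id "
            "WHERE p.is_roster = 1 AND h.auto_ingest_enabled = 1 "
            "ORDER BY h.platform, h.handle"
        ).fetchall()
        disabled = [pair for pair in owned if pair not in canonical]
        for platform, handle in disabled:
            conn.execute(
                "UPDATE handles SET auto_ingest_enabled = 0 "
                "WHERE platform = ? AND handle = ?",
                (platform, handle),
            )
    return disabled

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Player 1 is on the roster; player 2 is not (e.g. an admin sync account).
conn.executemany("INSERT INTO players (id, is_roster) VALUES (?, ?)", [(1, 1), (2, 0)])
conn.executemany(
    "INSERT INTO handles VALUES (?, ?, ?, 1, 5)",
    [(1, "chesscom", "roster_a"), (1, "lichess", "roster_a"), (2, "lichess", "admin_sync")],
)
conn.commit()

disabled = apply_roster_both_ways(conn, {("chesscom", "roster_a")})
still_enabled = [
    tuple(row)
    for row in conn.execute(
        "SELECT platform, handle FROM handles "
        "WHERE auto_ingest_enabled = 1 ORDER BY platform, handle"
    )
]
```

Returning the disabled pairs gives the migration runner something to log, so a removal shows up in the deploy output instead of happening silently. The join on is_roster is what keeps the non-roster handle enabled.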

This pattern shows up anywhere a config file or a list constant is called “the source of truth for X.” The seed enables. The source of truth changes, an entry gets dropped, and the change is invisible to the seed because the seed only acts on rows that still exist in the file. The result is zombie config: handles, flags, schedules that haven’t been in the source of truth for weeks but are still running because nothing told them to stop.

A seed whose contract reads “source of truth for X” has to be authoritative on removal, not just installation. Otherwise its real contract is “source of truth for X-additions,” and the two diverge the first time you delete an entry, which is also the first time a seed is meant to be useful. Drift accumulates one dropped entry at a time, invisible until one of them actively breaks something downstream.
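One way to make that drift visible before something breaks is a set comparison between what the source of truth lists and what is actually enabled. This is a sketch only: roster_drift is a hypothetical helper, and the handle names are placeholders.

```python
def roster_drift(canonical, enabled):
    """Compare the source-of-truth list with what is actually enabled."""
    canonical, enabled = set(canonical), set(enabled)
    return {
        # Enabled but no longer listed: zombie config.
        "zombies": sorted(enabled - canonical),
        # Listed but not enabled: the seed has not been applied yet.
        "missing": sorted(canonical - enabled),
    }

drift = roster_drift(
    canonical=[("chesscom", "player_a")],
    enabled=[("chesscom", "player_a"), ("lichess", "player_a")],
)
```

Under an additions-only seed the zombies list grows with every deletion; under an authoritative one it should stay empty after each apply.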