2026-05-03

The end state looks clean: 251,166 completed position evals, 0 pending, 0 running, 0 error. Those counts hide the important part: the backfill finished because the rollout was staged to keep surprises small.

The plan started with one-position and one-game smoke tests, then moved through repeated 1-, 5-, 10-, and 100-game runs before the larger 500-game chunks. Every run had an analysis_jobs row before workers started. Every batch had a ledger row. Every plan had a cost cap. Modal received claimed position keys, not permission to discover work on its own. Railway and SQLite stayed the source of truth.
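The "claimed position keys" rule can be sketched as a single transactional claim against the queue. The post doesn't show the actual schema, so the table and column names below (position_queue, position_key, status) are assumptions; the point is that the worker is handed keys rather than querying for work itself.

```python
import sqlite3

# Hypothetical schema sketch: the post names analysis_jobs and a queue of
# position keys but not columns, so names here are illustrative.
def claim_batch(conn: sqlite3.Connection, job_id: int, batch_size: int) -> list[str]:
    """Atomically claim up to batch_size pending positions for one batch.

    The worker receives only the returned keys; it never discovers work on
    its own, so the SQLite queue stays the single source of truth.
    """
    with conn:  # one transaction: the claim is all-or-nothing
        keys = [row[0] for row in conn.execute(
            "SELECT position_key FROM position_queue WHERE status = 'pending' LIMIT ?",
            (batch_size,),
        )]
        if keys:
            placeholders = ",".join("?" * len(keys))
            conn.execute(
                f"UPDATE position_queue SET status = 'running', job_id = ? "
                f"WHERE position_key IN ({placeholders})",
                [job_id, *keys],
            )
    return keys
```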

The ladder surfaced problems in the right order. Small runs validated signed write-back and job counters. Mid-size runs exposed a MultiPV tail-depth failure. Larger runs exposed SQLite parameter limits and duplicate callback accounting. The final chunks exposed a late error callback that arrived after a batch had already completed. Each fix changed the platform before the next scale step.
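Of those, the SQLite parameter-limit failure has a standard fix worth sketching: chunk large `IN (...)` lists so no single statement exceeds the host-parameter cap (999 on older SQLite builds, 32766 since 3.32). The table name and the chunk size of 500 are assumptions, not the project's actual values.

```python
import sqlite3
from itertools import islice

# Illustrative fix for the parameter-limit failure: a large IN (...) list
# built from position keys can exceed SQLite's per-statement parameter cap
# and fail outright. Chunking keeps every statement under the cap; 500 is a
# conservative assumed size.
def mark_completed(conn: sqlite3.Connection, keys: list[str], chunk: int = 500) -> None:
    it = iter(keys)
    while batch := list(islice(it, chunk)):
        placeholders = ",".join("?" * len(batch))
        conn.execute(
            f"UPDATE position_queue SET status = 'completed' "
            f"WHERE position_key IN ({placeholders})",
            batch,
        )
```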

The final tuning point was a batch size of 32, max_containers=100, and a 300s per-position cap. At that shape, the larger runs sustained roughly 47-60 positions/sec, with write-back p95 under a second and queue status returning clean after every chunk.
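As a sanity check on those numbers, the sustained rate implies the whole corpus represents on the order of an hour to an hour and a half of steady-state compute:

```python
# Back-of-envelope check: 251,166 positions at the sustained 47-60
# positions/sec range works out to roughly 70-89 minutes of steady-state
# compute for the full corpus (ignoring ramp-up between chunks).
positions = 251_166
fast, slow = positions / 60 / 60, positions / 47 / 60  # minutes at each rate
print(f"~{fast:.0f} to {slow:.0f} minutes")  # ~70 to 89 minutes
```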

The winning abstraction was not “send games to Modal.” It was:

game corpus -> pure position keys -> durable queue -> Modal batches -> signed write-back -> materialized move evals
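The "pure position keys" stage is the load-bearing step. A minimal sketch, assuming keys are derived deterministically from game id and ply (the post doesn't specify the actual key format):

```python
import hashlib

# Hypothetical key derivation: deterministic, so the same (game, ply) always
# maps to the same key. That lets a key exist in the queue before any worker
# runs, and lets retries and callbacks address the exact same unit of work.
def position_key(game_id: str, ply: int) -> str:
    return hashlib.sha256(f"{game_id}:{ply}".encode()).hexdigest()[:16]
```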

That gave every unit of work an identity before compute started and a final status after compute ended. It also made retries surgical. A failed 32-position batch could become four 8-position batches, then single-position retries, without losing track of the parent job or double-counting successful work.
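The splitting rule can be sketched as a pure function over key lists; the split factor of 4 comes from the 32-into-four-8s example above, and parent-job bookkeeping is assumed to live elsewhere in the ledger.

```python
# Surgical retry sketch: a failed batch splits into smaller batches rather
# than being retried whole. A 32-position batch becomes four 8-position
# batches; applying the rule again bottoms out at single-position retries.
def split_failed_batch(keys: list[str], factor: int = 4) -> list[list[str]]:
    if len(keys) <= 1:
        return [keys]
    size = -(-len(keys) // factor)  # ceiling division
    return [keys[i:i + size] for i in range(0, len(keys), size)]
```

Because the split preserves the keys themselves, successful sub-batches write back against the same identities and the parent job's counters never double-count.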

The rollout stayed boring by design. Cost caps prevented surprise spend. Idempotent counters handled duplicate callbacks. Late errors could not poison completed batches. Queue status made the stop condition visible. Production fan-out worked because the control plane was more important than the workers.
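The two accounting guarantees named above can be sketched in a few lines; the field names are hypothetical (the real ledger lives in SQLite), but the invariants are the ones from the rollout: duplicates must not double-count, and late errors must not change a batch that already completed.

```python
# In-memory sketch of the callback accounting, with assumed field names.
def apply_callback(batch: dict, position_key: str, ok: bool) -> bool:
    """Returns True if the callback changed state, False if it was ignored."""
    if batch["status"] in ("completed", "error"):
        return False                   # late callback: batch already final
    if position_key in batch["seen"]:
        return False                   # duplicate callback: already counted
    batch["seen"].add(position_key)
    batch["done" if ok else "errors"] += 1
    if batch["done"] + batch["errors"] == batch["size"]:
        batch["status"] = "completed" if batch["errors"] == 0 else "error"
    return True
```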