2026-05-06

Plausible-shaped errors are the failure mode of LLM-drafted methodology docs. A construct-validity mapping I scaffolded with Claude (467 lines linking my kernel’s symbolic-state primitives to Gottman, EFT, and adult-attachment constructs) went through a primary-source verification pass before integration. Several v0.5 claims didn’t survive.

agent-lab is a typed-state simulation kernel I built where two LLM agents run an N-turn dyadic dialogue and emit per-turn structured reflections (trust deltas, warmth deltas, sacred-cow violations). Before claiming the kernel “measures dyadic constructs,” I needed a citation-grounded mapping from primitives to literature.

Three categories of error surfaced in the verification pass.

Wrong title. Makinen & Johnson (2006) was scaffolded as “Steps toward a process-resolution model”; the actual title is “Steps Toward Forgiveness and Reconciliation.” A plausible synonym. The citation database hits the wrong paper.

Overclaimed magnitudes. “Contempt is the strongest predictor of divorce” was scaffolded as load-bearing; the literature treats it as one of four horsemen, not ranked. The 86%/33% bid-acceptance percentages from Gottman’s bid research were stated as kernel thresholds; in the source they’re cohort statistics, not normative targets. The 5:1 positive-to-negative interaction ratio was framed as a kernel threshold; in Gottman it’s a system-level reference point (and the actual measured ratios from Gottman & Levenson 1992 are 5.76 / 5.82 vs 0.67 / 1.06, not a clean 5:1).

Missing sources. Gottman & Levenson (1992) for the actual ratio numbers, Gottman (1993) typology, Millikin (2000) attachment-injury precursor; none of these were in the v0.5 reading list, and several v0.5 claims were quietly downstream of them.

The discipline is to keep the LLM-scaffolding step but make verification a separate phase. Scaffolding gets you a structure to argue with; verification keeps you from publishing the argument the model imagined. The marker convention ([verify-X] inline, per-claim attribution required before the marker comes out) made the boundary visible during the read. Twelve sources verified across Gottman, EFT, adult-attachment, and Karpman lineages; markers don’t come out until the source ledger entry exists.

For methodology docs that get cited externally, skipping the verification phase isn’t a subtle failure. It’s a public claim with the wrong citation underneath.