2026-03-09
The entity-update job runs 12 times a day to detect new signals per ticker and re-synthesize entity state when something changed. Production showed an unmistakable pattern: the 05:45 run resynthesized 13 entities, the 07:45 run 12, and the 16:45 run 0. Same code, same data, yet once the morning passes had run, every later run quietly skipped everything.
The bug was in the new_signals detection. It compared str(signal.created_at) against entity.updated_at.isoformat() as strings. Python’s str(datetime) produces a space separator ("2026-03-08 18:00:00+00:00"); isoformat() produces a T ("2026-03-08T18:00:00+00:00"). When both timestamps fall on the same date, the strings are identical through the date prefix, so the comparison is decided at the separator, and space (0x20) sorts before T (0x54). Every same-day signal therefore compared as older than the entity’s last update, regardless of the actual timestamp. Only the first run after a date boundary, where the signal’s date string itself sorted later, could detect anything.
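A minimal reproduction of the mismatch, using made-up timestamps rather than production data. The variable names are illustrative and not taken from the job's code.

```python
from datetime import datetime, timezone

# Hypothetical values: entity updated at 18:00 UTC, signal created two hours later.
entity_updated_at = datetime(2026, 3, 8, 18, 0, tzinfo=timezone.utc)
signal_created_at = datetime(2026, 3, 8, 20, 0, tzinfo=timezone.utc)

signal_str = str(signal_created_at)         # "2026-03-08 20:00:00+00:00"
entity_str = entity_updated_at.isoformat()  # "2026-03-08T18:00:00+00:00"

# Same date prefix, so the separator at index 10 decides: " " < "T".
is_new_by_string = signal_str > entity_str                  # False (wrong)
is_new_by_datetime = signal_created_at > entity_updated_at  # True (right)

# A signal dated the next day wins on the date digits, before the separator
# is ever reached, which is why the first run after midnight still worked.
next_day_signal = datetime(2026, 3, 9, 1, 0, tzinfo=timezone.utc)
next_day_is_new_by_string = str(next_day_signal) > entity_str  # True
```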
The fix was straightforward: parse created_at to a datetime, ensure timezone awareness, and compare the objects directly. The same change tightened an adjacent stale-override path from >=1 new signal to >=2, after a single bullish article had been flipping bearish entities with no counterbalance. The >=1 path had also paid for 362 unnecessary resyntheses (~$4.34) that produced identical output.
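A sketch of what the datetime-based comparison could look like. The helper names to_aware_utc and is_new_signal are hypothetical, and treating naive timestamps as UTC is an assumption that would need to match how the signals are actually stored.

```python
from datetime import datetime, timezone
from typing import Union

Timestamp = Union[str, datetime]


def to_aware_utc(value: Timestamp) -> datetime:
    """Return a timezone-aware datetime from a string or datetime."""
    # fromisoformat accepts either a space or a "T" between date and time.
    dt = datetime.fromisoformat(value) if isinstance(value, str) else value
    if dt.tzinfo is None:
        # Assumption: naive timestamps are UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt


def is_new_signal(created_at: Timestamp, entity_updated_at: Timestamp) -> bool:
    """Compare instants, not their string renderings."""
    return to_aware_utc(created_at) > to_aware_utc(entity_updated_at)
```

Because aware datetimes compare by instant, this also stays correct when the two sides carry different UTC offsets, which no string comparison would handle.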
The trap with comparing stringified datetimes is that the comparison “works.” It returns a boolean, never raises, and stays plausible until the two stringifiers in the codebase disagree on a single-character detail such as the date-time separator. Lexicographic comparisons of formatted dates only behave like temporal comparisons when both sides go through the same formatter. Two formatters is one too many, and the failure mode is silent rather than loud.