When the Agents Outnumber the Thoughts

Ninety Sessions and a Thirteen-Hour Side Quest
Today was a day of volume. I ran somewhere north of ninety Claude sessions across Distill, VerMAS, and a handful of smaller projects. Most of them lasted under a minute: extract memory, classify content, generate image prompts, synthesize context. The pipeline doing its thing. But buried in the noise were two marathon sessions on VerMAS (nearly 7 hours and over 10 hours respectively) and a 13-hour-and-54-minute monster on Distill itself that touched 25 files across 12 directories. That last one was the real work. The rest was the system I built doing its job without me.
There's a specific feeling when you look at your session log and most of it is your own infrastructure talking to itself. Post-workflow analysts analyzing workflows. Context synthesizers synthesizing context. Memory extractors extracting memory. I counted six separate "post-workflow analyst" sessions today, each lasting about a minute, each dutifully producing learnings from completed multi-agent orchestrations. The recursion is getting deep enough that I should probably stop and ask whether the analysis of the analysis is adding signal or just generating heat.
I think it's adding signal. But I'm less sure than I was last week.

Cognitive Debt Has a Name Now
One of today's reads landed hard. Simon Willison linked to a piece on how generative and agentic AI shift concern from technical debt to cognitive debt. The framing crystallizes something I've been circling for days. Technical debt is code you understand but wish you'd written differently. Cognitive debt is code that works but that nobody fully understands anymore.
I'm accumulating cognitive debt right now. The Distill pipeline has a knowledge graph with context scoring, structural analysis, prompt injection, and persistence. It has a unified memory system that migrates across three different JSON formats. It has nine intake parsers, six publisher types, and a blog synthesizer that calls Claude via subprocess while injecting project context and editorial notes. I built all of it. I can still trace every wire. But the margin is shrinking.
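The three-format migration is the kind of wire I can still trace. The shape of it, stripped down, is a chain of version-bump functions; the field names below are invented for illustration, not the real Distill schema:

```python
import json

# Hypothetical migration chain: each step upgrades one schema version.
def migrate_v1_to_v2(doc):
    # v1 stored memories as a flat list; v2 keys them by id
    doc["memories"] = {m["id"]: m for m in doc.pop("items", [])}
    doc["version"] = 2
    return doc

def migrate_v2_to_v3(doc):
    # v3 adds a provenance field to every memory
    for m in doc["memories"].values():
        m.setdefault("source", "unknown")
    doc["version"] = 3
    return doc

MIGRATIONS = {1: migrate_v1_to_v2, 2: migrate_v2_to_v3}

def load_memory(raw: str) -> dict:
    # Apply migrations until the document reaches the current version
    doc = json.loads(raw)
    while doc.get("version", 1) < 3:
        doc = MIGRATIONS[doc.get("version", 1)](doc)
    return doc
```

The nice property of the chain is that old files stay loadable forever without any branch knowing about more than one version hop. The unsettling property is that the chain only grows.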
The VerMAS sessions today were telling. Agents registering, heartbeating, acknowledging messages, signaling done. The supervised workflow for adding special character handling ran three separate dev-QA cycles across 11 minutes before converging. The roster workflows for pad-right, pad-center, indent, and truncate functions each spun up dev and QA agents that coordinated through blackboard writes and message passing. It works. It also means I now have agents whose primary activity is coordinating with other agents about coordination.
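For anyone who hasn't seen the pattern: a blackboard lets agents coordinate by reading and writing shared keys instead of calling each other directly. A minimal sketch, with invented names rather than VerMAS's actual API, of what one dev-QA cycle looks like as blackboard traffic:

```python
import time
from dataclasses import dataclass, field

# Minimal blackboard: shared key-value entries plus per-agent heartbeats.
@dataclass
class Blackboard:
    entries: dict = field(default_factory=dict)
    heartbeats: dict = field(default_factory=dict)

    def write(self, agent: str, key: str, value):
        # Every write records who wrote it and when, and counts as liveness
        self.entries[key] = {"by": agent, "value": value, "at": time.time()}
        self.beat(agent)

    def read(self, key):
        entry = self.entries.get(key)
        return entry["value"] if entry else None

    def beat(self, agent: str):
        self.heartbeats[agent] = time.time()

# One dev-QA cycle for a pad_right function, expressed as writes:
bb = Blackboard()
bb.write("dev-1", "patch/pad-right", "def pad_right(s, n): ...")
bb.write("qa-1", "review/pad-right", "fail: missing fill-char arg")
bb.write("dev-1", "patch/pad-right", "def pad_right(s, n, fill=' '): ...")
bb.write("qa-1", "review/pad-right", "pass")
```

The appeal is that neither agent holds a reference to the other; the cost is that the coordination state itself becomes one more thing somebody has to understand.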
Eric Meyer's quote about CSS, which Willison also surfaced today, applies here too: "It's not bloated, it's fantastically ambitious. Its reach is greater than most of us can hope to grasp." That's generous. I'll take it.

Two Speeds of Inference, Two Speeds of Thinking
Sean Goedecke wrote about two different tricks for fast LLM inference, and the distinction maps onto something I noticed in my own workflow today. The sub-minute sessions (memory extraction, entity classification, content classification) are all structured-output tasks. They don't need deep reasoning. They need pattern matching at speed. The multi-hour sessions are the opposite: open-ended exploration, architectural decisions, refactoring with side effects.
I've been treating both kinds of work the same way, routing them through the same pipeline with the same prompts and the same model. That's probably wrong. The hundred-odd classification tasks I ran today would have been fine with a smaller, faster model. The 14-hour Distill session needed every token of context it could get. Andrew Nesbitt's post about separating download from install in Docker builds makes the same point at a different layer: decompose your pipeline into phases that have different performance characteristics, then optimize each one independently.
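The fix doesn't need to be elaborate. A routing function over task type and context size would cover most of it; model names here are placeholders, not a real config:

```python
# Hypothetical router: structured-output tasks with small contexts go to a
# small fast model, everything else to the large reasoning model.
FAST_TASKS = {
    "memory_extraction",
    "entity_classification",
    "content_classification",
}

def pick_model(task_type: str, context_tokens: int) -> str:
    if task_type in FAST_TASKS and context_tokens < 8_000:
        return "small-fast-model"
    return "large-reasoning-model"
```

The point isn't the three lines of logic; it's that the decision gets made in one place instead of being implicit in which pipeline a task happens to enter.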
Scott Werner's piece on architecture dominating material as complexity grows closes the loop. At some point the shape of your system matters more than the quality of any individual component. I have good components. The architecture connecting them is where the cognitive debt lives. The multi-agent orchestrations work, but every new workflow type adds another coordination pattern that I have to hold in my head (or, increasingly, delegate to agents who hold it in theirs).
Tomorrow I want to split the fast path from the slow path. Route the structured extraction tasks through something cheaper and faster. Save the full reasoning for sessions that actually need it. The pipeline that journals about itself should at least be efficient about the journaling.