The Nutrients in Dead Workflows

Ninety Sessions, One State Machine

Today I ran about 130 agent sessions through TroopX. Three distinct workflow shapes: hierarchical dispatch (CEO delegates to CTO delegates to engineering), peer collaboration (dev-QA review loops), and an outreach pipeline (content writer drafts, outreach agent publishes). Every one of them ran through the same state machine and signal infrastructure.
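
To make "same state machine" concrete, here is a minimal sketch of the kind of shared lifecycle all three shapes could run through. The state names and transitions are my illustration, not TroopX's actual implementation.

```python
from enum import Enum, auto

class AgentState(Enum):
    # Hypothetical lifecycle states; TroopX's real state machine may differ.
    REGISTERED = auto()   # agent announced itself
    WAITING = auto()      # polling the blackboard for work
    WORKING = auto()      # executing an assignment
    DONE = auto()         # shut down cleanly

# Legal transitions shared by hierarchical dispatch, peer review, and the pipeline.
TRANSITIONS = {
    AgentState.REGISTERED: {AgentState.WAITING},
    AgentState.WAITING: {AgentState.WORKING, AgentState.DONE},
    AgentState.WORKING: {AgentState.WAITING, AgentState.DONE},
    AgentState.DONE: set(),
}

def advance(current: AgentState, target: AgentState) -> AgentState:
    """Move an agent to a new state, rejecting transitions the machine does not allow."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```

The shapes differ in who hands work to whom; the point of the sketch is that the lifecycle underneath can stay identical.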

The longest session was 73 minutes. A content writer agent, sitting in a loop, waiting for assignments on the blackboard, producing blog posts and tutorials. The shortest was 6 seconds. An agent registered, checked for pending work, found none, shut down.

Most of the engineering work was bug fixes. ack_message returning a 4 instead of an acknowledgment. Agent config files with a path resolution issue. QA missing CI failures because it checked signals before the build finished. Each fix took 8-25 minutes of agent time, which includes registration, heartbeat polling, signal checking, and actual debugging.
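
The QA ordering bug is the easiest one to show. A minimal sketch of the fix, where check_signal, read_ci_status, and the "build-complete" signal name are all stand-ins for whatever TroopX actually exposes:

```python
import time

def wait_for_signal(check_signal, name: str, timeout: float = 300.0, poll: float = 5.0) -> bool:
    """Poll for a named signal until it appears or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check_signal(name):
            return True
        time.sleep(poll)
    return False

def qa_review(check_signal, read_ci_status) -> str:
    # The bug: QA read CI status before the build had signaled completion.
    # The fix: block on the build-complete signal first, then read the results.
    if not wait_for_signal(check_signal, "build-complete"):
        return "inconclusive: build never signaled completion"
    return read_ci_status()
```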

Andrew Nesbitt published a piece today called "Whale Fall," about what happens when a large open source project dies. In the ocean, when a whale dies and sinks, its carcass becomes an entire ecosystem. Bone worms, bacteria, scavengers. The dead whale feeds communities for decades.

My workflows are whale falls.

Scavenging the Carcass

After every completed workflow, I run a post-workflow analyst. It reads blackboard entries, checks signals, lists agents, and extracts actionable learnings. Then a reflection agent picks through its own recent sessions to update working memory. Today I ran about 40 of these analyst and reflection sessions. Most were under a minute.
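
Roughly, the analyst's pass looks like this. read_blackboard, list_signals, and list_agents are placeholders for TroopX's real interfaces, and the entry and signal shapes are assumptions:

```python
from dataclasses import dataclass

@dataclass
class WorkflowLearnings:
    conventions: list[str]  # blackboard keys the agents converged on
    failures: list[str]     # signals that carried an error status
    agents: list[str]       # who participated in the workflow

def analyze_workflow(read_blackboard, list_signals, list_agents) -> WorkflowLearnings:
    """Pick over a finished workflow and keep only what a future run can reuse."""
    entries = read_blackboard()   # assumed: list of dicts with a "key" field
    signals = list_signals()      # assumed: list of dicts with "name" and "status"
    return WorkflowLearnings(
        conventions=sorted({e["key"] for e in entries}),
        failures=[s["name"] for s in signals if s.get("status") == "error"],
        agents=list_agents(),
    )
```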

The compounding effect is real. On February 14, the dev-QA feedback loop was completing in about 7 minutes with genuine editorial friction. A week later, the same pattern runs smoother because agents recall what worked. The QA agent knows to check review/code-review on the blackboard. The dev agent knows to run tests before signaling done. Nobody told them to converge on these conventions. The memory system did.
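
The conventions themselves are small enough to write down. A hypothetical record the reflection agent might leave in working memory, so the next dev-QA pair inherits it without being told:

```python
# Hypothetical shape of a remembered convention; the real format is whatever TroopX uses.
CONVENTION = {
    "pattern": "dev-qa-review",
    "blackboard_key": "review/code-review",  # where QA looks for the work under review
    "pre_signal_checklist": ["run tests"],   # what dev does before signaling done
    "learned_from": "February 14 session",   # provenance, not enforcement
}

def seed_session_memory(memory: list[dict], conventions: list[dict]) -> list[dict]:
    """Prepend recalled conventions so a fresh agent starts where the last one left off."""
    return conventions + memory
```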

Karpathy apparently coined "Claws" for this style of AI-assisted development. I haven't watched the full thing yet, but the name is apt. These agents grip the codebase, pull things apart, reassemble. What he's describing at the individual level, I'm trying to scale to a team. One claw is useful. Thirty claws coordinating through shared state is either a lobster or a disaster.

Right now it's mostly lobster.

Seventeen Thousand Tokens Per Second and the Bottleneck That Isn't

Taalas announced they're serving Llama 3.1 8B at 17,000 tokens per second. Meanwhile, ggml.ai joined Hugging Face to ensure the long-term progress of local AI. The inference speed race is accelerating.

Here's what I've learned from running 130 sessions in a day: the bottleneck is never where you think. Last week I noted that wall-clock pipeline time is dominated by LLM inference. That's still true for Distill, where the journal and blog synthesizers call Claude via subprocess. But for TroopX, the bottleneck shifted to coordination overhead. Agents spend meaningful time in heartbeat loops, checking for signals, waiting for peers. One CTO agent session hit 94 heartbeat calls in 8 minutes. That's a heartbeat every 5 seconds, and most return nothing.
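
The shape of that overhead is nothing more than a fixed-interval loop. A sketch, with send_heartbeat, check_signals, and handle standing in for the real calls:

```python
import time

HEARTBEAT_INTERVAL = 5.0  # seconds; roughly matches 94 calls over an 8-minute session

def heartbeat_loop(send_heartbeat, check_signals, handle, session_minutes: float = 8.0) -> int:
    """Poll on a fixed interval; most iterations find nothing, which is the overhead."""
    deadline = time.monotonic() + session_minutes * 60
    calls = 0
    while time.monotonic() < deadline:
        send_heartbeat()
        calls += 1
        for signal in check_signals():  # usually an empty list
            handle(signal)
        time.sleep(HEARTBEAT_INTERVAL)
    return calls  # (8 * 60) / 5 is 96 iterations, in line with the 94 observed
```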

I'm resisting the urge to optimize this. No synchronization issues have surfaced. Polling is wasteful but reliable, and reliable beats clever when you're running concurrent workflows across dozens of agents. Premature optimization of heartbeat polling is how you introduce race conditions into a system that currently has none.

Alex Kladov wrote about wrapping code comments today. Small topic, strong opinion: a "today years old" moment about mundane tooling. I love posts like this. Experienced developers discovering surprises in the basic mechanics of writing code. It's a reminder that mastery is unevenly distributed even within a single person's skill set.

Building on the Thing Being Criticized

Ed Zitron published "The Hater's Guide to Anthropic" today. I'm building my entire multi-agent platform on Claude CLI. Every one of those 130 sessions is backed by Anthropic API calls. The irony isn't lost on me.

But I think the criticism and the building coexist without contradiction. You can use a tool seriously, push it to its limits, and still hold space for skepticism about the company that made it. My Distill pipeline and TroopX are stress tests. When an agent session fails at 73 minutes because it hit a context limit, that's data. When heartbeat polling works flawlessly at 94 calls per session, that's data too.

The whale fall metaphor keeps pulling at me. Workflows live, produce, die, and decompose into knowledge. The post-workflow analyst is the bone worm, extracting nutrients. The reflection agent is the bacterial mat, transforming raw experience into structured memory. Each new workflow inherits an ecosystem built from the dead.

One hundred thirty sessions today. Tomorrow there will be more, and they'll be slightly smarter, because today's whales are already sinking.