When Coordination Overhead Exceeds Task Value
Every multi-agent system has a break-even point. Below it, the cost of coordinating agents exceeds the value their coordination produces. Above it, the protocol overhead amortizes across enough genuine work that the investment pays off. The interesting question isn't whether this threshold exists — it obviously does — but whether you can identify it before you've already paid the coordination tax.
Gene Amdahl formalized a version of this problem in 1967. His law states that the maximum speedup from parallelization is bounded by the fraction of work that must remain serial. Add more processors and you hit a ceiling determined entirely by the sequential bottleneck. The insight transfers cleanly to multi-agent coordination: registration, context loading, message routing, and signal acknowledgment form a serial floor that every task pays, regardless of complexity. As task size shrinks, that serial fraction dominates the total cost. At some point, the overhead is the work.
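The arithmetic is easy to sketch. Here is a rough model of both effects, the Amdahl speedup ceiling and the fixed coordination floor — the 90-second floor and task durations are illustrative numbers, not measurements from the platform:

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Maximum speedup when `serial_fraction` of the work must stay serial.

    As workers grow, speedup approaches 1 / serial_fraction and no more.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


def overhead_fraction(serial_floor_s: float, task_work_s: float) -> float:
    """Share of total wall-clock time consumed by the fixed protocol floor."""
    return serial_floor_s / (serial_floor_s + task_work_s)


# A 90-second protocol floor barely registers against an 8-hour task...
big_task = overhead_fraction(90, 8 * 3600)    # well under 1%
# ...but dominates a 60-second utility-function task.
small_task = overhead_fraction(90, 60)        # 60% of total time is overhead
```

With a serial fraction that large, adding agents cannot help: `amdahl_speedup(0.6, 1_000_000)` never exceeds 1.67, no matter how many workers you throw at it.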
I've been running dev-QA workflow pairs through TroopX, a multi-agent orchestration platform built on Temporal, FastAPI, and Claude CLI, for the past three weeks. The evidence for this threshold is now impossible to ignore.
The Protocol Tax
On February 20th, I ran twenty agent sessions across seven hours. The longest was two hours and twenty-four minutes: building an HTTP endpoint so agents could communicate with the router over REST instead of stdio MCP transport. It modified twenty-eight files, executed 254 shell commands, and required ninety-eight file reads. The dev-QA friction loop caught real integration issues: mismatched response schemas, missing test coverage for error paths, and routing conflicts between the new HTTP handlers and existing MCP endpoints. Coordination overhead was a small fraction of total effort and produced genuine quality improvements.
The same day, sessions one and two ran a utility function task. A dev agent implemented truncate_string in utils/text.py. The QA agent verified the commit diff. The function was trivial. The protocol was identical: register agents, load context, route messages through the blackboard, poll for signals, run post-workflow analysis. The coordination ceremony consumed more wall-clock time than the implementation itself.
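The session logs don't include the function body, but a plausible reconstruction makes the point about scale — everything below is an assumed implementation, not the dev agent's actual code:

```python
def truncate_string(text: str, max_length: int, suffix: str = "...") -> str:
    """Truncate `text` to at most `max_length` characters, appending `suffix`
    when truncation occurs. The whole task fits in a handful of lines."""
    if len(text) <= max_length:
        return text
    if max_length <= len(suffix):
        # Not enough room for the suffix; hard-cut instead.
        return text[:max_length]
    return text[: max_length - len(suffix)] + suffix
```

A task of this size is verified faster by reading it than by routing it through a registration, blackboard, and signal-polling protocol.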
Both tasks paid the same protocol tax. One got its money's worth.
Half the Sessions Were Meta-Work
Here's the number that crystallized the problem. On that same February 20th, ten of twenty sessions were not implementation. Sessions five, eight, ten, fifteen, and twenty were post-workflow analysis. Sessions eleven through thirteen, sixteen, and seventeen were agent reflection: agents reviewing their own performance and updating their working memory files. Half the compute went to agents thinking about what they'd done rather than doing anything new.
For the HTTP endpoint task, this ratio was justified. Analysis extracted patterns about cross-module integration testing that improved subsequent sessions. Reflection steps updated agent memory with specific debugging strategies for Temporal workflow testing. The meta-work produced forward value.
For the utility function task, the same analysis and reflection ran with the same thoroughness. An analyst reviewed the truncation implementation and extracted "learnings." A reflection agent updated its memory file with observations about string boundary handling. This is a system spending more time journaling about cutting bread than it spent cutting bread.
The overhead isn't just per-task startup. It's the entire ecosystem of supporting sessions that orbit every piece of real work. When I traced the full lifecycle of a simple task through the platform, the ratio of coordination sessions to implementation sessions was often two-to-one or worse. Registration. QA review. Signal verification. Analysis extraction. Agent reflection. Five sessions to support one session of work.
The Invisible Floor
The floor is most visible at the session boundary. Every agent, no matter how brief its assignment, requires registration with the router, loading of roster context and working memory, codebase orientation, and tool initialization. During a stress test in early February, I watched an architect agent join a one-minute planning session and spend its first thirty seconds on setup alone. Half its useful window was gone before it examined a single file.
This isn't a bug to fix. It's a structural property. Agents need context to make decisions. Context loading takes time. Shrinking the task doesn't shrink the context requirement. If anything, the agent needs the same project understanding to make good decisions about a small change as it does about a large one. The coverage configuration that trips up isolated test runs doesn't become less relevant just because the function under test is simple.
The floor creates a perverse dynamic. Tasks that would take a single developer thirty seconds to implement and visually verify take five to ten minutes of agent ceremony. The ceremony exists because it catches real problems on complex tasks. But it runs on every task indiscriminately.
When the Ratio Flips
The threshold lives at roughly the point where the task touches shared state.
On February 3rd, a self-improvement cycle ran across five agent roles: analyst, architect, planner, executor, evaluator. It modified forty-six files and ran 640 shell commands over nearly eight hours. The coordination overhead was substantial. Five separate Claude Code sessions, each with its own context loading and message routing, heartbeats maintaining session state, signals chaining outputs forward. But the work demanded it. The architect needed the analyst's gap assessment. The planner needed the architect's infrastructure design. The executor needed both. Removing any link in that coordination chain would have produced garbage.
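That dependency structure is a strictly serial chain, which is easy to see in sketch form — the role functions below are illustrative stand-ins, not the platform's actual interfaces:

```python
# Hypothetical sketch of the five-role cycle. Each stage's only input is the
# previous stage's output, so no link can be removed without feeding
# garbage downstream.
def analyst(codebase: str) -> str:
    return f"gaps identified in {codebase}"

def architect(gap_report: str) -> str:
    return f"design addressing: {gap_report}"

def planner(design: str) -> str:
    return f"task plan for: {design}"

def executor(plan: str) -> str:
    return f"changes implementing: {plan}"

def evaluator(changes: str) -> str:
    return f"verified: {changes}"

def run_cycle(codebase: str) -> str:
    result = codebase
    for stage in (analyst, architect, planner, executor, evaluator):
        result = stage(result)
    return result
```

The coordination cost here is real, but so is the dependency: this is exactly the serial fraction Amdahl's law says you cannot parallelize away.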
Contrast that with February 18th. Two straightforward endpoint tasks went through the same multi-agent pipeline. The dev agent built a ping endpoint, the QA agent verified it, each finishing within about eleven minutes. The QA feedback loop found nothing wrong because there was almost nothing to get wrong. A route handler that returns {"status": "ok"} doesn't have hidden coupling to the rest of the codebase. It doesn't need an independent reviewer to catch schema mismatches, because there are no schemas to mismatch.
The distinguishing factor isn't lines of code or implementation time. It's whether the task interacts with state that the implementing agent can't fully see. A function that takes a string and returns a string is self-contained. An HTTP endpoint that registers routes alongside existing handlers, touches shared middleware, and must pass integration tests against a running router has hidden interactions that a second pair of eyes genuinely catches.
The Measurement Trap
The difficulty is that you can't always determine the category upfront. Some tasks that look self-contained turn out to touch shared configuration. Some that look complex turn out to be straightforward once you read the code. A February 5th session that discovered orphaned temp files — spawn-role-pane.txt and meeting-mtg-*.txt accumulating in the system temp directory, six files from recent meetings — found that issue precisely because a QA agent was reviewing what seemed like a simple documentation update. The operational hygiene problem was invisible until coordination surfaced it.
This creates a genuine dilemma. Skip coordination for tasks that seem simple and you miss the occasional real issue hiding in the seams. Apply full coordination to everything and you waste most of your compute on ceremony for tasks that don't need it.
File count and blast radius are imperfect but usable proxies. Tasks modifying one or two files in a single module rarely benefit from multi-agent review. Tasks touching shared configuration, cross-module boundaries, or infrastructure that other components depend on almost always do. The signal is architectural scope, not task description complexity.
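A dispatch heuristic built on those proxies might look like the following — the shared-state prefixes are placeholders for whatever a real project's layout actually defines:

```python
from pathlib import PurePosixPath

# Illustrative shared-state markers; a real deployment would derive these
# from the project's architecture, not hardcode them.
SHARED_PREFIXES = ("config/", "middleware/", "infra/")

def needs_full_pipeline(touched_files: list[str]) -> bool:
    """Route by architectural scope: cross-module or shared-state tasks get
    full coordination; small single-module edits go to a lone agent."""
    modules = {PurePosixPath(f).parts[0] for f in touched_files}
    touches_shared = any(f.startswith(SHARED_PREFIXES) for f in touched_files)
    return len(touched_files) > 2 or len(modules) > 1 or touches_shared
```

Under this rule, the truncate_string task (one file, one module) skips the pipeline, while the HTTP endpoint task (twenty-eight files across module boundaries) goes through it.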
What This Changes
The practical consequence is that orchestration platforms need a dispatch-time decision point. Not a runtime optimization, but a routing choice made before any agent registers. Route single-file tasks to a single agent with no ceremony. Route shared-state tasks through the full coordination pipeline. Accept that some simple-looking tasks will turn out to be complex and handle that with escalation rather than default overhead.
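One possible shape for that decision point, sketched with hypothetical interfaces — the `Task` type, runner callables, and the exception used as an escalation signal are all assumptions, not TroopX APIs:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    description: str
    touched_files: list = field(default_factory=list)

def dispatch(task: Task, run_single, run_pipeline, classifier) -> str:
    """Dispatch-time routing: choose the path before any agent registers."""
    if classifier(task.touched_files):
        # Shared-state work pays the full protocol up front.
        return run_pipeline(task)
    try:
        # Cheap path: one agent, no registration ceremony, no QA loop.
        return run_single(task)
    except RuntimeError:
        # The agent discovered hidden shared-state coupling mid-task:
        # escalate to full coordination rather than paying it by default.
        return run_pipeline(task)
```

Escalation handles the measurement trap: simple-looking tasks that turn out to touch shared state still reach the pipeline, just not as the default.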
I haven't built this into TroopX yet. Every task still runs through the same protocol. The overhead is visible in the session logs: half the sessions doing meta-work, startup costs dominating short tasks, analysis extracting learnings from implementations that didn't produce any worth extracting. The ratio is wrong for small tasks and right for large ones, and the system treats them identically.
The fix isn't reducing coordination overhead. The overhead is doing exactly what it should on the tasks that need it. The fix is not applying it where it doesn't belong.