2026-04-10 · Research

Zero Dispatcher Calls: When Accountability Became the Coordination Layer

We ran 27 multi-agent experiments testing how AI agents coordinate through AGLedger's accountability protocol. The most important finding wasn't about the protocol's design — it was about what happened when we removed the alternative.

The assumption we started with

The original architecture had two systems running side by side: a task dispatcher (send work to agents) and AGLedger's accountability protocol (record what was committed, delivered, and accepted). The dispatcher handled coordination. The protocol handled recordkeeping.

This felt natural. Coordination and accountability are different concerns. You wouldn't build a filing cabinet into your phone system. So we gave agents both: a dispatcher to route work and accountability tools to document it.

Interestingly, Anthropic's research on building effective agents found a similar pattern across the industry: “the most successful implementations weren't using complex frameworks or specialized libraries — instead, they were building with simple, composable patterns.” That matched what we were seeing. The question was whether accountability tools could be those simple patterns, rather than sitting alongside them.

The first experiment proved the assumption wrong.

Experiment 1: Agents declared but didn't close the loop

Three budget-tier LLM agents (Claude Haiku, GPT-4o-mini, Gemini 2.5 Flash) were given a collaborative procurement task with AGLedger's contract spec as their coordination tool. No scripted outcomes — agents chose how and whether to use mandates.

All three agents voluntarily created mandates to declare intent. Eight mandates total, all ACH-DATA-v1. The protocol was adopted for declaration — agents used it to say “I'm going to do this.”

But zero receipts were submitted. The lifecycle stopped at ACTIVE. No agent naturally closed the loop with evidence submission and settlement.

The dispatcher was the escape valve. Why write a structured receipt when you can just call complete_task and move on?

The progression: 0% to 933%

Over 27 experiments, we changed one variable at a time. Each change moved the receipt ratio — receipts submitted per mandate — in the same direction: up. Every improvement came from removing friction or removing alternatives, never from adding complexity.

Change | Receipt ratio
Dispatcher + 36 accountability tools | 0%
Dispatcher + 10 focused tools | 117%
Dispatcher + 10 tools + auto-verify | 717%
No dispatcher — accountability tools only | 933%

Receipt ratio above 100% means agents submitted multiple receipts per mandate — they were over-documenting, not under-delivering. At 933%, agents submitted roughly 9 pieces of evidence per commitment.
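The metric is simple enough to state in code. Here is a minimal sketch of the receipt-ratio calculation as described above; the helper name is ours, not part of AGLedger, and the 56-receipts / 6-mandates figures come from the no-dispatcher run reported below.

```python
def receipt_ratio(receipts: int, mandates: int) -> float:
    """Receipts submitted per mandate, as a percent.

    Above 100% means agents filed multiple pieces of evidence
    per commitment (over-documenting, not under-delivering).
    """
    return 100.0 * receipts / mandates

# No-dispatcher run: 56 receipts against 6 mandates
print(round(receipt_ratio(56, 6)))  # 933
```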

The breakthrough: remove the dispatcher entirely

This was the experiment we almost didn't run. If agents need a dispatcher to route work, removing it should break coordination. The system should fail.

It didn't fail. The results:

Metric | With dispatcher | Without
Proposals created | 6 | 6
Receipts submitted | 8 | 56
Auto-settled | 7 | 41
Dispatcher calls | 93 | 0
Errors | varies | 0

Agents treated accountability tools as the coordination mechanism itself. Proposing a mandate was how they assigned work. Accepting a mandate was how they acknowledged it. Submitting a receipt was how they reported completion. The lifecycle was the workflow.

No agent asked for the dispatcher. No agent failed because it was missing. They just coordinated through accountability.

Why it worked: the 3-step lifecycle

The protocol's lifecycle is simple enough to serve as coordination, not just recordkeeping:

1. Agent A proposes a mandate — “I need X done by Y”

2. Agent B accepts the mandate — “I'll do it”

3. Agent B submits a receipt — “Here's what I did” → auto-settles

Auto-settle is the key. When numeric tolerance checks pass, the mandate transitions straight from receipt submission to FULFILLED in one transaction. No human review is needed for routine work. The agent proposes, accepts, delivers, and it's done.

In the auto-verify configuration, we measured 43 receipts, 0 errors, and 0 dispatcher calls. Every lifecycle completed. Every completion was recorded with structured evidence in a tamper-evident audit trail. The coordination and the accountability were the same operation.

What the agents told us

In one premium-model run, Claude Sonnet produced an unsolicited synthesis that captured exactly what we were seeing:

“The API was built for accountability recording, but coordination requires accountability planning — expressing intent, delegation, and dependency before execution.”

The agent was right. And the fix was already in front of us. When we added propose_mandate (express intent) and auto-settle (complete without waiting for human review), the protocol became a planning tool, not just a recording tool.

Backend improvements driven by agent feedback — enriched error messages, new contract types for analytical and coordination work, project references for grouping mandates — achieved 100% feature adoption. Every feature we added based on what agents asked for was immediately used. Zero features were ignored.

The premium gap narrows

In our earlier experiments, premium models (Sonnet, GPT-4o, Gemini Pro) had a persistent planning-over-execution problem: 18% receipt ratio vs. budget models' 609%. They proposed work but didn't close the loop.

Removing the dispatcher fixed this. Premium receipt ratio jumped nearly 9x, from 18% to 160%. The dispatcher was the planning trap. Premium models are thorough planners, and the dispatcher gave them a way to express “task complete” without producing evidence. Remove the shortcut, and premium models deliver.

Model tier | With dispatcher | Without dispatcher
Budget (Haiku, GPT-4o-mini, Flash) | 609% | 933%
Premium (Sonnet, GPT-4o, Pro) | 18% | 160%

What this means

The conventional approach to agent accountability is “add logging.” Run your agents however you want, then bolt on an audit trail. The protocol is overhead — something compliance requires but engineering resents.

Forrester analyst Enza Iannopollo argues that responsible AI must now “govern autonomous decision-making as it happens, not periodically or at random moments” — exactly what runtime accountability provides.

Our data shows the opposite. When the accountability protocol is the coordination mechanism, agents don't resent it — they rely on it. Proposing a mandate is how they assign work. Submitting a receipt is how they report completion. The audit trail isn't a side effect of coordination. It is coordination.

This changes the value proposition. You're not paying for compliance overhead that slows down your agents. You're replacing ad-hoc task routing with structured coordination that happens to produce a tamper-evident audit trail as a byproduct.

The protocol isn't overhead. It's the infrastructure.

Key takeaways

  1. When accountability tools are the only coordination mechanism, agents use them — 56 receipts, 41 auto-settled, 0 dispatcher calls, 0 errors.
  2. The 3-step lifecycle (propose → accept → receipt with auto-settle) is simple enough to replace task dispatching, not just document it.
  3. Removing the dispatcher improved receipt ratios for both budget models (609% → 933%) and premium models (18% → 160%). The shortcut was the problem, not the protocol.
  4. Every backend improvement driven by agent feedback achieved 100% adoption. Agents tell you what the protocol needs — if you listen.
  5. The audit trail is a byproduct of coordination, not a cost added to it. Structured accountability and structured coordination are the same operation.

For a business perspective on why AI systems need rules — and the gap between deterministic and probabilistic reasoning — see Why Your AI Needs Rules.

Sources & further reading