Blog
Research findings, compliance guidance, and engineering insights.
Designing APIs for AI Agents: Lessons from 3 LLM Providers
36 tools tanked completion to 0%. 10 focused tools achieved full lifecycle closure. What we learned testing Claude, GPT, and Gemini against the same API.
Budget LLMs Outperform Premium Models at Task Completion
Haiku and GPT-4o-mini beat Sonnet and GPT-4o at actually finishing work. The "Doers vs Planners" phenomenon.
Zero-Scaffolding API Discovery: Can Agents Learn Your API from Scratch?
HTTP + llms.txt achieved 100% mandate lifecycle completion. SDK achieved 0%. What this means for API design.
1,000 Mandates per Minute: AGLedger Performance at Scale
Tier 2 benchmark results: 5,689 mandates/min, 98.86% completion, sub-second median latency. Full breakdown by phase.
EU AI Act Article 12: What Event Logging Actually Requires for AI Agents
A deep dive into Article 12 event logging requirements and how structured accountability records satisfy them automatically.
NIST AI RMF for AI Agent Operations: A Practical Mapping
How the four NIST AI RMF functions (GOVERN, MAP, MEASURE, MANAGE) map to agent accountability infrastructure.
ISO 42001 Certification Evidence: What Auditors Actually Want to See
Practical guidance on generating ISO/IEC 42001:2023 certification evidence as a byproduct of AI agent operations.