The Thin-SDK Pattern
AGLedger's SDK is intentionally thin — it depends on the live API for documentation rather than embedding it at compile time.
The architectural choice
Documentation is data, not code.
The engine serves /llms.txt and /openapi.json at runtime. The SDK ships request/response types, auth glue, retry policy, and pagination helpers — and not much else. A version bump on the engine that adds a route, tightens a recoveryHint, or relaxes an enum is immediately visible to every connected agent without an SDK release.
The cost: agents need network reach to the engine to ground themselves, and there is a small first-call latency for the discovery fetch.
The benefit: zero drift between SDK and API, no SDK regen on every API change, and agents always see fresh docs.
What the SDK carries
Concrete contents of @agledger/sdk (TypeScript) and agledger (Python):
- Generated types from OpenAPI — request bodies, response shapes, enums.
- Auth helper — header injection, key rotation hooks.
- Retry policy with the right backoff for 429s and idempotent methods.
- Pagination helper for cursor-paginated endpoints.
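The retry behavior described above can be sketched as a pure decision function plus jittered backoff. This is an illustrative shape only — the names, defaults, and method set here are assumptions, not the actual @agledger/sdk internals:

```typescript
// Illustrative retry policy: retry 429s always, retry 5xx only for
// idempotent methods, and back off with full jitter. Names and defaults
// are assumptions, not the actual @agledger/sdk internals.
const IDEMPOTENT = new Set(["GET", "HEAD", "PUT", "DELETE"]);

function shouldRetry(
  method: string,
  status: number,
  attempt: number,
  maxAttempts = 4,
): boolean {
  if (attempt >= maxAttempts) return false;
  if (status === 429) return true; // rate-limited: safe to retry any method
  // 5xx: retry only when a duplicate send is harmless.
  return status >= 500 && IDEMPOTENT.has(method.toUpperCase());
}

function backoffMs(attempt: number, baseMs = 250, capMs = 8000): number {
  // Full jitter: uniform delay in [0, min(cap, base * 2^attempt)).
  return Math.random() * Math.min(capMs, baseMs * 2 ** attempt);
}
```

Keeping the decision separate from the delay calculation makes both trivially testable without a network.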
Deliberately omitted:
- Curated examples or "getting started" flows. Those live in /llms-full.txt and on this docs site.
- Long-form JSDoc on every method. The OpenAPI spec is the contract; the SDK does not shadow it.
- A separate docs portal. There is one source of truth, served at /openapi.json and /llms.txt.
- "Smart" methods that paper over API quirks. The API is the surface; the SDK does not hide it.
The fetch-and-cache pattern
Concrete shape an agent should adopt:
const CACHE_TTL_MS = 15 * 60 * 1000; // 15 minutes
let llmsTxtCache: { body: string; fetchedAt: number } | null = null;
async function getLlmsTxt(apiBase: string): Promise<string> {
const now = Date.now();
if (llmsTxtCache && now - llmsTxtCache.fetchedAt < CACHE_TTL_MS) {
return llmsTxtCache.body;
}
const res = await fetch(`${apiBase}/llms.txt`);
if (!res.ok) throw new Error(`/llms.txt fetch failed: ${res.status}`);
const body = await res.text();
llmsTxtCache = { body, fetchedAt: now };
return body;
}
The agent fetches once at session start (or on cache miss) and uses the engine's response envelopes verbatim from there. The SDK provides the typed clients; the live API provides the grounding.
When to fetch fresh
- Every session start. Always. The cost is one HTTP round-trip per cold start; the alternative is shipping a stale grounding directive to the model.
- On a record.schema_changed event. If your agent subscribes to webhooks, treat schema events as a cache-bust signal.
- Periodic refresh. For long-running agents, refresh on a 5–15 minute cadence. The engine's own cache busts via LISTEN/NOTIFY on schema changes, so it returns the latest at any read.
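The webhook-driven cache bust is a one-line state change. A minimal sketch, assuming the record.schema_changed event name from the docs above; the handler shape and cache variable mirror the fetch-and-cache snippet and are otherwise illustrative:

```typescript
// Hedged sketch: treat record.schema_changed as a cache-bust signal.
// The cache shape mirrors the fetch-and-cache snippet; the handler
// wiring is illustrative, not a prescribed SDK API.
type DiscoveryCache = { body: string; fetchedAt: number } | null;

let llmsTxtCache: DiscoveryCache = { body: "cached docs", fetchedAt: Date.now() };

function onWebhookEvent(eventType: string): void {
  if (eventType === "record.schema_changed") {
    // Drop the cached body; the next getLlmsTxt() call re-fetches.
    llmsTxtCache = null;
  }
}
```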
When NOT to fetch fresh
- Per-request. Per-request fetches are wasteful: the engine returns the same bytes for the duration of the cache lifetime, so each extra fetch adds a round-trip without buying any freshness.
- Inside a tight retry loop. A failed call does not mean the docs changed. Re-fetch on session boundary, not on retry.
Tradeoffs
The thin-SDK choice is not the right call for every team. Here is an honest account of where the pattern does not fit:
Strict offline use
Agents in air-gapped environments cannot fetch live docs. They need a snapshot — pin /llms.txt and /openapi.json at deploy time, ship them in the agent's bundle, and accept that grounding may drift behind the engine version.
# At deploy time, in your CI pipeline:
curl "$API_BASE/llms.txt" > vendor/llms-pinned.txt
curl "$API_BASE/openapi.json" > vendor/openapi-pinned.json
sha256sum vendor/llms-pinned.txt vendor/openapi-pinned.json > vendor/discovery-hashes.txt
The agent reads from vendor/ instead of fetching at session start. Pair with a deployment policy that re-snapshots on engine version bumps.
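On the agent side, reading the snapshot is a plain file load. A sketch under the file names produced by the CI snippet above; the function name is an assumption:

```typescript
// Illustrative offline grounding loader: read the snapshot pinned by the
// CI step instead of fetching at session start. File names match the
// deploy-time snippet; loadPinnedGrounding is a hypothetical helper.
import { readFileSync } from "node:fs";
import { join } from "node:path";

function loadPinnedGrounding(vendorDir = "vendor"): {
  llmsTxt: string;
  openapi: unknown;
} {
  const llmsTxt = readFileSync(join(vendorDir, "llms-pinned.txt"), "utf8");
  const openapi = JSON.parse(
    readFileSync(join(vendorDir, "openapi-pinned.json"), "utf8"),
  );
  return { llmsTxt, openapi };
}
```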
Tightly version-pinned deployments
If a regulator requires the agent's grounding to be cryptographically pinned, snapshot at deploy and verify the SHA-256 at use time. Bodies are byte-identical for Accept: text/markdown and Accept: text/plain, so the snapshot hash is stable across content negotiation.
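Use-time verification is a straight hash comparison. A minimal sketch using Node's crypto module; the expected digest would come from the vendor/discovery-hashes.txt file written at deploy time:

```typescript
// Verify a pinned snapshot's SHA-256 before grounding the agent on it.
// The expected hex digest comes from the deploy-time hash file.
import { createHash } from "node:crypto";

function verifySnapshot(body: string, expectedSha256Hex: string): void {
  const actual = createHash("sha256").update(body, "utf8").digest("hex");
  if (actual !== expectedSha256Hex) {
    throw new Error(
      `grounding snapshot hash mismatch: expected ${expectedSha256Hex}, got ${actual}`,
    );
  }
}
```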
Latency-sensitive paths
Fetch at session start, not per request. The discovery surface is fast — typically tens of milliseconds on a warm path — but it is still a network round-trip. If your agent's first call has a tight latency budget, warm the cache out-of-band before the user-facing call.
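Warming out-of-band can be a best-effort fire-and-forget step. A sketch in which the fetcher is injected so the snippet stands alone; in practice it would be the cached getLlmsTxt from earlier:

```typescript
// Best-effort discovery warm-up before a latency-sensitive call.
// fetchDocs would be the cached getLlmsTxt from the earlier snippet;
// errors are swallowed on purpose — the real call will surface them.
async function warmDiscoveryCache(
  apiBase: string,
  fetchDocs: (base: string) => Promise<string>,
): Promise<void> {
  try {
    await fetchDocs(apiBase); // populates the cache for the user-facing call
  } catch {
    // Warm-up is best-effort only.
  }
}
```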
What this gives back
- One canonical source of truth: the live API.
- No SDK regeneration on every engine change.
- Agents always see fresh docs.
- SDK maintenance burden stays low — types regenerate from OpenAPI; the rest is glue.
The pattern fits when the API churns more than an SDK release cycle could keep up with, when agent consumers are first-class, and when the team running the API can carry the discovery-surface invariant. If your audience is humans browsing repos and the API is mostly stable, a thicker SDK with curated examples is a reasonable choice.
See also
- The agent-UX envelope — what response shape the SDK delegates to at runtime.
- Discovery surface projection — the four surfaces the SDK depends on.
Validated against @agledger/sdk v0.7.2, agledger v0.7.3, and API v0.22.18 on 2026-05-02.