Recovery

Recovery for a tamper-evident chain is restore, then verify — never rewrite. A restore is not done when the data is back; it is done when the chain re-verifies. This runbook pairs with the backup runbook.

Restore the database

deploy/scripts/restore.sh takes a backup tarball, drops and recreates the database, and runs pg_restore:

./deploy/scripts/restore.sh ./backup/backup-2026-05-26-013327.tar.gz

Under the hood it recreates the database, runs pg_restore --no-owner --no-acl, and logs completion:

Restore complete.

The restored database now holds what the backup captured: for the chain used throughout this runbook, 5 records across a 5-entry audit_vault chain and 2 signing keys (1 active, 1 rotated-out retired key). The verify step below confirms those counts.
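Because the script drops the database before it restores, it can be worth confirming the tarball is readable first. The following is a minimal sketch of such a pre-flight check; check_backup_tarball is a hypothetical helper and not part of restore.sh.

```shell
# Pre-flight sketch: confirm the backup tarball is readable before a destructive restore.
# check_backup_tarball is a hypothetical helper, not part of restore.sh.
check_backup_tarball() {
  local tarball="$1"
  if [ ! -r "$tarball" ]; then
    echo "backup not found or unreadable: $tarball" >&2
    return 1
  fi
  # tar -tzf lists the archive without extracting it; a truncated or
  # corrupt gzip stream makes the listing fail.
  if ! tar -tzf "$tarball" >/dev/null 2>&1; then
    echo "backup is not a readable gzip tarball: $tarball" >&2
    return 1
  fi
  echo "backup tarball looks readable: $tarball"
}
```

One way to use it is to gate the restore on the check, for example by running check_backup_tarball on the tarball path and invoking restore.sh only if it succeeds. A readable archive is not a verified chain; the verify step still decides whether the restore is done.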

For an external database the script runs pg_restore against your DATABASE_URL directly; the role in that URL must have the CREATEDB privilege, because the script drops and recreates the database. After it returns, bring the Server up and wait for the readiness gate before trusting anything:

curl -s "$AGLEDGER_URL/readyz"
{"status":"ready","version":"0.25.4","timestamp":"2026-05-26T01:21:29.996Z"}

A ready Server is serving. It is not yet a verified Server.
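Waiting for the readiness gate can be scripted. This is a sketch only: wait_for_ready is a hypothetical helper, the retry count and delay are illustrative defaults, and it assumes the "status":"ready" field shown in the response above.

```shell
# Sketch: poll /readyz until the Server reports ready, or give up.
# wait_for_ready is a hypothetical helper; attempts and delay are illustrative.
wait_for_ready() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  local body
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; body stays empty if the Server is not up yet.
    body="$(curl -fs "$url/readyz" 2>/dev/null || true)"
    # Match the "status":"ready" field from the readiness response.
    if printf '%s' "$body" | grep -q '"status" *: *"ready"'; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}
```

Called as wait_for_ready "$AGLEDGER_URL", it returns non-zero if the Server never reports ready, so a restore job can stop before it attempts verification against a Server that is not up.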

Verify the restored chain — the step that ends the restore

Prove the chain came back intact before you declare recovery complete. Run the connected check, then the authoritative offline verifier.

The in-engine check walks every per-record chain against the database:

DATABASE_URL=postgresql://… pnpm vault:verify
Verifying 5 record(s)...

[PASS] 019e61d4-fbf7-7205-8e4c-b17baf4b6f64 (1 entries)
[PASS] 019e61d4-fc62-7ada-bd58-acca32088e09 (1 entries)
[PASS] 019e61e2-d345-76eb-bd3c-93c72abcd7cb (1 entries)
[PASS] 019e61d4-fc48-7b89-abc8-3546bee8bb6b (1 entries)

5 record(s) checked, 0 error(s)

Then the offline verifier — the authoritative proof, which also checks Ed25519 signatures and needs nothing but the dump (see the audit runbook):

DATABASE_URL=postgresql://… pnpm vault:dump ./dump && agledger-verify ./dump
[PASS] AGLedger offline verification

audit_vault chain
  records    : 5
  entries     : 5
  failures    : 0

Both checks are clean, across records signed by the active key and by the rotated-out retired key. The restore is now complete.
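For an automated restore job, the two summaries can be checked mechanically. This runbook does not state whether either tool exits non-zero on failure, so the sketch below inspects the output text instead. in_engine_clean and offline_clean are hypothetical helpers, and the patterns assume the exact output format shown above.

```shell
# Sketch: decide pass/fail from the verifier output shown in this runbook.
# Both helpers are hypothetical and read the tool output on stdin.

# Succeeds only if the vault:verify summary line reports 0 error(s).
in_engine_clean() {
  grep -Eq '^[0-9]+ record\(s\) checked, 0 error\(s\)'
}

# Succeeds only if agledger-verify printed its [PASS] banner and "failures : 0".
offline_clean() {
  local out
  out="$(cat)"
  printf '%s\n' "$out" | grep -Fq '[PASS] AGLedger offline verification' &&
    printf '%s\n' "$out" | grep -Eq '^[[:space:]]*failures[[:space:]]*:[[:space:]]*0[[:space:]]*$'
}
```

Piping the output of pnpm vault:verify into in_engine_clean, and the output of agledger-verify ./dump into offline_clean, gives a job two exit codes to gate on. If the tools do set exit codes, checking those as well is the safer combination.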

When verification reports a failure

A failure is information, not a dead end. The offline verifier reports CHAIN_-prefixed integrity classes (see the audit runbook for the full list); the class tells you which kind of problem you have:

| Class | What it means | Restore went wrong, or real problem? |
|---|---|---|
| CHAIN_POSITION_GAP | A chain position is missing | Usually a partial or interrupted restore; re-restore from a complete backup |
| CHAIN_LINK_BROKEN | An entry's previous_hash does not match the prior entry | Partial restore, or a backup taken mid-write; re-restore |
| CHAIN_GENESIS_INVALID | The first entry does not start at genesis | Truncated restore; re-restore |
| CHAIN_HASH_MISMATCH | A stored hash does not match sha256(cose_sign1) | If the backup itself verifies clean, the restore corrupted bytes; re-restore. If the backup also fails here, the backup faithfully captured a real tamper you must investigate |

The discriminator: verify the backup (its NDJSON dump) independently. If the backup verifies and the restore does not, the restore is at fault — repeat it. If the backup itself fails, recovery will not paper over it; you have a real integrity problem from before the backup was taken.
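The discriminator reduces to a small decision table over two results: did the backup verify, and did the restored database verify. A sketch, with diagnose_restore as a hypothetical helper that takes "pass" or "fail" for each:

```shell
# Sketch: apply the backup-versus-restore discriminator.
# Usage: diagnose_restore BACKUP_RESULT RESTORE_RESULT   (each "pass" or "fail")
# diagnose_restore is a hypothetical helper; the messages paraphrase this runbook.
diagnose_restore() {
  local backup="$1"
  local restored="$2"
  case "$backup:$restored" in
    pass:pass)
      echo "recovery complete" ;;
    pass:fail)
      echo "restore at fault: repeat the restore" ;;
    fail:pass|fail:fail)
      # The backup itself does not verify, so re-restoring cannot help.
      echo "integrity problem predates the backup: investigate" ;;
    *)
      echo "usage: diagnose_restore pass|fail pass|fail" >&2
      return 2 ;;
  esac
}
```

The ordering matters: the backup result is consulted first, because a failing backup makes the restore result uninformative.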

Signing-key recovery

The vault_signing_keys registry is part of the database dump, so it returns with the restore — the 2 signing keys above are the active key plus a rotated-out retired key, both back. Records signed under the retired key still verify, because the engine resolves historical keys from the registry.

Restoring the registry is not the same as restoring the ability to sign. New records need the private key in VAULT_SIGNING_KEY (and VAULT_SIGNING_KEY_PREVIOUS), which lives in your secret store, not the backup. Restore the database and set those env vars; on boot the engine reconciles the registry to the configured key (see the day-2 operations runbook). If the registry row is ever lost, the VAULT_SIGNING_KEY env var is the fallback the engine bootstraps from.
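Since the private key lives outside the backup, a restore job can check that the environment carries it before the Server boots. The sketch below is a hypothetical pre-boot check; it assumes only the two variable names given above and never prints their values.

```shell
# Sketch: confirm the signing environment is present before booting the Server.
# check_signing_env is a hypothetical helper; it tests presence only and
# deliberately never echoes key material.
check_signing_env() {
  if [ -z "${VAULT_SIGNING_KEY:-}" ]; then
    echo "VAULT_SIGNING_KEY is not set: new records cannot be signed" >&2
    return 1
  fi
  if [ -z "${VAULT_SIGNING_KEY_PREVIOUS:-}" ]; then
    # Treated as a note, not an error: a deployment that has never
    # rotated keys may have no previous key to supply.
    echo "note: VAULT_SIGNING_KEY_PREVIOUS is not set" >&2
  fi
  echo "signing key environment present"
}
```

Treating a missing VAULT_SIGNING_KEY_PREVIOUS as a note rather than a failure is a design choice of this sketch, not documented engine behavior.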

What recovery cannot do

You cannot rewrite history to "fix" a chain. A record that was lost before the backup shows up as a chain gap (CHAIN_POSITION_GAP), and that gap is the honest record of what happened — there is nothing to restore it from and nothing legitimate to paper over it with. Recovery brings back what the backup holds and proves it; it does not manufacture entries.

Multi-Server recovery

Each Server owns its own database and restores its own slice independently — there is no shared store to coordinate. Cross-Server delegation chains re-link by signature reference, not by restoring a common database: once each Server is restored and verifying, the links across them resolve through the published public keys. Restore and verify Server by Server; the federation re-forms on its own.
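Because each Server recovers independently, a multi-Server recovery is a loop over single-Server recoveries. The sketch below shows one way to drive that loop and keep going when one Server fails. recover_each is a hypothetical helper, and the per-Server step it calls (restore, readiness wait, verify) is something you would supply for your own deployment.

```shell
# Sketch: run a per-Server recovery step for each Server, independently.
# Usage: recover_each STEP SERVER...
# STEP is the name of a function or command you provide (hypothetical here);
# it receives one Server name and returns 0 only if that Server restored
# and verified cleanly.
recover_each() {
  local step="$1"
  local failed=""
  local server
  shift
  for server in "$@"; do
    if "$step" "$server"; then
      echo "[ok] $server"
    else
      # Keep going: one Server's failure does not block the others.
      echo "[failed] $server"
      failed="$failed $server"
    fi
  done
  # Succeed only if every Server succeeded.
  [ -z "$failed" ]
}
```

A failure on one Server leaves the others restored and verifying; only the failed Server needs its restore repeated.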


Validated against API v0.25.4 on 2026-05-25.