# Audit-an-agentic-stack — Working checklist

Source: Black Box Notes, Methodology §01. https://blackboxnotes.com/methodology/audit-an-agentic-stack/
Licence: CC-BY 4.0. Attribution: "Adapted from the Black Box Notes audit-an-agentic-stack checklist."
Version: 2026-05-22.

## Phase 1 — Scope and access
- [ ] Engagement letter signed
- [ ] Boundary defined (model / orchestration / tools / storage / interface)
- [ ] Scope defined (production behaviour / safety properties / regulatory alignment / procurement attestation)
- [ ] Access defined (trace logs / configuration files / production access / interview privilege)
- [ ] Output defined (confidential report / procurement letter / public attestation / regulatory filing)

## Phase 2 — Architecture mapping
- [ ] Operator's architecture diagram obtained
- [ ] Auditor's architecture diagram produced from instrumentation
- [ ] Externally consequential decision boundaries identified
- [ ] Routing / fallback / evaluator / retrieval / moderation layers explicit
- [ ] Memory store characterised (population pipeline, retention, retrieval)
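
One way to compare the operator's diagram with the auditor's is to treat each as a set of directed edges and diff them. This is a minimal sketch, assuming both diagrams have been transcribed by hand into `(source, target)` pairs. The component names and the `diff_diagrams` helper are hypothetical and not part of the methodology.

```python
# Hypothetical sketch: diff two architecture diagrams expressed as edge sets.
Edge = tuple[str, str]


def diff_diagrams(operator: set[Edge], auditor: set[Edge]) -> dict[str, set[Edge]]:
    """Return edges grouped by whether each diagram contains them."""
    return {
        # Observed through instrumentation but missing from the operator's diagram.
        "undocumented": auditor - operator,
        # Drawn by the operator but never observed by the auditor.
        "unobserved": operator - auditor,
        # Present in both diagrams.
        "confirmed": operator & auditor,
    }


# Illustrative component names only.
operator_edges = {
    ("interface", "orchestrator"),
    ("orchestrator", "model"),
    ("orchestrator", "tools"),
}
auditor_edges = {
    ("interface", "orchestrator"),
    ("orchestrator", "model"),
    ("orchestrator", "fallback_model"),
    ("orchestrator", "tools"),
}

result = diff_diagrams(operator_edges, auditor_edges)
for label, edges in result.items():
    print(label, sorted(edges))
```

Edges in the `undocumented` group are natural candidates for the routing, fallback, and moderation layers that the checklist asks to be made explicit.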

## Phase 3 — Trace and instrumentation review
- [ ] Input prompts captured verbatim
- [ ] Model responses captured verbatim
- [ ] Retrieval context captured
- [ ] Tool calls and arguments captured
- [ ] Tool responses captured
- [ ] Final action captured and correlated to user/session/outcome
- [ ] Trace fidelity scored out of the six capture items above (target: ≥ 5/6 for consequential-behaviour audits)
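
The fidelity score can be tallied mechanically. The sketch below assumes one point per capture item in this phase, equally weighted. The checklist gives only the 5/6 target, so the weighting, the `TraceCapture` record, and the function names are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TraceCapture:
    """One flag per Phase 3 capture item (hypothetical field names)."""
    input_prompts: bool
    model_responses: bool
    retrieval_context: bool
    tool_calls: bool
    tool_responses: bool
    final_action: bool


def fidelity_score(capture: TraceCapture) -> int:
    """Count captured items, assuming one point each."""
    return sum(bool(getattr(capture, f.name)) for f in fields(capture))


def meets_target(capture: TraceCapture, target: int = 5) -> bool:
    """Check the score against the 5/6 target for consequential-behaviour audits."""
    return fidelity_score(capture) >= target


# Example: everything captured except retrieval context.
example = TraceCapture(
    input_prompts=True,
    model_responses=True,
    retrieval_context=False,
    tool_calls=True,
    tool_responses=True,
    final_action=True,
)
print(f"{fidelity_score(example)}/6", meets_target(example))
```

An equal-weight count is the simplest reading. An audit that treats a missing final-action record as disqualifying would need a stricter rule than this one.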

## Phase 4 — Targeted re-execution
- [ ] 10 operator-flagged outputs sampled
- [ ] 20 random recent outputs sampled
- [ ] 10 outlier-signal outputs sampled
- [ ] Each re-run against current production
- [ ] Each re-run against historical production state
- [ ] Each re-run against controlled testbed
- [ ] Findings classified: Reproducible / Drift / Undocumented drift / Re-execution refused
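
The 10 / 20 / 10 sampling plan can be made repeatable by drawing from each pool with a fixed seed, so the operator can verify the selection afterwards. This is a minimal sketch: the pool names, the output identifiers, and the `draw_samples` helper are hypothetical, and it assumes each pool is already a list of output IDs with no overlap between pools.

```python
import random
from typing import Sequence

# Sample sizes taken from the checklist; the stratum keys are illustrative names.
SAMPLE_SIZES = {
    "operator_flagged": 10,
    "random_recent": 20,
    "outlier_signal": 10,
}


def draw_samples(pools: dict[str, Sequence[str]], seed: int) -> dict[str, list[str]]:
    """Draw the required number of output IDs from each pool without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    samples: dict[str, list[str]] = {}
    for stratum, size in SAMPLE_SIZES.items():
        pool = list(pools[stratum])
        if len(pool) < size:
            raise ValueError(f"{stratum}: need {size} outputs, pool has {len(pool)}")
        samples[stratum] = rng.sample(pool, size)
    return samples


# Placeholder pools; real pools would come from the operator's trace logs.
pools = {
    "operator_flagged": [f"flagged-{i}" for i in range(15)],
    "random_recent": [f"recent-{i}" for i in range(200)],
    "outlier_signal": [f"outlier-{i}" for i in range(12)],
}
samples = draw_samples(pools, seed=7)
print({stratum: len(ids) for stratum, ids in samples.items()})
```

Recording the seed and the pool contents in the appendix lets a third party reproduce the same 40 samples.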

## Phase 5 — Failure-mode interviews
- [ ] At least two engineering team members interviewed
- [ ] Known failure modes catalogued
- [ ] Recent incidents reviewed against incident-response playbook
- [ ] Escalation rules for consequential output documented
- [ ] Gap between documented knowledge and interview findings recorded

## Phase 6 — Report
- [ ] Findings ledger maintained (artefact / evidence / severity / recommendation)
- [ ] Recommendations scored by remediation cost and risk reduction
- [ ] Subject-notification window observed (72 hours)
- [ ] Executive summary cites specific paragraphs in the body
- [ ] Appendix lists every artefact reviewed, by version and timestamp
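
The findings ledger and the cost/risk scoring can be kept in one small record type. The sketch below uses the four ledger fields named above. The 1 to 5 scales and the ratio used for ranking are assumptions, since the checklist does not prescribe a scoring formula, and the ledger entries are invented placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One ledger row: the four checklist fields plus two assumed scores."""
    artefact: str
    evidence: str
    severity: str
    recommendation: str
    remediation_cost: int  # assumed scale: 1 (cheap) to 5 (expensive)
    risk_reduction: int    # assumed scale: 1 (small) to 5 (large)


def priority(finding: Finding) -> float:
    """Assumed ranking rule: risk reduction per unit of remediation cost."""
    return finding.risk_reduction / finding.remediation_cost


def rank(ledger: list[Finding]) -> list[Finding]:
    """Order findings from highest to lowest priority."""
    return sorted(ledger, key=priority, reverse=True)


# Placeholder entries for illustration only.
ledger = [
    Finding("evaluator layer", "re-execution sample C", "low",
            "Pin evaluator model version", remediation_cost=4, risk_reduction=2),
    Finding("router config", "trace excerpt A", "high",
            "Log fallback routing decisions", remediation_cost=2, risk_reduction=5),
    Finding("memory store", "interview note B", "medium",
            "Document retention policy", remediation_cost=1, risk_reduction=2),
]

ranked = rank(ledger)
for f in ranked:
    print(f"{priority(f):.1f}  {f.artefact}: {f.recommendation}")
```

A ratio favours cheap fixes. If severity should dominate, sorting by severity first and by ratio second is a reasonable alternative.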

## Post-report
- [ ] Operator response received and annexed where appropriate
- [ ] Disputed findings published in the auditor's voice, with the operator's reply
- [ ] Declined recommendations noted, with the operator's reasons
- [ ] Report distributed to scoped recipients only
