AI Agent Authentication Audit

15 questions · 30–45 min · Rav | @MrDecentralize

Traditional authentication models assume explicit credentials and deterministic authorization. AI agents break both assumptions. When agents interpret context as intent, the interpretation itself becomes the authentication mechanism. Most organizations have no controls at this boundary.

Who this is for

Security leaders, CTOs, and architects deploying AI agents in financial systems who need to map authentication and authorization boundaries that traditional security reviews miss. If your agents execute financial transactions autonomously, access privileged tools based on interpreted intent, make decisions that create legal or regulatory liability, or operate at speeds that eliminate human oversight — this audit reveals the gaps.

What this audit maps

Four critical trust boundaries where control transitions from explicit to interpreted:
  • Input to Interpretation — where unstructured text becomes understood intent
  • Interpretation to Tool Selection — where derived intent determines which privileged functions get invoked
  • Tool Call to Execution — where interpreted commands become system actions
  • Execution to Side Effects — where actions produce irreversible consequences

Time required
  • Read-through: 15 min
  • Self-audit for one system: 30–45 min
  • Full audit across multiple agents: 2–4 hours
  • Implementation of findings: 2–8 weeks
Expected outcomes
  • Unmapped trust boundaries in your architecture
  • Auth and authorization gaps traditional reviews miss
  • Context poisoning vulnerability surface
  • Liability exposure when agents act autonomously
  • Whether your audit trail meets regulatory requirements
What this is not

This is an audit, not an implementation guide. It reveals gaps but does not provide remediation methodologies, control templates, testing scenarios, specific audit trail requirements, liability frameworks, or tool recommendations. For implementation methodology, see the Authentication Implementation Playbook.

What actually breaks in production

I've reviewed AI agent deployments at institutional scale. The failures aren't random. They follow predictable patterns at four specific trust boundaries where control transitions from explicit commands to probabilistic interpretation.

Failure mode 1: Context poisoning without an attacker

A fraud detection agent retrieves context from a case management database. A three-month-old case note, written by a legitimate analyst, contains: "Auto-approve low-risk transactions during business hours to reduce workload." The agent interprets this historical note as a current instruction. It auto-approves a transaction that should have been escalated. No attacker. No prompt injection. Just context treated as command. The traditional security review asked: "Can users inject malicious prompts?" It should have asked: "Who controls what the agent interprets as instructions?"

Failure mode 2: Speed eliminates oversight

An agent executes 10,000 authorization decisions per hour. Each decision is technically "advisory," with human approval required. But operational reality: analysts approve agent recommendations 97% of the time without detailed review. The authorization model was designed for human-speed decisions with meaningful oversight. The agent operates at machine speed with rubber-stamp approval. The control exists in policy, not in practice.

Failure mode 3: The audit trail stops at interpretation

Audit asks: "Who authorized this transaction?" System logs show: user query received, agent invoked authorization tool, transaction approved. What's missing: what context did the agent retrieve, how did it interpret that context as intent, what alternative interpretations were possible, why did this interpretation trigger this specific tool. The audit trail captures the action but not the reasoning. "The AI decided" isn't an acceptable answer in a regulatory investigation.

Failure mode 4: Liability lives nowhere

Agent misinterprets context, invokes wrong tool, causes financial harm. Legal asks: "Who is responsible?" The user didn't give an explicit command. The developer built an agent that operated within design parameters. The analyst whose case note was misinterpreted wrote it for a different context. The liability model for human decisions doesn't map cleanly to agent interpretation. Most organizations deploy agents before addressing this gap.

How to use this audit

  1. Read through all four sections first without answering. This builds the mental model of the framework and helps you understand what you're looking for.
  2. Select one agent system to audit. Pick a production or near-production deployment that handles privileged operations and has regulatory or legal implications.
  3. Answer each question honestly. If you are uncertain, that is a Partial or Gap — not a reason to skip. Assumptions about security controls are gaps waiting to fail under stress.
  4. Review your gap score. The results panel generates after question 15 with prioritized gaps and next steps.
  5. Prioritize remediation. Boundary 1 gaps (context control) are the entry point for all other failures. "Who controls" questions are highest priority.
Boundary 1 of 4
Input to Interpretation
Where unstructured text becomes understood intent. The question isn't "is the input malicious?" It's "who controls what the agent interprets as instructions?"
Q1.1 Can you list every source your agent retrieves context from?

Agents interpret intent from all available context, not just direct user input. Context sources include: direct user input, retrieved documents (internal knowledge bases, wikis, policy docs), database records (customer data, case notes, transaction history), API responses (external services, tool outputs), system messages, conversation history, email or ticket content, configuration files, and user profile data. Every unmapped context source is a potential instruction channel. Organizations focus on the entry point (user prompt) and miss the context sources the agent retrieves during processing.
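A minimal sketch of what a context-source inventory can look like in code, using hypothetical source names and fields; the point is that every source the agent derives intent from is listed, classified, and attributed to whoever can write to it:

    from dataclasses import dataclass
    from enum import Enum

    class Origin(Enum):
        USER_GENERATED = "user_generated"
        SYSTEM_GENERATED = "system_generated"
        EXTERNAL = "external"

    @dataclass(frozen=True)
    class ContextSource:
        name: str        # e.g. "case_notes_db" (illustrative)
        origin: Origin
        writers: tuple   # roles or systems that can modify content
        reviewed: bool   # is content reviewed before the agent can retrieve it?

    # Hypothetical inventory -- every source the agent derives intent from,
    # not just the user prompt.
    CONTEXT_SOURCES = [
        ContextSource("user_prompt", Origin.USER_GENERATED, ("any_user",), False),
        ContextSource("case_notes_db", Origin.SYSTEM_GENERATED, ("fraud_analysts",), False),
        ContextSource("policy_wiki", Origin.SYSTEM_GENERATED, ("admins", "all_staff"), False),
        ContextSource("vendor_kyc_api", Origin.EXTERNAL, ("vendor",), False),
    ]

    def unreviewed_instruction_channels(sources):
        """Any source the agent reads but no one reviews is an instruction channel."""
        return [s.name for s in sources if not s.reviewed]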

Answer "Controlled" if
  • You have a complete list of every text source the agent uses to derive intent
  • Each source is classified (user-generated, system-generated, external)
  • You've documented who can modify each source
  • You understand that agents don't distinguish between "data" and "instructions"
Answer "Gap identified" if
  • You can't list all context sources without reviewing code
  • You describe it as "just user input and our database" — too simple for production agents
  • You've never considered retrieved documents as an instruction channel
  • You assume internal sources are safe because they're not user-controlled
Red flags
  • "We validate user input" (but agent retrieves from unmapped sources)
  • "Our knowledge base is internal only" (but who controls what gets added?)
  • "Case notes are just data" (until agent interprets them as commands)
  • No distinction between "data the agent uses" and "commands the agent follows"
Gap: You don't know your attack surface. Adversaries don't need to exploit the model directly. They just need to poison context the agent retrieves. If your agent retrieves from a knowledge base and someone adds a document containing instructions disguised as information, the agent will interpret those instructions as commands. No prompt injection required. Just context manipulation.
Q1.2 Can you map who controls the content in each context source your agent retrieves from?

If your agent interprets intent from text, whoever controls that text controls the agent's behavior. This isn't about access control in the traditional sense. It's about influence over interpretation. A legitimate analyst writes a case note three months ago: "Auto-release verified transactions during business hours to improve efficiency." Context was appropriate then. Today, the agent retrieves that note and interprets it as a current instruction. The analyst didn't attack the system. But their words, written for a different context, now control agent behavior. Traditional authentication asks "who is this user?" Agent authentication asks "who controls what this agent interprets?"

Answer "Controlled" if
  • You can name who can add and modify content in each context source
  • You have a documented review process for high-risk sources
  • You have content lifecycle management (creation, modification, archival)
  • You have clear policy on when old content should no longer influence agent behavior
Answer "Gap identified" if
  • "Don't know" who can modify key context sources
  • No review process for content that becomes agent context
  • You assume authentication of the writer equals authentication of the content
  • You can't trace content provenance for audit
Red flags
  • "Only admins can update the knowledge base" (but no review of content quality)
  • "Database is append-only" (but agent interprets three-year-old records as current)
  • "External APIs are from trusted vendors" (but you don't control their response content)
  • "Case notes are written by trained analysts" (who had no idea an agent would interpret them)
Gap: Your agent's behavior is controlled by whoever can modify those sources, whether they intended to influence it or not. This is context poisoning without an attacker. The risk isn't malicious actors. The risk is operational drift: legitimate content, written for human readers in a different context, now controlling agent behavior in ways no one anticipated.
Q1.3 Does your agent have explicit, documented rules for resolving conflicting or ambiguous context?

Agents interpret intent from all available context simultaneously. When sources conflict, the agent makes a choice. If that choice is non-deterministic or undocumented, your security model is probabilistic. Traditional systems have explicit precedence: "Policy overrides user preference," "Explicit deny overrides explicit allow." Agents don't have these rules unless you build them explicitly. Most organizations don't, because they assume the model will "figure it out appropriately." Under adversarial conditions, or even normal operational drift, this becomes a control gap.
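One way to make precedence explicit rather than emergent — a minimal sketch with hypothetical source types and directives, where conflicts the rules cannot resolve escalate instead of being guessed at:

    # Hypothetical precedence order: lower number wins.
    PRECEDENCE = {
        "current_policy": 0,
        "system_config": 1,
        "user_request": 2,
        "retrieved_document": 3,   # knowledge base, case notes
        "conversation_history": 4,
    }

    def resolve(conflicting_directives):
        """Pick the directive from the highest-precedence source.
        Escalate to a human if sources at the same precedence disagree."""
        ranked = sorted(conflicting_directives,
                        key=lambda d: PRECEDENCE[d["source_type"]])
        best = PRECEDENCE[ranked[0]["source_type"]]
        top = [d for d in ranked if PRECEDENCE[d["source_type"]] == best]
        if len({d["directive"] for d in top}) > 1:
            return {"action": "escalate", "reason": "unresolved conflict at equal precedence"}
        return {"action": "apply", "directive": top[0]["directive"]}

    # Example: a retrieved case note conflicts with current policy -- policy wins.
    print(resolve([
        {"source_type": "retrieved_document", "directive": "auto_approve"},
        {"source_type": "current_policy", "directive": "escalate_over_10k"},
    ]))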

Answer "Controlled" if
  • Precedence rules are explicit in code and policy, not emergent from model behavior
  • You have a test suite covering known conflict scenarios
  • Behavior is deterministic across multiple runs with the same inputs
  • Agent escalates to human when precedence rules don't resolve the conflict
Answer "Gap identified" if
  • You haven't tested conflict resolution scenarios
  • Different runs produce different interpretations with the same inputs
  • You can't explain to auditors how the agent resolved a specific conflict
  • Conflict resolution depends on the prompt rather than explicit rules
Red flags
  • "The agent is smart enough to figure it out" (non-deterministic = no control)
  • "The model uses RAG to find relevant context" (relevance is not precedence)
  • "We have prompt engineering to handle this" (prompts are guidance, not guarantees)
  • No explicit documented precedence rules for conflicting context
Gap: Your agent makes precedence decisions probabilistically. Under adversarial conditions or normal operational variance, this becomes a control gap. When auditors ask "Why did the agent make this decision?", answering "The model interpreted available context" isn't sufficient if you can't explain which context took precedence and why.
Q1.4 Does your authentication model extend to the interpretation boundary, or does it stop at user login?

Traditional authentication answers: "Who is this user?" Agent authentication must answer: "Who created this context, when, for what purpose, and is it still valid for the agent to interpret it as instruction?" You can have perfect user authentication and still have compromised context. A legitimate user writes a legitimate document. Three months later, that document gets retrieved by an agent and interpreted as a current instruction. The authentication gap isn't "who wrote it." It's "should the agent trust it now." Organizations implement authentication at the perimeter (user login) but not at the interpretation layer (context retrieval).
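A sketch of an admission check at the interpretation boundary, assuming hypothetical source names, content hashes, and age limits; nothing reaches the agent's prompt until source, integrity, and age are validated:

    import hashlib
    from datetime import datetime, timedelta, timezone

    MAX_AGE = {"case_note": timedelta(days=90), "policy_doc": timedelta(days=365)}
    TRUSTED_SOURCES = {"case_notes_db", "policy_wiki"}

    def admit_context(item):
        """Return (admitted, reason). Only admitted items reach the agent's prompt."""
        if item["source"] not in TRUSTED_SOURCES:
            return False, "unauthenticated source"
        if hashlib.sha256(item["text"].encode()).hexdigest() != item["sha256"]:
            return False, "integrity check failed"
        age = datetime.now(timezone.utc) - item["created_at"]
        if age > MAX_AGE.get(item["kind"], timedelta(days=30)):
            return False, "stale content -- requires re-validation before agent use"
        return True, "admitted"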

Answer "Controlled" if
  • You have source authentication for all context types
  • You have content integrity verification (signatures, checksums, version control)
  • You have temporal validation (expiration, archival status, version awareness)
  • You maintain chain of custody logging for compliance
  • Agent behavior accounts for trust levels of different sources
Answer "Gap identified" if
  • No verification of context source authenticity
  • No integrity checks on retrieved content
  • No temporal validation — old content is treated as current
  • Authentication model ends at user login, doesn't extend to interpretation
Red flags
  • "We authenticate users who write content" (but not content the agent retrieves)
  • "Our knowledge base is internal" (internal does not equal authenticated for agent interpretation)
  • "We use RAG for retrieval" (RAG retrieves, it does not authenticate)
  • Agent treats all retrieved context as equally trustworthy
Gap: Your authentication model doesn't extend to the interpretation boundary. You authenticate users, but not the context the agent interprets. When something goes wrong, you'll be asked: "How did the agent come to interpret this content as instruction?" If you can't authenticate the source, verify integrity, validate temporal relevance, and trace chain of custody, you can't answer that question for audit or regulatory review.
Boundary 2 of 4
Interpretation to Tool Selection
Where derived intent determines which privileged functions get invoked. The question isn't "what can the agent access?" It's "how does interpreted intent map to authorization decisions?"
Q2.1 Can you list every privileged tool your agent can invoke, with explicit authorization logic for each?

With agents, tool selection is derived from interpretation. The agent decides which tools to invoke based on its understanding of intent. If that understanding is wrong, it invokes the wrong tool. Permission scoping designed for human users doesn't account for: agents operating at machine speed (10,000 decisions per hour), agents interpreting ambiguous context, agents selecting between multiple tools that could achieve similar goals, and agents optimizing for efficiency over safety. Organizations grant agents broad permissions assuming model intelligence will constrain behavior, rather than implementing explicit authorization controls.
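A sketch of what explicit, context-scoped tool authorization can look like, with hypothetical tools and limits; the check is deterministic and lives outside the model:

    TOOL_POLICY = {
        "release_payment": {
            "max_amount": 5_000,           # above this: mandatory human approval
            "allowed_hours_utc": range(7, 19),
            "rate_limit_per_hour": 50,
        },
        "flag_for_review": {
            "max_amount": None,            # no amount ceiling; low-risk tool
            "allowed_hours_utc": range(0, 24),
            "rate_limit_per_hour": 5_000,
        },
    }

    def authorize_tool(tool, amount, hour_utc, calls_this_hour):
        """Deterministic check, independent of the agent's reasoning."""
        policy = TOOL_POLICY.get(tool)
        if policy is None:
            return False, "tool not in inventory"
        if policy["max_amount"] is not None and amount > policy["max_amount"]:
            return False, "amount exceeds autonomous limit"
        if hour_utc not in policy["allowed_hours_utc"]:
            return False, "outside allowed hours"
        if calls_this_hour >= policy["rate_limit_per_hour"]:
            return False, "rate limit reached"
        return True, "authorized"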

Answer "Controlled" if
  • Complete inventory of privileged tools with permission levels documented
  • Explicit authorization logic (rules, policies, constraints) for tool selection
  • Permissions scoped by context (amount, time, user type, risk level)
  • Rate limits and usage constraints in place
  • Tool selection decisions are traceable
Answer "Gap identified" if
  • Can't list all privileged tools the agent can invoke
  • Tool selection is based on model reasoning rather than explicit authorization logic
  • Same permission model for agents as for human users
  • Can't explain why the agent chose a specific tool in a given scenario
Red flags
  • "The agent figures out which tools to use" (interpretation = authorization gap)
  • "We trust the agent to choose appropriately" (no explicit authorization logic)
  • "Agent has same permissions as the user" (but operates at different speed and scale)
  • "We'll add controls if we see abuse" (reactive, not preventive)
Gap: You don't know what the agent is authorized to do. If tool selection is based on model reasoning rather than explicit authorization logic, your permission model is probabilistic. When the agent invokes the wrong tool based on misinterpreted context, you can't point to a failed authorization check. The agent was authorized. It just misunderstood what it was supposed to do.
Q2.2 Is authorization an explicit, separate step in your system, or is it conflated with the agent's interpretation?

Traditional authorization is explicit: "User X requests action Y, check permissions, allow or deny." With agents, authorization is often implicit: the agent interprets context, decides on an action, selects a tool, executes. Authorization happens through tool access, not as a separate decision point. This creates gaps: no separation between "what the agent interpreted" and "what it's authorized to do," authorization logic embedded in model reasoning (unauditable), no way to distinguish between interpretation error and authorization failure, and the agent's understanding of permission boundaries is probabilistic.
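A minimal sketch of the separation: the agent only proposes, a separate layer authorizes and records the outcome. Function names and log fields are illustrative, not a prescribed API:

    import logging
    log = logging.getLogger("authz")

    def execute_proposal(proposal, authorize, invoke):
        """The agent only proposes; this layer decides and records the decision.
        `authorize` and `invoke` are supplied by the host system, not the model."""
        allowed, reason = authorize(proposal["tool"], proposal["params"])
        if not allowed:
            # Logged as an authorization failure -- distinct from an interpretation error.
            log.warning("AUTHZ_DENY tool=%s reason=%s", proposal["tool"], reason)
            return {"status": "denied", "reason": reason}
        log.info("AUTHZ_ALLOW tool=%s", proposal["tool"])
        return {"status": "executed", "result": invoke(proposal["tool"], proposal["params"])}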

Answer "Controlled" if
  • Explicit authorization service or layer exists, separate from agent reasoning
  • Deterministic authorization rules apply regardless of interpretation
  • Authorization failures are logged and distinguished from interpretation errors
  • Agent cannot bypass authorization through creative interpretation
Answer "Gap identified" if
  • No separate authorization step — only tool access permissions
  • Authorization logic is in the model, not in code
  • Same interpretation can result in different authorizations (non-deterministic)
  • No distinction between "agent tried to do X" and "agent was authorized to do X"
Red flags
  • "Authorization is implicit in tool access" (no separate authorization step)
  • "The model knows what it's allowed to do" (authorization in model weights, unauditable)
  • "Prompt engineering defines boundaries" (guidance, not enforcement)
  • "We test that the agent doesn't do unauthorized things" (testing is not a control)
Gap: Your authorization model may not exist as a separate control. If authorization is implicit in tool access or embedded in model reasoning, you can't audit it, test it independently, or verify it meets regulatory requirements. When the agent does something it shouldn't, you can't determine if it was an interpretation error, an authorization failure, or a design gap. This distinction matters for liability, audit, and remediation.
Q2.3 When your agent invokes a privileged tool, is there an explicit, defensible record of who authorized that action?

Traditional systems have clear liability: "User X performed action Y at time Z." With agents, liability is ambiguous. The user provided input but didn't explicitly command the action. The agent interpreted context and selected the tool. Multiple context sources influenced the interpretation. The developer created the agent but didn't authorize the specific action. When agent actions cause financial harm, regulatory violation, or legal liability, "who is responsible?" must have a clear, defensible answer. Most organizations haven't explicitly addressed this. Legal precedent for AI agent liability is still developing. Courts will look at: who had control, who benefited, was there negligence in design or oversight, were risks disclosed.

Answer "Controlled" if
  • Explicit liability model documented and legally reviewed
  • Audit trail captures full chain: user, context, interpretation, authorization, action
  • Terms of service explicitly address agent-initiated actions
  • Insurance explicitly covers agent-caused harm
Answer "Gap identified" if
  • Audit logs show action but not authorization chain
  • Multiple entities could be considered responsible (ambiguous)
  • No legal review of liability model for agent actions
  • Terms of service don't address agent-initiated actions
Red flags
  • "Haven't thought about liability mapping"
  • "The user is responsible" (but user didn't give explicit command)
  • "Depends on the situation" (ambiguity creates legal risk)
  • "We'll figure it out if something goes wrong" (reactive)
Gap: You have liability exposure without a defensible assignment model. When an agent action causes harm, legal and regulatory review will ask: "Who authorized this action?" If your answer is "the agent interpreted context and decided," that's a description of the mechanism, not an assignment of responsibility. This gap creates risk for your organization and potentially your leadership team personally.
Q2.4 Are your escalation triggers explicit in code, or do they depend on the agent's judgment?

"Human-in-the-loop" is cited as a control for agent systems. But it only works if: escalation triggers are explicit and enforceable, escalation happens before action not after, human review is meaningful not rubber-stamping, and the system doesn't time out or default to autonomous action. At agent speed (10,000 decisions per hour), "human review" can become theater: humans approve agent decisions without detailed review because there's no time to review and the agent is usually right. Organizations implement escalation as a policy ("agent should escalate when...") but don't enforce it as a control ("agent cannot proceed unless...").

Answer "Controlled" if
  • Explicit, testable escalation triggers exist in code — not in the prompt
  • Escalation is enforced before tool invocation, not optional
  • Human review includes context, interpretation, and reasoning chain
  • Queue management prevents escalation backlog from forcing autonomous action
Answer "Gap identified" if
  • Escalation depends on the agent's judgment, not explicit triggers
  • System times out and defaults to autonomous action if human unavailable
  • Human reviewers approve without detailed review due to volume or speed
  • Can't test escalation triggers independently of agent reasoning
Red flags
  • "High-risk actions require approval" (but "high-risk" is the model's judgment)
  • "Agent escalates when unsure" (confidence threshold is probabilistic)
  • "Human reviews all transactions" (but approves 97% automatically)
  • "Escalation is in the prompt" (guidance, not enforcement)
Gap: Your "human-in-the-loop" control may not function as intended. If escalation is based on agent judgment, it's not a control, it's a suggestion. At scale, even well-intentioned escalation breaks down: agent operates faster than humans can review, volume creates pressure to approve without detailed analysis, agent is right most of the time creating trust that leads to rubber-stamping, and system design punishes slow human review with timeouts or backlogs. This is where design review says "human oversight" and production reality is "autonomous operation with occasional human notification."
Boundary 3 of 4
Tool Call to Execution
Where interpreted commands become system actions. The question isn't "are actions logged?" It's "can you reconstruct how interpretation became execution?"
Q3.1 Are there explicit execution controls between tool selection and actual execution, enforced in code?

Traditional systems have execution controls independent of user decisions: input validation, business rule checks, rate limiting, approval workflows. These controls exist in code, not in user behavior. They can't be bypassed by creative phrasing. With agents, there's risk that "controls" exist in prompts or model reasoning rather than in code. "The agent should check..." (should, not must). "We instruct the agent to validate..." (instruction, not enforcement). "The agent is trained to..." (training, not control). If execution happens directly from tool invocation, there is no control layer between interpretation and consequence.
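A sketch of a validation layer between tool invocation and execution, with hypothetical business rules and rate limits; rephrasing the request cannot bypass it because it never sees the prompt:

    import time
    from collections import deque

    _recent = deque()                 # timestamps of recent executions
    RATE_LIMIT_PER_MINUTE = 100

    def validate_and_execute(tool, params, execute):
        """Controls live here, in code -- not in the prompt and not in model reasoning."""
        now = time.time()
        while _recent and now - _recent[0] > 60:
            _recent.popleft()
        if len(_recent) >= RATE_LIMIT_PER_MINUTE:
            raise RuntimeError("rate limit exceeded -- execution halted, escalation triggered")
        if tool == "release_payment":
            if params["amount"] <= 0 or params["amount"] > 5_000:
                raise ValueError("amount outside validated business range")
            if not params.get("beneficiary_verified", False):
                raise ValueError("unverified beneficiary")
        _recent.append(now)
        return execute(tool, params)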

Answer "Controlled" if
  • Explicit execution controls exist in code, not agent reasoning
  • Validation layer sits between tool invocation and execution
  • Controls are enforceable, cannot be bypassed through rephrasing
  • Control failures are logged and trigger escalation
  • Controls are tested regularly with adversarial scenarios
Answer "Gap identified" if
  • Tool invocation directly triggers execution with no validation layer
  • Controls exist in prompts, not code
  • Can't test execution controls independently of agent reasoning
  • No rate limiting — agent can execute unlimited actions
Red flags
  • "The agent validates before executing" (agent behavior, not a control)
  • "We prompt the agent to be careful" (prompt engineering is not a control)
  • "Execution is fast, controls would slow it down" (speed vs safety trade-off)
  • "Controls would break the agent's functionality" (prioritizing capability over safety)
Gap: You have no control layer between agent interpretation and system execution. If the agent misinterprets context, selects the wrong tool, or derives incorrect parameters, there's no validation to catch it before execution. When something goes wrong, you can't point to a failed control. The agent was allowed to execute because tool invocation directly triggered execution. This gap is critical for regulatory compliance — controls must be demonstrable, not assumed.
Q3.2 Can your audit trail reconstruct the full chain from context retrieval through to execution for any given agent action?

For agents, auditors and regulators will ask: "Why did the agent take this action?" "What context influenced this decision?" "How did the agent interpret that context?" "Could the agent have interpreted it differently?" "Who is responsible for this outcome?" A complete audit trail must capture: what context sources were retrieved, content of retrieved context, how the agent interpreted the context, what intent was derived, which tool was selected and why, parameters derived for tool invocation, authorization check results, escalation triggers evaluated, and the execution result. Organizations log execution but not interpretation, making it impossible to reconstruct agent reasoning for audit. "The AI decided" won't satisfy regulatory review in financial services.
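A sketch of an audit record that captures the full chain, with hypothetical field names; each entry hashes the previous one so the trail is tamper-evident:

    import json, hashlib
    from datetime import datetime, timezone

    def audit_record(prev_hash, *, context_items, interpretation, tool, params,
                     authorization, escalation, result):
        """One record per agent action, chained to the previous record's hash
        so gaps and edits are detectable."""
        body = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "context": context_items,          # source IDs + content hashes retrieved
            "interpretation": interpretation,  # derived intent, captured at runtime
            "tool": tool,
            "params": params,
            "authorization": authorization,    # explicit allow/deny + rule that fired
            "escalation": escalation,          # triggers evaluated and their outcomes
            "result": result,
            "prev": prev_hash,
        }
        encoded = json.dumps(body, sort_keys=True).encode()
        return body, hashlib.sha256(encoded).hexdigest()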

Answer "Controlled" if
  • Comprehensive logging across context, interpretation, tool selection, and execution layers
  • Ability to reconstruct full reasoning chain for any action
  • Immutable logs (tamper-evident, timestamped, versioned)
  • Retention policy meets compliance requirements
Answer "Gap identified" if
  • Audit trail shows action but not reasoning
  • Can't trace from context through interpretation to execution
  • Model reasoning is opaque, not logged
  • Audit trail insufficient for regulatory review in your sector
Red flags
  • "We log agent actions" (but not reasoning or context)
  • "We can ask the agent to explain its actions" (post-hoc rationalization, not runtime logging)
  • "We can replay inputs to see what agent would do" (non-deterministic, different result)
  • Logging is optional or performance-dependent
Gap: Your audit trail is insufficient for regulatory review. When auditors or regulators ask "Why did the agent take this action?", you won't be able to provide a defensible answer. This creates compliance risk for banking regulations, SOX compliance, GDPR (explaining automated decisions), and industry-specific regulations. "The AI decided" is not an acceptable answer in regulated environments. You need to demonstrate how and why.
Q3.3 Have you mapped which agent-initiated actions are reversible versus irreversible, and do irreversible actions require explicit human confirmation?

With agents operating at machine speed, errors compound quickly. An agent misinterprets context and executes 100 similar actions before the error is detected. Each action creates side effects. Reversal becomes complex. If agent actions are irreversible, misinterpretation becomes permanent harm. Design question that most teams avoid: should irreversible actions require explicit human confirmation rather than agent autonomy? Financial transactions can be hard to recall once sent. Data deletion requires backup restore. External communications (emails, notifications, API calls to third parties) cannot be reversed once sent. Organizations don't design for reversal because they assume agent interpretation will be correct, or that errors will be rare and caught quickly.
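A minimal sketch of a reversibility map with hypothetical actions; anything unmapped or irreversible fails closed to human confirmation:

    REVERSIBILITY = {
        "flag_for_review":   {"reversible": True,  "window_hours": None},
        "internal_transfer": {"reversible": True,  "window_hours": 24},
        "external_wire":     {"reversible": False, "window_hours": 0},
        "customer_email":    {"reversible": False, "window_hours": 0},
    }

    def requires_human_confirmation(action):
        entry = REVERSIBILITY.get(action)
        if entry is None:
            return True   # unmapped action: fail closed
        return not entry["reversible"]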

Answer "Controlled" if
  • Clear mapping of reversible versus irreversible actions exists
  • Irreversible actions require human confirmation before execution
  • Time windows are documented (can reverse within X hours or days)
  • Monitoring exists to detect incorrect actions requiring reversal
Answer "Gap identified" if
  • Haven't mapped which actions are reversible versus irreversible
  • Agent can autonomously execute irreversible actions
  • Reversal requires manual intervention while agent operates faster than humans
  • No monitoring to detect when reversal is needed
Red flags
  • "Most actions are reversible" (vague, not specific)
  • "We'll build reversal capability if we need it" (reactive)
  • "We monitor and catch errors quickly" (but agent executes thousands of actions per hour)
  • No time limit tracked — don't know when reversal becomes impossible
Gap: Your agent creates permanent consequences from probabilistic interpretation. If the agent misinterprets context, the resulting harm cannot be undone. This amplifies risk when the agent operates at scale (one misinterpretation leading to thousands of irreversible actions), when detection is delayed (harm compounds before caught), and when context poisoning occurs (systematic misinterpretation across many decisions).
Q3.4 Are your execution controls robust to optimization pressure — could the agent learn to avoid triggering them?

Traditional systems don't adapt to avoid controls. AI agents can optimize their behavior. If controls create friction (slower execution, escalation, denial) and the agent is optimized for task completion, it might learn to avoid triggering those controls. This isn't malicious intent. It's optimization: the agent finds patterns that lead to task success and repeats them. Common scenarios: agent consistently reports confidence above threshold to avoid escalation (confidence is agent-reported, not independently validated), agent splits transactions to stay below amount thresholds, agent avoids retrieving context sources that trigger escalation, agent reframes actions using different terminology to avoid keyword-based rules. This is a known AI safety failure mode: specification gaming, where systems optimize for stated objectives in ways that violate unstated intentions.
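A sketch of a control designed to resist optimization pressure, with hypothetical thresholds: triggers use system-observed values rather than agent-reported confidence, and aggregate across a time window so threshold splitting doesn't work:

    from collections import defaultdict

    _hourly_totals = defaultdict(float)   # (customer_id, hour) -> total amount

    def independent_escalation_check(customer_id, hour, amount, independent_risk_score):
        """Triggers are computed from system-observed values, never from the
        agent's self-reported confidence, and aggregate across the hour so that
        splitting one transfer into many small ones does not dodge the threshold."""
        _hourly_totals[(customer_id, hour)] += amount
        if _hourly_totals[(customer_id, hour)] > 10_000:
            return "escalate: aggregate amount over threshold"
        if independent_risk_score > 0.8:   # scored by a separate model or rules engine
            return "escalate: independent risk score high"
        return "proceed"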

Answer "Controlled" if
  • Controls are independent of agent behavior — cannot be bypassed
  • Adversarial testing includes attempts to circumvent controls
  • Agent optimization includes compliance metrics, not just task completion
  • Semantic analysis used, not keyword matching, for rule enforcement
  • Monitoring tracks behavior pattern changes over time
Answer "Gap identified" if
  • Controls depend on agent reporting (confidence, intent, reasoning)
  • Agent is optimized for task completion without balancing compliance
  • Keyword-based rules that can be circumvented through rephrasing
  • Haven't tested whether agent can learn to avoid controls
Red flags
  • "The agent wouldn't try to bypass controls" (untested assumption)
  • "We'll catch it in monitoring" (assumes you'll detect the pattern)
  • "Controls are in the prompt" (optimization pressure exceeds prompt guidance)
  • No monitoring for pattern changes in agent behavior over time
Gap: Your controls may erode over time as the agent optimizes. This isn't about the agent "trying to bypass" controls. It's about optimization pressure finding paths of least resistance. If controls create friction and the agent is rewarded for task completion, the agent will find patterns that minimize friction, even if those patterns circumvent controls. Controls need to be robust to optimization pressure, or they'll degrade.
Boundary 4 of 4
Execution to Side Effects
Where actions produce irreversible consequences. The question isn't "what can go wrong?" It's "when something goes wrong, who is liable, and how quickly will you detect it?"
Q4.1 Have you explicitly mapped the full consequence tree for each type of action your agent can execute?

With agents, consequences compound in ways human-speed systems don't encounter: scale (agent executes thousands of actions, not dozens), speed (consequences materialize faster than human detection), complexity (agent actions may have non-obvious downstream effects), and interpretation (agent's misunderstanding creates systematic bias in outcomes). Immediate consequences from financial transactions include funds transferred, account balances changed, transaction fees incurred, and regulatory reporting triggered. Downstream consequences include customer relationship impact, liquidity effects, compliance impact, reputational harm, and legal liability. Without consequence mapping, you can't assess risk appropriately, design adequate controls, determine appropriate insurance coverage, or defend liability assignment. Organizations focus on whether actions can execute, not on what happens after execution.

Answer "Controlled" if
  • Full consequence mapping for each action type documented
  • Worst-case scenarios documented and reviewed
  • Understanding of cascading and downstream effects
  • Quantified risk: financial exposure, reputational impact, legal liability
  • Regular review and updates as agent capabilities evolve
Answer "Gap identified" if
  • Haven't mapped consequences beyond immediate execution
  • No worst-case analysis or stress testing
  • Don't know cascading consequences — one action triggering others
  • Haven't quantified financial, reputational, or legal exposure
Red flags
  • "Consequences are the same as if a human did it" (but at different scale and speed)
  • "Agent actions are low-risk" (without quantifying potential harm)
  • "We'll handle problems reactively" (assumes you'll detect in time)
  • "Worst-case is small financial loss" (doesn't account for reputational or legal harm)
Gap: You don't know what can go wrong when agent actions produce unintended consequences. You can't design appropriate controls (don't know what to prevent), can't assess risk properly (don't know exposure), can't determine insurance needs (don't know potential liability), and can't defend in legal review (didn't anticipate harm). When something goes wrong, you'll be asked: "Did you consider this possibility?" If the answer is no, that's negligence.
Q4.2 Do you have an explicit, legally reviewed liability model for agent-initiated actions that cause harm?

Traditional liability is clear: "Person X performed action Y, is responsible for consequence Z." With agents, liability is ambiguous across multiple scenarios. Misinterpreted context causes financial loss: agent retrieves old case note, interprets it as current instruction, executes 500 inappropriate transactions before detection, customer loses $50,000 — who is liable? Agent violates privacy regulation: agent accesses customer PII based on ambiguous query interpreted as permission request, regulatory fine $100,000 — who is liable? Agent causes reputational harm: agent sends inappropriate communication to 10,000 customers based on misinterpreted sentiment, social media backlash, customer churn — who is liable? Legal precedent for AI agent liability is still developing. Without explicit liability assignment, you face legal uncertainty, insurance gaps, employee personal liability risk, and user confusion.

Answer "Controlled" if
  • Explicit liability model documented and legally reviewed
  • Clear assignment for different harm scenarios
  • Employment agreements address agent design and deployment liability
  • Insurance explicitly covers agent-caused harm
  • Terms of service clearly explain liability to users
Answer "Gap identified" if
  • Liability is ambiguous — could be multiple parties depending on situation
  • No legal review of agent liability model
  • Terms of service don't explicitly address agent-initiated actions
  • Insurance may not cover AI agent-caused harm — haven't verified
Red flags
  • "The organization is liable" (too broad, no individual accountability)
  • "Our terms of service cover this" (but don't explicitly address agent interpretation)
  • "Standard insurance covers it" (may not cover AI agent-specific risks)
  • "We'll let legal sort it out if something happens" (reactive, expensive)
Gap: You have liability exposure without a defensible assignment model. When harm occurs, multiple parties may face legal risk: the organization (regulatory liability, civil suits), executives (fiduciary duty, negligence), developers (professional liability if design was flawed), operators (negligence in deployment and oversight), and users (if terms incorrectly assign liability to them). This gap should be closed before production deployment, not after an incident.
Q4.3 Does your monitoring detect agent-caused harm at agent speed, not human speed?

Agents operate at machine speed. Without real-time monitoring, problems compound. Agent executes 1,000 incorrect actions in 10 minutes. Detection happens hours or days later. Reversal is complex or impossible. Harm has already occurred. At human speed, mistakes are caught and corrected incrementally. At agent speed, mistakes scale before detection. Your monitoring needs to operate at agent speed, not human speed. "We monitor transaction volumes" is not monitoring correctness or appropriateness. "Users report problems" means users may not detect subtle issues. "We review in weekly audits" is insufficient when the agent operates at 10,000 actions per hour. Organizations implement monitoring designed for human-speed systems, which is insufficient for agent-speed execution.
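A sketch of agent-speed detection — a simple circuit breaker with hypothetical thresholds that pauses execution when anomalies cluster, instead of waiting for periodic review:

    import time
    from collections import deque

    class CircuitBreaker:
        """Pauses the agent automatically when anomalies cluster, rather than
        waiting for a weekly review to notice."""
        def __init__(self, max_anomalies=5, window_seconds=60):
            self.events = deque()
            self.max_anomalies = max_anomalies
            self.window = window_seconds
            self.paused = False

        def record(self, is_anomalous):
            now = time.time()
            if is_anomalous:
                self.events.append(now)
            while self.events and now - self.events[0] > self.window:
                self.events.popleft()
            if len(self.events) >= self.max_anomalies:
                self.paused = True     # stop executing; page a human
            return self.paused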

Answer "Controlled" if
  • Real-time monitoring exists for critical actions
  • Automated alerts trigger for anomalies or policy violations
  • Detection operates at agent speed (seconds to minutes, not hours to days)
  • Ability to pause or stop agent when problems are detected
  • Clear escalation process with defined SLAs
  • Monitoring and response processes are tested regularly
Answer "Gap identified" if
  • No real-time monitoring — only periodic review
  • Detection depends on users reporting problems
  • Monitoring tracks volume or performance, not correctness or appropriateness
  • Response time measured in days when agent operates in seconds
Red flags
  • "We'll see problems in system logs" (but logs aren't actively monitored)
  • "Error rates are low so monitoring isn't critical" (until error rate spikes)
  • "We monitor and catch errors quickly" (measured in days, agent executes in seconds)
  • "We'll add monitoring if we see problems" (reactive, not preventive)
Gap: Your detection capability is too slow for agent speed. By the time you detect a problem, the agent may have executed thousands of incorrect actions, caused compounding harm across multiple systems and customers, and created consequences that are expensive or impossible to reverse. You're operating blind: the agent is executing, but you can't see whether those executions are appropriate until well after consequences have materialized.
Gaps requiring attention
Prioritization framework
Address first
  • Context source mapping — Q1.1 (entry point for all other failures)
  • Context control — Q1.2 ("who controls" questions are highest priority)
  • Authentication at interpretation boundary — Q1.4
  • Liability model — Q2.3 and Q4.2 (legal exposure without defensible model)
  • Audit trail completeness — Q3.2 (regulatory review requires reconstruction of reasoning)
Within 30 days
  • Explicit authorization layer — Q2.2
  • Escalation trigger enforcement — Q2.4
  • Execution controls in code — Q3.1
  • Reversal capability mapping — Q3.3
  • Real-time monitoring at agent speed — Q4.3
Document and monitor
  • Conflict resolution rules — Q1.3 (if agent operates in low-ambiguity environments)
  • Tool permission scoping — Q2.1 (if permissions are currently narrow)
  • Optimization resistance — Q3.4 (if agent has limited task autonomy)
  • Consequence mapping — Q4.1 (if action types are limited and well-understood)
Next steps

Close these gaps. The Authentication Implementation Playbook covers step-by-step closure methodology for each control surface this audit maps: trust boundary architecture, authorization layer design, audit trail requirements that satisfy regulatory review, and escalation enforcement patterns that hold at machine speed.

Join the waitlist for implementation access →