Authentication, authorization, rate limiting, input validation, output sanitization, audit logging, and sandboxing: seven layers of defense for AI tool use.
The Prompt Engineering Project February 25, 2025 12 min read
Quick Answer
AI tool security involves protecting systems where LLMs invoke external tools, APIs, or code execution. Key defenses include strict input validation, least-privilege permissions, sandboxed execution environments, output filtering, and audit logging. Without these safeguards, prompt injection attacks can escalate tool access into data exfiltration or unauthorized actions.
The moment you give an AI agent the ability to call tools, you have moved from a system that generates text to a system that takes actions. It can read databases, write files, send emails, execute code, and interact with external services. Every one of those capabilities is a potential attack surface. Every tool invocation is a trust decision. And the default security posture of most AI agent frameworks is approximately zero.
This is not a theoretical concern. Production AI agents today have access to customer databases, internal APIs, cloud infrastructure, and communication channels. A prompt injection that convinces an agent to call a tool with attacker-controlled parameters is not a hypothetical -- it is an incident report waiting to be written. Securing tool use requires the same defense-in-depth thinking you apply to any system that executes untrusted input. Here are the seven layers that matter.
Layer 1: Authentication
Authentication answers the question: who is making this request? In a tool-use system, the answer is never simply "the AI agent." The agent is acting on behalf of a user, a service account, or a scheduled job. The identity that matters is the upstream principal -- the human or system that initiated the conversation that led to the tool call.
Every tool invocation must carry the identity of the originating principal, not just the agent's service credentials. This means propagating authentication context through the entire call chain. When an agent calls a database tool, the database tool should know which user's session triggered the query, not just that "the agent service" is asking.
auth-middleware.ts

```typescript
interface ToolContext {
  userId: string
  sessionId: string
  roles: string[]
  tokenExpiry: number
  sourceIp: string
  verified?: boolean
  verifiedAt?: number
}

type ToolHandler = (
  params: unknown,
  context: ToolContext
) => Promise<unknown>

// AuthError and authProvider are assumed to be defined elsewhere
function authMiddleware(handler: ToolHandler): ToolHandler {
  return async (params, context: ToolContext) => {
    // Verify the session token has not expired
    if (Date.now() > context.tokenExpiry) {
      throw new AuthError('Session expired', 401)
    }
    // Verify the user identity against the auth provider
    const verified = await authProvider.verify(
      context.userId,
      context.sessionId
    )
    if (!verified) {
      throw new AuthError('Invalid session', 401)
    }
    // Attach verified identity to the tool execution
    return handler(params, {
      ...context,
      verified: true,
      verifiedAt: Date.now(),
    })
  }
}
```
Three authentication patterns work in practice. API keys are the simplest: each user or service has a unique key that is passed with every tool invocation. They are easy to implement but hard to rotate and impossible to scope. OAuth tokens are better for user-facing systems: they carry scopes, expire automatically, and can be revoked without changing credentials. Session-based authentication works when the agent operates within a web application context where a session cookie or JWT already establishes identity.
The critical rule is that agent credentials and user credentials are never the same thing. The agent may have its own service account for accessing tool infrastructure, but that service account should never be used to determine what the user is allowed to do. Conflating agent identity with user identity is the most common authentication mistake in AI systems.
The agent is not the principal. It is a proxy. Every tool call must carry the identity of the human or system that initiated the request.
Layer 2: Authorization
Authentication tells you who is asking. Authorization tells you what they are allowed to do. In a tool-use system, authorization operates at three levels: which tools can this user access, what parameters can they pass, and which resources can those parameters reference?
Tool-level permissions are the coarsest control. An admin user might have access to the database_query tool while a regular user does not. This is necessary but insufficient. A user with access to the database query tool should not automatically be able to query every table. Resource scoping limits what a tool can touch based on who is asking.
Default-deny is not optional. If a tool does not have an explicit permission entry for the requesting user's role, the call must fail. Never fall through to "allow" as a default.
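A default-deny policy with resource scoping can be sketched as a permission table consulted before every call. The names here (`ToolPolicy`, `toolPermissions`, `authorize`) are illustrative, not from any particular framework; the important property is that every lookup miss returns false.

```typescript
// Default-deny, role-based tool authorization sketch.
// Every path that finds no explicit permission entry denies the call.
interface ToolPolicy {
  allowedRoles: string[]
  // Optional resource scoping: which resources each role may touch
  allowedResources?: Record<string, string[]>
}

const toolPermissions: Record<string, ToolPolicy> = {
  database_query: {
    allowedRoles: ['admin', 'analyst'],
    allowedResources: {
      analyst: ['orders', 'products'],
      admin: ['users', 'orders', 'products', 'analytics'],
    },
  },
}

function authorize(
  toolId: string,
  roles: string[],
  resource?: string
): boolean {
  const policy = toolPermissions[toolId]
  if (!policy) return false // default-deny: unknown tool
  const role = roles.find((r) => policy.allowedRoles.includes(r))
  if (!role) return false // default-deny: no matching role
  if (resource && policy.allowedResources) {
    const scoped = policy.allowedResources[role] ?? []
    return scoped.includes(resource) // default-deny: unscoped resource
  }
  return true
}
```

Note that the three levels from the paragraph above map directly onto the three checks: tool existence, role membership, and resource scope.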
Layer 3: Rate Limiting
AI agents are capable of generating tool calls at machine speed. Without rate limiting, a confused or compromised agent can exhaust API quotas, overwhelm databases, or rack up unbounded costs in seconds. Rate limiting is not just about protecting external services -- it is about protecting your own infrastructure from your own agents.
Effective rate limiting for AI tool use requires two dimensions: per-tool limits and per-session limits. Per-tool limits prevent any single tool from being called too frequently, regardless of which agent or user is making the request. Per-session limits prevent any single conversation from consuming a disproportionate share of resources.
A particularly dangerous scenario is the agent loop: the agent calls a tool, receives an error, and retries immediately in an infinite loop. Each retry consumes tokens for the tool call and tokens for the model's reasoning about why it should retry. Without per-session rate limiting, a single stuck conversation can generate hundreds of tool calls in minutes. Set a hard ceiling on tool calls per conversation and force the agent to stop when it hits that ceiling.
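Both dimensions, plus the hard per-session ceiling, fit in a small in-memory limiter. This is a sketch under the assumption of a single process; a production system would back the counters with Redis or similar shared storage. The class and config names are illustrative.

```typescript
// Minimal rate limiter: per-tool sliding one-minute window plus a hard
// per-session ceiling that stops a stuck agent loop.
interface LimiterConfig {
  perToolPerMinute: number
  maxCallsPerSession: number
}

class ToolRateLimiter {
  private toolCalls = new Map<string, number[]>() // toolId -> call timestamps
  private sessionCounts = new Map<string, number>() // sessionId -> total calls

  constructor(private config: LimiterConfig) {}

  check(toolId: string, sessionId: string, now = Date.now()): void {
    // Per-session hard ceiling: the agent is forced to stop here
    const total = this.sessionCounts.get(sessionId) ?? 0
    if (total >= this.config.maxCallsPerSession) {
      throw new Error(`Session ${sessionId} exceeded its tool-call ceiling`)
    }
    // Per-tool sliding window: drop timestamps older than one minute
    const windowStart = now - 60_000
    const recent = (this.toolCalls.get(toolId) ?? []).filter(
      (t) => t > windowStart
    )
    if (recent.length >= this.config.perToolPerMinute) {
      throw new Error(`Tool ${toolId} rate limit exceeded`)
    }
    recent.push(now)
    this.toolCalls.set(toolId, recent)
    this.sessionCounts.set(sessionId, total + 1)
  }
}
```

Calling `check` before every tool invocation makes the limit a throw, not a silent drop, so the agent receives an explicit error it can surface to the user.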
Layer 4: Input Validation
This is where most AI tool-use systems are weakest. The model generates parameters for a tool call, and the system passes those parameters directly to the tool without validation. This is the equivalent of passing user input to a SQL query without sanitization. The model is an untrusted input source. Its parameters must be validated with the same rigor you apply to any external input.
Schema validation is the first line of defense. Every tool should have a strict JSON Schema that defines the exact types, ranges, formats, and constraints for every parameter. If the model produces a parameter that does not match the schema, the call fails before it reaches the tool implementation.
input-validation.ts

```typescript
import { z } from 'zod'

const DatabaseQuerySchema = z.object({
  query: z
    .string()
    .max(2000)
    .refine(
      (q) => !q.toLowerCase().includes('drop'),
      'DROP statements are not allowed'
    )
    .refine(
      (q) => !q.toLowerCase().includes('delete'),
      'DELETE statements are not allowed'
    ),
  table: z.enum(['users', 'orders', 'products', 'analytics']),
  limit: z.number().int().min(1).max(1000).default(100),
  offset: z.number().int().min(0).default(0),
})

const FileWriteSchema = z.object({
  path: z
    .string()
    .refine(
      (p) => !p.includes('..'),
      'Path traversal is not allowed'
    )
    .refine(
      (p) => p.startsWith('/workspace/'),
      'Writes must be within /workspace/'
    ),
  content: z.string().max(100_000),
  encoding: z.enum(['utf-8', 'base64']).default('utf-8'),
})

// ValidationError is assumed to be defined elsewhere
function validateToolInput<T>(
  schema: z.ZodSchema<T>,
  input: unknown
): T {
  const result = schema.safeParse(input)
  if (!result.success) {
    throw new ValidationError(
      'Invalid tool parameters',
      result.error.issues
    )
  }
  return result.data
}
```
Beyond schema validation, certain tools require semantic validation. A file write tool should check that the path does not traverse outside its sandbox. A database query tool should check that the query does not contain destructive operations. An email tool should check that the recipient is within the allowed domain. These checks cannot be expressed in a JSON Schema alone -- they require custom validation logic that understands what the tool does and what the risks are.
The model is an untrusted input source. Its parameters must be validated with the same rigor you apply to any user-submitted form data.
Layer 5: Output Sanitization
Tool outputs flow back to the model as context for its next response. If a tool returns sensitive data -- API keys, passwords, personal information, internal system details -- that data becomes part of the model's context and may be included in the response to the user. Output sanitization ensures that tool results are filtered before they reach the model.
The most common failure here is database tools that return full rows including columns the user should never see. A customer lookup tool that returns the full customer record -- including internal notes, credit card tokens, and support flags -- is a data leak waiting to happen. The tool should return only the fields that the requesting user is authorized to see.
output-sanitizer.ts

```typescript
interface SanitizationRule {
  toolId: string
  redactFields: string[]
  maskPatterns: { pattern: RegExp; replacement: string }[]
  maxOutputLength: number
}

const rules: SanitizationRule[] = [
  {
    toolId: 'customer_lookup',
    redactFields: [
      'ssn',
      'credit_card',
      'internal_notes',
      'password_hash',
    ],
    maskPatterns: [
      {
        pattern: /\b\d{3}-\d{2}-\d{4}\b/g,
        replacement: '***-**-****',
      },
      {
        pattern: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g,
        replacement: '[email redacted]',
      },
    ],
    maxOutputLength: 5000,
  },
]

function sanitizeOutput(
  toolId: string,
  output: any,
  userRoles: string[]
): any {
  const rule = rules.find((r) => r.toolId === toolId)
  if (!rule) return output
  // In a fuller implementation, userRoles would select role-specific rules
  const sanitized = structuredClone(output)
  // Remove redacted fields
  if (typeof sanitized === 'object' && sanitized !== null) {
    for (const field of rule.redactFields) {
      delete sanitized[field]
    }
  }
  // Apply pattern masking to the serialized output
  let text = JSON.stringify(sanitized)
  for (const { pattern, replacement } of rule.maskPatterns) {
    text = text.replace(pattern, replacement)
  }
  // Enforce the maximum output length. Slicing a JSON string mid-document
  // would make it unparseable, so return a truncation notice instead
  if (text.length > rule.maxOutputLength) {
    return {
      truncated: true,
      preview: text.slice(0, rule.maxOutputLength),
    }
  }
  return JSON.parse(text)
}
```
Sanitization rules should be defined per-tool and per-role. An admin may see fields that a regular user cannot. The sanitizer must know who is asking, not just which tool is responding.
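One way to make the sanitizer role-aware is to key the redaction list on the caller's role, falling back to the most restrictive list when no role matches. The role names and field lists below are illustrative only.

```typescript
// Hypothetical role-keyed redaction lists: admins keep internal_notes,
// everyone loses secrets. Field names are examples, not a real schema.
const redactionByRole: Record<string, string[]> = {
  admin: ['password_hash', 'credit_card'],
  user: ['password_hash', 'credit_card', 'internal_notes', 'ssn'],
}

function fieldsToRedact(roles: string[]): string[] {
  // Default-deny applies to data too: with no recognized role,
  // use the most restrictive list
  if (roles.includes('admin')) return redactionByRole.admin
  return redactionByRole.user
}
```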
Layer 6: Audit Logging
Every tool invocation must be logged. This is not optional, and it is not sufficient to log only errors. You need a complete record of every tool call: who triggered it, which tool was called, what parameters were passed, what the tool returned, and how long it took. This log is your compliance trail, your debugging tool, and your anomaly detection input.
The audit log should be append-only, tamper-evident, and stored separately from the tool execution environment. If an attacker compromises the agent system, they should not be able to alter the audit log to cover their tracks. In regulated industries, this log may also need to satisfy specific retention and access requirements.
A subtle but important detail: the audit log must redact sensitive parameters before writing them. If a user passes a password to an authentication tool, you do not want that password sitting in your audit log in plaintext. Apply the same redaction logic to log entries that you apply to tool outputs. The goal is a complete record of what happened without creating a new repository of sensitive data.
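The shape of a redacting log entry builder might look like the sketch below. The entry fields and the `SENSITIVE_PARAMS` list are assumptions; the append-only sink is left abstract because it depends on your storage.

```typescript
// Audit entry construction with parameter redaction applied before the
// entry ever reaches the log sink.
interface AuditEntry {
  timestamp: string
  userId: string
  sessionId: string
  toolId: string
  params: Record<string, unknown>
  outcome: 'success' | 'error'
  durationMs: number
}

// Parameter-name fragments that must never be logged in plaintext
const SENSITIVE_PARAMS = ['password', 'token', 'api_key', 'secret']

function redactParams(
  params: Record<string, unknown>
): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(params)) {
    out[key] = SENSITIVE_PARAMS.some((s) => key.toLowerCase().includes(s))
      ? '[redacted]'
      : value
  }
  return out
}

function buildAuditEntry(
  userId: string,
  sessionId: string,
  toolId: string,
  params: Record<string, unknown>,
  outcome: 'success' | 'error',
  durationMs: number
): AuditEntry {
  return {
    timestamp: new Date().toISOString(),
    userId,
    sessionId,
    toolId,
    params: redactParams(params), // never write raw secrets to the log
    outcome,
    durationMs,
  }
}
```

Redacting at construction time, rather than at write time, means no code path can accidentally flush an unredacted entry.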
Layer 7: Sandboxing
The final layer is execution isolation. When an AI agent calls a tool that executes code, writes files, or interacts with system resources, that execution must happen in an environment where the blast radius of a failure or exploit is contained. A tool that executes user-provided code should not have access to the host filesystem. A tool that makes HTTP requests should not be able to reach internal services on the private network.
Sandboxing strategies range from lightweight to heavy. Process-level isolation uses separate OS processes with restricted permissions. Container-level isolation uses Docker or similar runtimes to create ephemeral environments with constrained filesystem, network, and resource access. VM-level isolation uses lightweight virtual machines like Firecracker for maximum separation at the cost of startup latency.
1. Filesystem isolation. Mount only the directories the tool needs, read-only where possible. Use tmpfs for scratch space that is automatically cleaned up.
2. Network isolation. Whitelist only the external endpoints the tool needs to reach. Block access to internal services, metadata endpoints, and the local network.
3. Resource limits. Set hard caps on CPU time, memory, and disk usage. A runaway tool should be killed by the OS, not by your application code.
4. Time limits. Every tool execution must have a timeout. If a tool has not completed within its timeout, kill the process and return an error to the agent.
5. Ephemeral environments. Create a fresh sandbox for each tool invocation. Never reuse execution environments across different users or sessions. State leakage between invocations is a security vulnerability.
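The timeout, output-cap, and minimal-environment pieces of this checklist can be sketched with Node's built-in child_process. Real isolation comes from containers or microVMs; the paths and limits below are illustrative assumptions.

```typescript
// Time- and resource-limited execution sketch. This shows process-level
// controls only; filesystem and network isolation need an outer sandbox.
import { execFileSync } from 'node:child_process'

function runSandboxed(
  command: string,
  args: string[],
  timeoutMs: number
): string {
  return execFileSync(command, args, {
    timeout: timeoutMs,             // kill the process when the timeout fires
    killSignal: 'SIGKILL',          // a signal the child cannot catch or ignore
    env: { PATH: '/usr/bin:/bin' }, // minimal environment: no inherited secrets
    cwd: '/tmp',                    // scratch directory, not the app root
    maxBuffer: 1024 * 1024,         // cap captured output at 1 MB
    encoding: 'utf-8',              // return stdout as a string
  })
}
```

`execFileSync` throws on timeout or a non-zero exit, which maps cleanly onto "kill the process and return an error to the agent."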
Defense in depth means that no single layer is responsible for security. When one layer fails -- and eventually, one will -- the remaining layers contain the damage.
Putting It Together
These seven layers form a defense-in-depth strategy for AI tool use. No single layer is sufficient on its own. Authentication without authorization is useless. Authorization without input validation is bypassable. Input validation without output sanitization leaks data. Audit logging without sandboxing means you can see the damage but not prevent it.
The practical approach is to implement these as middleware that wraps every tool handler. Each tool call passes through the full stack: authenticate, authorize, rate limit, validate inputs, execute in a sandbox, sanitize outputs, and log everything. The tool implementation itself should be concerned only with its core functionality. Security is an infrastructure concern, not a tool concern.
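The wrapping pattern can be sketched as middleware composition. The layer bodies here are reduced to named stand-ins, and real handlers would be async; the piece that matters is `compose`, which nests the layers so every call passes through each one before the handler runs.

```typescript
// Middleware composition sketch for a tool security stack.
type ToolCtx = { userId: string; sessionId: string }
type Handler = (params: unknown, ctx: ToolCtx) => unknown
type Middleware = (next: Handler) => Handler

function compose(layers: Middleware[], handler: Handler): Handler {
  // reduceRight nests the layers so the first one listed runs first
  return layers.reduceRight((next, layer) => layer(next), handler)
}

// Illustrative layer factory: a real layer would authenticate, authorize,
// rate limit, validate, or sanitize, throwing to block the call.
const namedLayer =
  (name: string, trace: string[]): Middleware =>
  (next) =>
  (params, ctx) => {
    trace.push(name) // record that this layer ran, in order
    return next(params, ctx)
  }
```

With this shape, the tool implementation at the center stays free of security code, and adding a layer is a one-line change to the composed list.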
Start with authentication, authorization, and input validation -- these three layers prevent the most common and most dangerous failures. Add rate limiting once you have agents in production and can observe their calling patterns. Add output sanitization when your tools touch sensitive data. Add sandboxing when your tools execute code or interact with system resources. Add audit logging from day one -- it costs almost nothing and saves everything when you need to investigate an incident.
Key Takeaways
1. AI agents with tool access are action-taking systems, not text generators. Every tool call is a trust decision that requires the same security rigor as any API endpoint.
2. The agent is a proxy, not a principal. Propagate the originating user identity through the entire tool call chain and authorize based on the human, not the agent.
3. Rate limit at multiple dimensions: per-tool, per-session, per-user, and global. Without limits, a confused agent can exhaust resources in seconds.
4. Validate every parameter the model generates with the same discipline you apply to user-submitted form data. The model is an untrusted input source.
5. Sanitize tool outputs before they re-enter the model context. Sensitive data in tool results becomes sensitive data in user-facing responses.
6. Log every tool invocation with full context: who, what, when, and with what parameters. The audit trail is your only debugging tool after an incident.
7. Sandbox tool execution environments. Contain the blast radius. No tool should have access to resources beyond what it strictly needs.