
đŸ—ī¸ Secure Agent Architecture

Build AI agent systems that are secure by design, with defense in depth.

The Golden Rule

Local. Decentralized. Self-controlled.

Every external agent – regardless of provider, regardless of GitHub stars – is a potential risk. Not because providers are malicious, but because any system that browses the web and interprets instructions is inherently attackable.

🎯 The Trust Model

HIGH TRUST
Direct user input, your system prompt, verified local files
MEDIUM TRUST
Known API responses, trusted internal docs, signed configurations
LOW TRUST
External websites, user-provided URLs, email content
ZERO TRUST
Unknown repos, agent-to-agent messages, encoded content
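
One way to make these tiers operational is to tag every piece of content with its source and look up a trust level before the agent acts on it. The sketch below is illustrative only: the `Trust` enum, the `SOURCE_TRUST` table, the source labels, and the `may_issue_instructions` policy are hypothetical names chosen for this example, and the mapping simply mirrors the tiers listed above.

```python
from enum import IntEnum


class Trust(IntEnum):
    """Trust tiers from the model above; a higher value means more trusted."""
    ZERO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical source labels mapped to the tiers listed above.
SOURCE_TRUST = {
    "user_input": Trust.HIGH,
    "system_prompt": Trust.HIGH,
    "verified_local_file": Trust.HIGH,
    "known_api_response": Trust.MEDIUM,
    "internal_doc": Trust.MEDIUM,
    "signed_config": Trust.MEDIUM,
    "external_website": Trust.LOW,
    "user_provided_url": Trust.LOW,
    "email": Trust.LOW,
    "unknown_repo": Trust.ZERO,
    "agent_message": Trust.ZERO,
    "encoded_content": Trust.ZERO,
}


def trust_for(source: str) -> Trust:
    """Look up a source's tier; unrecognized sources default to zero trust."""
    return SOURCE_TRUST.get(source, Trust.ZERO)


def may_issue_instructions(source: str) -> bool:
    """Assumed policy: only high-trust sources may direct the agent.

    Everything else is treated as data to read, never as commands to follow.
    """
    return trust_for(source) >= Trust.HIGH
```

Defaulting unknown sources to `Trust.ZERO` keeps the failure mode safe: a source nobody classified is handled like the least trusted one.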

✅ Security Checklist

Execution Environment

Data Boundaries

Runtime Controls

Supply Chain

â˜ī¸ Cloud vs Local Agents

Aspect | Cloud Agent | Local Agent
Data Privacy | Data sent to external servers | Data stays on your machine
Auditability | Black box, limited visibility | Full logs and source access
Update Control | Provider controls updates | You control when to update
Availability | Depends on internet/service | Works offline
Attack Surface | Internet-facing API | Local only (if configured)
Model Quality | Often better models | Smaller models, improving

đŸ›Ąī¸ Defense in Depth

No single protection is sufficient. Layer multiple defenses:

1. Prompt Shield

System prompt instructions that tell the agent to reject commands embedded in external content.
2. Sandboxing

Run the agent in a container with minimal permissions.

docker run --read-only --network=none
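
As a rough sketch of how those flags might be assembled programmatically, the helper below builds a `docker run` argument list. The function name `sandbox_command`, the image name in the usage note, the `/work` mount point, and the resource limits are assumptions for illustration; only `--read-only` and `--network=none` come from the text above, and the remaining flags are standard `docker run` options added as extra hardening.

```python
from typing import List, Optional


def sandbox_command(image: str, agent_args: List[str],
                    workdir: Optional[str] = None) -> List[str]:
    """Build a `docker run` argument list with minimal permissions."""
    cmd = [
        "docker", "run", "--rm",
        "--read-only",                        # root filesystem is not writable
        "--network=none",                     # no network access inside the container
        "--cap-drop=ALL",                     # drop all Linux capabilities
        "--security-opt=no-new-privileges",   # block privilege escalation via setuid
        "--pids-limit=256",                   # assumed limit; tune for your workload
        "--memory=2g",                        # assumed limit; tune for your workload
        "--tmpfs", "/tmp",                    # in-memory scratch space
    ]
    if workdir is not None:
        # Mount the project read-only so the agent can read but not modify it.
        cmd += ["--volume", f"{workdir}:/work:ro", "--workdir", "/work"]
    return cmd + [image] + agent_args
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, for example with a hypothetical image such as `sandbox_command("agent-image:latest", ["run-agent"], workdir="/home/me/project")`. Passing a list rather than a shell string avoids a second layer of shell parsing.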
3. Input Sanitization

Filter content before the agent processes it.

Strip hidden text, decode base64, check URLs
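
The three steps named above could look roughly like this. It is a minimal sketch, not a complete filter: it covers only HTML comments and a few invisible Unicode characters as "hidden text", and the `ALLOWED_HOSTS` allowlist, the 24-character base64 threshold, and all function names are assumptions made for the example.

```python
import base64
import binascii
import re
from urllib.parse import urlparse

# Zero-width and bidirectional control characters often used to hide text.
HIDDEN_CHARS = re.compile(r"[\u200b-\u200f\u202a-\u202e\u2060-\u2064\ufeff]")
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# Runs of 24+ base64 characters (assumed threshold) with optional padding.
BASE64_BLOB = re.compile(r"(?<![A-Za-z0-9+/=])[A-Za-z0-9+/]{24,}={0,2}")
URL = re.compile(r"https?://[^\s<>\"')]+")

# Hypothetical allowlist; replace with the hosts your agent actually needs.
ALLOWED_HOSTS = {"docs.python.org", "github.com"}


def strip_hidden(text: str) -> str:
    """Remove HTML comments and invisible Unicode characters."""
    text = HTML_COMMENT.sub("", text)
    return HIDDEN_CHARS.sub("", text)


def decode_base64_blobs(text: str) -> list:
    """Return decoded text for base64-looking runs so it can be inspected."""
    decoded = []
    for match in BASE64_BLOB.finditer(text):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            decoded.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64, or not text once decoded
    return decoded


def disallowed_urls(text: str) -> list:
    """Return URLs whose host is not on the allowlist."""
    flagged = []
    for url in URL.findall(text):
        host = urlparse(url).hostname or ""
        if host not in ALLOWED_HOSTS:
            flagged.append(url)
    return flagged


def sanitize(text: str) -> dict:
    """Run all three checks and return the cleaned text plus findings."""
    clean = strip_hidden(text)
    # Remove URLs before the base64 scan so long URL paths are not misread.
    without_urls = URL.sub(" ", clean)
    return {
        "text": clean,
        "decoded_base64": decode_base64_blobs(without_urls),
        "disallowed_urls": disallowed_urls(clean),
    }
```

Decoded base64 is returned for inspection rather than substituted back into the text, so an encoded instruction is surfaced to a reviewer or a downstream check instead of being handed to the agent in readable form.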
4. Output Monitoring

Watch for suspicious patterns in agent behavior.
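
A simple form of output monitoring is to match each action the agent proposes against known-bad patterns before it runs. The patterns below are illustrative assumptions, not a complete detection set, and `flag_action` is a hypothetical helper name.

```python
import re

# Illustrative patterns only (assumptions, not a complete detection set).
SUSPICIOUS_PATTERNS = {
    # Reading SSH keys, cloud credentials, or .env files.
    "credential_access": re.compile(r"(\.ssh/|\.aws/credentials|\.env\b|id_rsa)"),
    # Downloading a script and piping it straight into a shell.
    "download_and_execute": re.compile(r"\b(curl|wget)\b[^|;]*\|\s*(ba|z)?sh\b"),
    # Decoding base64 and piping the result into a shell.
    "encoded_execution": re.compile(r"base64\s+(-d|--decode)[^|;]*\|\s*(ba|z)?sh\b"),
    # Uploading a local file with curl.
    "file_upload": re.compile(r"\bcurl\b.*(\s-d\s*@|--data(-binary)?[ =]@|\s-F\s)"),
}


def flag_action(action: str) -> list:
    """Return the names of all patterns that match a proposed agent action."""
    return [name for name, pattern in SUSPICIOUS_PATTERNS.items()
            if pattern.search(action)]
```

For example, `flag_action("cat ~/.ssh/id_rsa")` returns `["credential_access"]`, while `flag_action("ls -la")` returns an empty list. Pattern matching of this kind is easy to evade, so it works best as one layer alongside sandboxing and approval rather than as the only check.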
5. Action Approval

Require human confirmation for dangerous operations.

Examples: git push, npm publish, file delete
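
An approval gate can be sketched as a check that runs before any command is executed. In this example, `DANGEROUS_SEQUENCES`, `needs_approval`, `gate`, and `ask_on_terminal` are hypothetical names, and mapping "file delete" to `rm` is an assumption. Token matching like this misses obfuscated forms (for instance `cd repo;git push` with no spaces around the semicolon), so a stricter design would require approval for everything not on an explicit allowlist.

```python
import shlex
from typing import Callable

# Token sequences that require human confirmation. The first two mirror the
# examples above; "rm" stands in for "file delete" (an assumption).
DANGEROUS_SEQUENCES = [
    ["git", "push"],
    ["npm", "publish"],
    ["rm"],
]


def _contains(tokens: list, seq: list) -> bool:
    """True if `seq` appears as a contiguous run anywhere in `tokens`."""
    n = len(seq)
    return any(tokens[i:i + n] == seq for i in range(len(tokens) - n + 1))


def needs_approval(command: str) -> bool:
    """Return True if the command contains a dangerous token sequence."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable input is treated as dangerous
    return any(_contains(tokens, seq) for seq in DANGEROUS_SEQUENCES)


def gate(command: str, confirm: Callable[[str], bool]) -> bool:
    """Return True if the command may run; ask `confirm` when it is dangerous."""
    if not needs_approval(command):
        return True
    return confirm(command)


def ask_on_terminal(command: str) -> bool:
    """Interactive confirmer: the operator must type 'yes' to allow."""
    answer = input(f"Agent wants to run: {command!r}. Type 'yes' to allow: ")
    return answer.strip().lower() == "yes"
```

A caller would run a command only when `gate(command, ask_on_terminal)` returns `True`. Taking the confirmer as a parameter keeps the gate testable and lets the same check sit behind a terminal prompt, a chat message, or a ticketing workflow.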

⚡ Quick Security Wins

If you can only do five things:

  1. Apply the Prompt Shield: 30 seconds, immediate protection.
  2. Run in Docker with --network=none for sensitive work: eliminates network-based data exfiltration.
  3. Never auto-approve shell commands: review every command before execution.
  4. Check CLAUDE.md before opening unknown repos: prevents config-based attacks.
  5. Use separate agent instances for sensitive vs browsing tasks: contains potential compromises.
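
The CLAUDE.md check in item 4 can be partly automated. The sketch below lists agent instruction files in a repository and prints nothing itself; it returns the lines that contain phrases worth a manual look. `CLAUDE.md` comes from the list above, while the other file names, the `RED_FLAGS` phrases, and the function names are assumptions for illustration. It is a triage aid, not a replacement for reading the file.

```python
from pathlib import Path

# CLAUDE.md comes from the list above; the other names are assumptions about
# instruction files that some agent tools read automatically.
AGENT_CONFIG_NAMES = {"CLAUDE.md", "AGENTS.md", ".cursorrules"}

# Illustrative phrases worth a manual look; not a complete list.
RED_FLAGS = ("ignore previous", "do not tell the user", "curl ", "base64", "| sh")


def find_agent_configs(repo: Path) -> list:
    """Return agent instruction files anywhere under the repository."""
    return sorted(p for p in repo.rglob("*")
                  if p.is_file() and p.name in AGENT_CONFIG_NAMES)


def review(repo: Path) -> dict:
    """Map each config file to the lines that contain a red-flag phrase."""
    report = {}
    for path in find_agent_configs(repo):
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        hits = [line.strip() for line in lines
                if any(flag in line.lower() for flag in RED_FLAGS)]
        report[str(path.relative_to(repo))] = hits
    return report
```

Running `review(Path("path/to/cloned/repo"))` from a plain terminal, before the repository is opened in any agent, shows which instruction files exist and which lines deserve a closer read.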

📚 Further Reading