Security

OpenPawz is designed with defense-in-depth: multiple layers protect against prompt injection, data exfiltration, and unauthorized actions.

Human-in-the-Loop (HIL)

Every tool is classified by risk level. High-risk tools require explicit human approval before execution.

Auto-approved tools (no approval needed)

fetch · read_file · list_directory · web_search · web_read · memory_search · memory_store · soul_read · soul_write · soul_list · self_info · update_profile · create_task · list_tasks · manage_task · email_read · slack_read · telegram_read · image_generate

HIL-required tools (human must approve)

exec · write_file · delete_file · append_file · email_send · webhook_send · rest_api_call · slack_send · github_api
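
As a rough illustration, the split above can be modeled as a lookup that runs before every tool call. The tool names come from the two lists; the function name `needsApproval` and the fail-closed handling of unknown tools are assumptions made for this sketch, not documented OpenPawz behavior.

```typescript
// Tool names are taken from the two lists above; everything else is illustrative.
const AUTO_APPROVED: string[] = [
  "fetch", "read_file", "list_directory", "web_search", "web_read",
  "memory_search", "memory_store", "soul_read", "soul_write", "soul_list",
  "self_info", "update_profile", "create_task", "list_tasks", "manage_task",
  "email_read", "slack_read", "telegram_read", "image_generate",
];

const HIL_REQUIRED: string[] = [
  "exec", "write_file", "delete_file", "append_file", "email_send",
  "webhook_send", "rest_api_call", "slack_send", "github_api",
];

// Returns true when a human must approve the call.
// Assumption: a tool found in neither list fails closed and asks for approval.
function needsApproval(tool: string): boolean {
  if (HIL_REQUIRED.indexOf(tool) !== -1) return true;
  return AUTO_APPROVED.indexOf(tool) === -1;
}
```

For example, `needsApproval("exec")` is `true`, while `needsApproval("read_file")` is `false`.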

Agent policies

Per-agent tool access control with four presets:

Preset	Mode	Description
Unrestricted	unrestricted	All tools, no approval
Standard	denylist	High-risk tools require approval
Read-only	allowlist	Only safe read tools
Sandbox	allowlist	web_search, web_read, memory_search, soul_read only

You can also create custom policies with specific tool allowlists/denylists.
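
One way to picture how the three modes differ is a small decision function. This is a sketch under stated assumptions: the `AgentPolicy` shape, the `decide` function, and the three-way result are hypothetical, and the Standard denylist is assumed to be the HIL-required tool list from earlier on this page.

```typescript
type PolicyMode = "unrestricted" | "allowlist" | "denylist";
type Decision = "allow" | "ask" | "deny";

// Hypothetical policy shape: a mode plus the tools the mode refers to.
interface AgentPolicy {
  mode: PolicyMode;
  tools: string[]; // allowlist or denylist entries; ignored when unrestricted
}

// Sandbox preset as listed in the table above.
const SANDBOX: AgentPolicy = {
  mode: "allowlist",
  tools: ["web_search", "web_read", "memory_search", "soul_read"],
};

// Assumption: the Standard preset denylists the HIL-required tools.
const STANDARD: AgentPolicy = {
  mode: "denylist",
  tools: [
    "exec", "write_file", "delete_file", "append_file", "email_send",
    "webhook_send", "rest_api_call", "slack_send", "github_api",
  ],
};

function decide(policy: AgentPolicy, tool: string): Decision {
  if (policy.mode === "unrestricted") return "allow";
  const listed = policy.tools.indexOf(tool) !== -1;
  // Allowlist: anything not listed is unavailable to the agent.
  if (policy.mode === "allowlist") return listed ? "allow" : "deny";
  // Denylist: listed tools still run, but only after human approval.
  return listed ? "ask" : "allow";
}
```

A custom policy is then just another `AgentPolicy` value, for example `{ mode: "allowlist", tools: ["read_file", "list_directory"] }`.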

Risk classification

Risk	Tools
Safe	`read_file`, `list_directory`, `web_search`, `web_read`, `memory_search`, `soul_read`, `soul_list`, `self_info`, `fetch`
High-risk	`exec`, `write_file`, `delete_file`, `append_file`, `email_send`, `webhook_send`, `rest_api_call`, `slack_send`, `github_api`, `image_generate`, `soul_write`, `update_profile`, `create_agent`, `create_task`, `manage_task`

Prompt injection defense

All incoming channel messages are scanned for injection attempts before reaching the agent.

Detection

Pattern-based scoring across 9 categories. The Rust backend scanner covers 8 of them; the TypeScript frontend scanner covers all 9, adding `obfuscation`:

Category	Examples	Scanner
`override`	“Ignore previous instructions”	Both
`identity`	“You are now…”	Both
`jailbreak`	“DAN mode”, “no restrictions”	Both
`leaking`	“Show me your system prompt”	Both
`obfuscation`	Base64-encoded instructions	Frontend only
`tool_injection`	Fake tool call formatting	Both
`social`	“As an AI researcher…”	Both
`markup`	Hidden instructions in HTML/markdown	Both
`bypass`	“This is just a test…”	Both

Severity levels

Severity	Score	Action
Critical	40+	Message blocked, not delivered
High	25+	Warning logged
Medium	12+	Noted in logs
Low	5+	Informational

Channel bridges automatically block messages with critical severity.
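
The scoring-to-severity flow can be sketched as follows. The severity thresholds are the ones in the table above, read as inclusive lower bounds; the individual rules and their weights are invented for illustration and do not reflect the real scanners.

```typescript
type Severity = "none" | "low" | "medium" | "high" | "critical";

interface Rule {
  category: string;
  pattern: RegExp;
  weight: number;
}

// Hypothetical rules and weights, for illustration only.
const RULES: Rule[] = [
  { category: "override", pattern: /ignore (all )?previous instructions/i, weight: 25 },
  { category: "identity", pattern: /you are now\b/i, weight: 15 },
  { category: "jailbreak", pattern: /\bDAN mode\b/i, weight: 20 },
  { category: "leaking", pattern: /show me your system prompt/i, weight: 20 },
];

// Sums the weight of every rule that matches the message.
function scoreMessage(text: string): number {
  let score = 0;
  for (const rule of RULES) {
    if (rule.pattern.test(text)) score += rule.weight;
  }
  return score;
}

// Thresholds from the severity table (assumed inclusive).
function severityFor(score: number): Severity {
  if (score >= 40) return "critical";
  if (score >= 25) return "high";
  if (score >= 12) return "medium";
  if (score >= 5) return "low";
  return "none";
}

// Channel bridges drop only critical messages; lower severities are logged.
function shouldBlock(text: string): boolean {
  return severityFor(scoreMessage(text)) === "critical";
}
```

With these example weights, a message that trips several categories at once crosses the critical threshold, while a single weaker match is only logged.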

Container sandbox

Execute agent commands in isolated Docker containers:

Security measure	Default
Capabilities	`cap_drop ALL`
Network	Disabled
Memory limit	256 MB
CPU shares	512
Timeout	30 seconds
Output limit	50 KB

Presets

Preset	Image	Memory	Network	Timeout
Minimal	alpine	128 MB	Off	15s
Development	node:20-alpine	512 MB	On	60s
Python	python:3.12-alpine	512 MB	On	60s
Restricted	alpine	64 MB	Off	10s
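
To make the settings concrete, the sketch below maps a preset onto standard `docker run` flags. It is an assumption that the engine drives Docker this way; it may use the Docker API instead. The `SandboxConfig` shape and `dockerRunArgs` are hypothetical, and the sketch assumes the default `cap_drop ALL` and CPU shares of 512 apply to every preset.

```typescript
interface SandboxConfig {
  image: string;
  memoryMb: number;
  network: boolean;
  timeoutSecs: number; // enforced by the caller; docker run has no timeout flag
}

type PresetName = "minimal" | "development" | "python" | "restricted";

// Values copied from the presets table above.
const PRESETS: Record<PresetName, SandboxConfig> = {
  minimal: { image: "alpine", memoryMb: 128, network: false, timeoutSecs: 15 },
  development: { image: "node:20-alpine", memoryMb: 512, network: true, timeoutSecs: 60 },
  python: { image: "python:3.12-alpine", memoryMb: 512, network: true, timeoutSecs: 60 },
  restricted: { image: "alpine", memoryMb: 64, network: false, timeoutSecs: 10 },
};

// Builds the argument list for `docker run`.
function dockerRunArgs(cfg: SandboxConfig, command: string[]): string[] {
  const args = [
    "run", "--rm",
    "--cap-drop", "ALL",
    "--memory", `${cfg.memoryMb}m`,
    "--cpu-shares", "512",
  ];
  if (!cfg.network) args.push("--network", "none");
  args.push(cfg.image);
  return args.concat(command);
}
```

The timeout and the output limit are not Docker flags, so a caller would have to kill the container and trim its output itself.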

Command risk assessment

Commands are scored before execution:

Low — ls, cat, echo
Medium — pip install, npm install
High — curl, wget, network commands
Critical — rm -rf /, chmod 777, dangerous patterns
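
A pattern-based scorer along these lines might look like the sketch below. The regular expressions are hypothetical and only cover the examples listed above; treating every unmatched command as low risk is a simplification made for this sketch.

```typescript
type Risk = "low" | "medium" | "high" | "critical";

// Hypothetical patterns built from the examples above.
const CRITICAL_PATTERNS: RegExp[] = [/\brm\s+-rf\s+\/(\s|$)/, /\bchmod\s+777\b/];
const HIGH_PATTERNS: RegExp[] = [/\b(curl|wget)\b/];
const MEDIUM_PATTERNS: RegExp[] = [/\b(pip|npm)\s+install\b/];

// Checks the most severe patterns first so the highest matching level wins.
function assessCommand(cmd: string): Risk {
  if (CRITICAL_PATTERNS.some((p) => p.test(cmd))) return "critical";
  if (HIGH_PATTERNS.some((p) => p.test(cmd))) return "high";
  if (MEDIUM_PATTERNS.some((p) => p.test(cmd))) return "medium";
  return "low"; // assumption: anything unmatched is treated as low risk
}
```

Checking from most to least severe matters for compound commands: `npm install && curl https://example.com` is scored as high, not medium.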

Browser network policy

Control which domains agents can access:

Default allowed: AI provider APIs, DuckDuckGo, Coinbase, localhost
Default blocked: pastebin.com, transfer.sh, file.io, 0x0.st (data exfiltration risks)

File system protection

Sensitive paths are blocked from agent access — agents cannot add these as project folders or browse into them.

Category	Blocked paths
SSH / GPG	`~/.ssh`, `~/.gnupg`
Cloud credentials	`~/.aws`, `~/.kube`
Desktop keyrings	`~/.gnome-keyring`, `~/.password-store`
Docker	`~/.docker` (includes `config.json`)
Network credentials	`~/.netrc`
System config	`/etc` (covers `/etc/shadow`, `/etc/passwd`, `/etc/sudoers`)
Root home	`/root`
System logs	`/var/log`
Virtual filesystems (Linux)	`/proc/`, `/sys/`
Device nodes	`/dev`
Windows	`C:\Windows`, `C:\Users\*\AppData` (credential store paths)
App config	`~/.openclaw` (contains tokens), `~/.config/himalaya` (email config)

Additionally, the home directory root itself (~, /home/user, /Users/user) and the filesystem root (/, C:\) are blocked as too broad.

:::tip Per-project scope guard
When a project is active, all file operations are constrained to the project root. Directory traversal sequences (../) are detected and blocked even within the allowed path.
:::
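
The checks described in this section can be approximated with plain prefix matching. The sketch below is illustrative only: it handles POSIX-style absolute paths without trailing slashes, covers a subset of the blocked paths, and uses a hypothetical home directory of `/home/user`. All function names are assumptions.

```typescript
// Hypothetical home directory; "~" is assumed to be expanded by the caller.
const HOME = "/home/user";

// A subset of the blocked paths from the table above.
const BLOCKED_PREFIXES: string[] = [
  `${HOME}/.ssh`, `${HOME}/.gnupg`, `${HOME}/.aws`, `${HOME}/.kube`,
  "/etc", "/root", "/var/log", "/proc", "/sys", "/dev",
];

// Roots that are rejected as too broad.
const TOO_BROAD: string[] = ["/", HOME];

// True when any path segment is a ".." traversal.
function hasTraversal(path: string): boolean {
  return path.split("/").indexOf("..") !== -1;
}

// True when path equals prefix or lies inside it (segment-aware).
function isUnder(path: string, prefix: string): boolean {
  return path === prefix || path.indexOf(prefix + "/") === 0;
}

function isBlocked(path: string): boolean {
  if (hasTraversal(path)) return true;
  if (TOO_BROAD.indexOf(path) !== -1) return true;
  return BLOCKED_PREFIXES.some((p) => isUnder(path, p));
}

// Per-project scope guard: the path must stay inside the project root.
function inProjectScope(path: string, projectRoot: string): boolean {
  return !hasTraversal(path) && isUnder(path, projectRoot);
}
```

Matching on `prefix + "/"` rather than on the bare prefix keeps `/etcetera` from being mistaken for a path under `/etc`.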

Credential security

Credentials are protected by two independent encryption layers:

Layer 1: Skill credential encryption (XOR)

API keys encrypted with XOR using a 32-byte random key
Encryption key stored in OS keychain (paw-skill-vault)
High-risk credentials (Coinbase, DEX) are server-side only — never injected into prompts
Credentials are decrypted only at execution time
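
For readers unfamiliar with XOR encryption, here is a minimal sketch of the operation. The source does not say how the 32-byte key is applied to longer values, so the repeating-key scheme below is an assumption, and `xorWithKey` is a hypothetical name.

```typescript
// Repeating-key XOR over bytes.
// Applying it twice with the same key restores the original input.
function xorWithKey(data: Uint8Array, key: Uint8Array): Uint8Array {
  if (key.length === 0) throw new Error("key must not be empty");
  const out = new Uint8Array(data.length);
  for (let i = 0; i < data.length; i++) {
    out[i] = data[i] ^ key[i % key.length];
  }
  return out;
}

// Demo key with fixed bytes 1..32; a real key would be random
// and loaded from the OS keychain.
const demoKey = new Uint8Array(32);
for (let i = 0; i < demoKey.length; i++) demoKey[i] = i + 1;

const secret = new Uint8Array([115, 107, 45]); // bytes of "sk-"
const encrypted = xorWithKey(secret, demoKey);
const decrypted = xorWithKey(encrypted, demoKey);
```

Because XOR is its own inverse, the same function serves for both encryption and decryption, which fits the "decrypted only at execution time" step above.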

Layer 2: Database field encryption (AES-256-GCM)

Sensitive database fields are encrypted with AES-256-GCM via the Web Crypto API before being stored in SQLite.

Property	Detail
Algorithm	AES-256-GCM (authenticated encryption)
Key size	256 bits
IV	12-byte random IV per encryption
Key source	Generated on first launch, stored in OS keychain (`paw-db-encryption`)
Storage format	`enc:<base64(IV + ciphertext)>`
Fallback	Graceful — stores plaintext if encryption is unavailable

:::info Two independent layers
The XOR layer protects skill credentials stored in the skill_credentials table. The AES-256-GCM layer protects other sensitive fields across the database. Both derive their keys from the OS keychain but use separate keychain entries.
:::
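
Because the fallback can leave plaintext in the database, a reader of these fields has to distinguish the two cases before decrypting. The sketch below shows only that format handling, not the encryption itself; `isEncryptedField` and `splitPayload` are hypothetical names, and the 16-byte tag length assumes the Web Crypto default for AES-GCM.

```typescript
const ENC_PREFIX = "enc:";
const IV_LENGTH = 12;  // 12-byte random IV, per the table above
const TAG_LENGTH = 16; // Web Crypto appends a 128-bit tag to the ciphertext by default

// True when the stored value uses the `enc:<base64(IV + ciphertext)>` format.
// Anything else is treated as a plaintext fallback value.
function isEncryptedField(stored: string): boolean {
  return stored.indexOf(ENC_PREFIX) === 0;
}

// Splits the base64-decoded payload into IV and ciphertext (tag included).
function splitPayload(payload: Uint8Array): { iv: Uint8Array; ciphertext: Uint8Array } {
  if (payload.length < IV_LENGTH + TAG_LENGTH) {
    throw new Error("payload too short to hold an IV and an auth tag");
  }
  return {
    iv: payload.slice(0, IV_LENGTH),
    ciphertext: payload.slice(IV_LENGTH),
  };
}
```

The resulting `iv` and `ciphertext` are the two inputs a caller would hand to an AES-GCM decrypt call along with the key from the keychain.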

Tool execution security

Tool execution is governed by multiple safety mechanisms in the engine’s central tool executor.

Source code introspection block

Agents cannot read engine source files — any read_file call targeting paths containing src-tauri/src/engine/, src/engine/, or files ending in .rs is rejected. This prevents agents from exfiltrating their own implementation details or discovering internal security mechanisms.
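
The rule above amounts to a simple path test. A minimal sketch, assuming case-sensitive matching on the raw path string (the function name is hypothetical):

```typescript
// Path fragments named in the rule above. The second also matches the first,
// but both are kept so the list mirrors the documentation.
const BLOCKED_FRAGMENTS: string[] = ["src-tauri/src/engine/", "src/engine/"];

// True when a read_file target should be rejected.
function isIntrospectionBlocked(path: string): boolean {
  if (/\.rs$/.test(path)) return true;
  return BLOCKED_FRAGMENTS.some((f) => path.indexOf(f) !== -1);
}
```

Note that the `.rs` rule applies everywhere, so a Rust file in an unrelated project folder is rejected as well.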

Credential write block

The write_file tool blocks content that contains credential-like patterns:

PEM private keys (-----BEGIN ... PRIVATE KEY-----)
API key secrets (api_key_secret, cdp_api_key)
Base64-encoded secrets with secret or private keywords
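
A content filter of this kind could be sketched with a few regular expressions. The exact patterns the engine uses are not documented here, so the ones below are assumptions, in particular the third, which guesses at what "Base64-encoded secrets with secret or private keywords" looks like.

```typescript
// Hypothetical patterns approximating the three categories above.
const CREDENTIAL_PATTERNS: RegExp[] = [
  // PEM private key header, with or without an algorithm name
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/,
  // Named API key secrets
  /\b(api_key_secret|cdp_api_key)\b/i,
  // Assumption: a "secret" or "private" keyword assigned a long base64-looking value
  /(secret|private)\w*["']?\s*[:=]\s*["']?[A-Za-z0-9+\/]{32,}={0,2}/i,
];

// True when write_file content should be rejected.
function containsCredential(content: string): boolean {
  return CREDENTIAL_PATTERNS.some((p) => p.test(content));
}
```

Short values such as `secret = "short"` pass this sketch, since the third pattern requires at least 32 base64 characters.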

Execution limits

Setting	Default	Description
`maxToolCallsPerTurn`	Per-agent policy	Maximum tool calls an agent can make in a single turn before being stopped
`tool_timeout_secs`	300	Seconds before a pending tool approval or execution is killed
`max_tool_rounds`	20	Maximum tool-call → result → re-prompt loops per turn
`max_concurrent_runs`	4	Maximum simultaneous agent runs

Output truncation

Tool results are capped to prevent context window overflow:

Tool	Max output	Behavior
`exec`	50,000 chars	Truncated with `[output truncated]` marker
`read_file`	32,000 chars	Truncated with total byte count
`fetch`	50,000 chars	Truncated with total byte count
Container sandbox	50,000 chars (stdout + stderr each)	Truncated with `[stdout/stderr truncated]` marker
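
A minimal truncation helper, assuming the cap is measured in JavaScript string length (UTF-16 code units) and the marker is appended on its own line. Both are assumptions; the limits are the ones from the table above.

```typescript
// Per-tool caps from the table above.
const MAX_CHARS = { exec: 50000, read_file: 32000, fetch: 50000 };

// Returns the text unchanged when it fits, otherwise the first maxChars
// characters followed by the marker on a new line.
function truncateOutput(text: string, maxChars: number, marker: string): string {
  if (text.length <= maxChars) return text;
  return text.slice(0, maxChars) + "\n" + marker;
}
```

For example, `truncateOutput(stdout, MAX_CHARS.exec, "[output truncated]")` would cap the result of an `exec` call.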

Network policy enforcement

The fetch tool enforces domain-level network policy — blocked domains are always rejected, and when an allowlist is active, only listed domains are permitted.
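
The precedence described above (blocklist first, then the allowlist if one is active) can be sketched as follows. The `NetworkPolicy` shape and the decision to match subdomains of a listed domain are assumptions.

```typescript
interface NetworkPolicy {
  blocked: string[];
  allowlist: string[] | null; // null means no allowlist is active
}

// True when host is the domain itself or one of its subdomains (assumption).
function matchesDomain(host: string, domain: string): boolean {
  return host === domain || host.slice(-(domain.length + 1)) === "." + domain;
}

function isHostAllowed(policy: NetworkPolicy, host: string): boolean {
  const h = host.toLowerCase();
  // Blocked domains are always rejected, even if they are also allowlisted.
  if (policy.blocked.some((d) => matchesDomain(h, d))) return false;
  // Without an active allowlist, everything else is permitted.
  if (policy.allowlist === null) return true;
  return policy.allowlist.some((d) => matchesDomain(h, d));
}

// Default blocked domains from the browser network policy section.
const defaultPolicy: NetworkPolicy = {
  blocked: ["pastebin.com", "transfer.sh", "file.io", "0x0.st"],
  allowlist: null,
};
```

Checking the blocklist before the allowlist means a misconfigured allowlist cannot reopen a blocked exfiltration domain.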

Exfiltration detection

Outbound network commands are audited for data exfiltration patterns:

Piping file contents to curl, wget, or nc
File upload flags (curl -T, curl --data-binary @, wget --post-file)
Redirects to /dev/tcp/
scp and rsync to remote hosts

:::warning
Exfiltration detection is pattern-based and applies to exec tool invocations. It supplements, but does not replace, the container sandbox for high-security environments.
:::
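
To give a feel for what pattern-based auditing means here, the sketch below encodes the four pattern families listed above as regular expressions. These expressions are hypothetical approximations, not the engine's actual rules, and like any pattern matcher they can be evaded or produce false positives.

```typescript
// Hypothetical patterns, one per family listed above.
const EXFIL_PATTERNS: RegExp[] = [
  /\|\s*(curl|wget|nc)\b/,               // piping content into a network tool
  /\bcurl\b.*\s(-T|--data-binary\s+@)/,  // curl file upload flags
  /\bwget\b.*--post-file/,               // wget file upload flag
  /\/dev\/tcp\//,                        // shell redirect to a TCP socket
  /\b(scp|rsync)\b.*\s(\S+@)?[\w.-]+:/,  // copy to a host:path destination
];

// True when an exec command matches any exfiltration pattern.
function looksLikeExfiltration(cmd: string): boolean {
  return EXFIL_PATTERNS.some((p) => p.test(cmd));
}
```

A local copy such as `rsync -a ./a ./b` does not match, because the last pattern requires a `host:` destination.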

Budget enforcement

Daily spending limits with progressive warnings:

Threshold	Action
50%	Warning
75%	Warning
90%	Warning
100%	Requests blocked
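
The threshold table maps directly onto a small check. In this sketch the thresholds are treated as inclusive and the daily limit is assumed to be positive; `checkBudget` and `BudgetStatus` are hypothetical names.

```typescript
interface BudgetStatus {
  blocked: boolean;
  warningAt: number | null; // highest warning threshold reached, as a fraction
}

// Warning thresholds from the table above, highest first.
const WARNING_THRESHOLDS: number[] = [0.9, 0.75, 0.5];

function checkBudget(spent: number, dailyLimit: number): BudgetStatus {
  const used = spent / dailyLimit;
  // At 100% of the daily limit, further requests are blocked.
  if (used >= 1) return { blocked: true, warningAt: null };
  for (const threshold of WARNING_THRESHOLDS) {
    if (used >= threshold) return { blocked: false, warningAt: threshold };
  }
  return { blocked: false, warningAt: null };
}
```

With a $10 daily limit, spending $8 reports the 75% warning, and spending the full $10 blocks further requests.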

Trading safety

Control	Default
Auto-approve trades	Off
Max trade size	$100
Max daily loss	$500
Transfers	Disabled
Max transfer	$0
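
As an illustration of how these defaults interact, the sketch below validates a trade and a transfer against them. The field names, the function names, and the choice to reject (rather than ask about) over-limit trades are assumptions.

```typescript
interface TradingLimits {
  autoApprove: boolean;
  maxTradeUsd: number;
  maxDailyLossUsd: number;
  transfersEnabled: boolean;
  maxTransferUsd: number;
}

// Defaults from the table above.
const DEFAULT_LIMITS: TradingLimits = {
  autoApprove: false,
  maxTradeUsd: 100,
  maxDailyLossUsd: 500,
  transfersEnabled: false,
  maxTransferUsd: 0,
};

type TradeDecision = "reject" | "needs_approval" | "allow";

function checkTrade(limits: TradingLimits, sizeUsd: number, dailyLossUsd: number): TradeDecision {
  if (sizeUsd > limits.maxTradeUsd) return "reject";
  // Assumption: once the daily loss limit is reached, no further trades go through.
  if (dailyLossUsd >= limits.maxDailyLossUsd) return "reject";
  return limits.autoApprove ? "allow" : "needs_approval";
}

function checkTransfer(limits: TradingLimits, amountUsd: number): boolean {
  return limits.transfersEnabled && amountUsd <= limits.maxTransferUsd;
}
```

Under the defaults, every in-limit trade still needs approval and every transfer is refused.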

Channel access control

Policy	Behavior
Open	Anyone can chat
Allowlist	Only approved users
Pairing	Users must pair with a code

Each user gets an isolated session — no cross-user data leakage.
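
The three policies and the per-user isolation can be pictured with the sketch below. The `ChannelState` shape, the idea of tracking paired users in a list, and the `channel:user` session key format are all assumptions made for illustration.

```typescript
type AccessPolicy = "open" | "allowlist" | "pairing";

interface ChannelState {
  policy: AccessPolicy;
  approvedUsers: string[]; // used by the allowlist policy
  pairedUsers: string[];   // users who completed pairing with a code
}

// True when the user may talk to the agent on this channel.
function canChat(state: ChannelState, userId: string): boolean {
  if (state.policy === "open") return true;
  if (state.policy === "allowlist") return state.approvedUsers.indexOf(userId) !== -1;
  return state.pairedUsers.indexOf(userId) !== -1;
}

// Hypothetical session key: one session per channel and user,
// so conversation history is never shared between users.
function sessionKey(channel: string, userId: string): string {
  return channel + ":" + userId;
}
```

Keying sessions by both channel and user means the same person on two channels also gets two separate sessions in this sketch.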

Reporting vulnerabilities

See SECURITY.md in the repository for reporting instructions.
