
SCP Security Architecture

Version: 0.1 (in progress; updated as each domain is implemented)
Owner: Ohana Consulting LLC
Audience: Internal engineering reference; basis for customer security review


Overview

SCP governs AI agent behavior in regulated environments. Its security model has seven domains, implemented in order of deployment criticality:

| # | Domain | Status | Section |
|---|---|---|---|
| 1 | Audit Log Integrity | ✅ Complete | §1 |
| 2 | Agent API Key Lifecycle | ✅ Complete | §2 |
| 3 | Secrets Management | ✅ Complete | §3 |
| 4 | Transport Security (TLS) | ✅ Complete | §4 |
| 5 | Telemetry Beacon Security | ✅ Complete | §5 |
| 6 | License Key Enforcement | ✅ Complete | §6 |
| 7 | Admin API Authentication | ✅ Complete | §7 |

1. Audit Log Integrity

Purpose

The audit log is the foundation of the SCP governance claim. Every agent context request is recorded. If the log can be altered, the compliance story collapses. The log must be tamper-evident and independently verifiable.

Threat Model

| Threat | Mitigation |
|---|---|
| Direct SQL UPDATE/DELETE against audit table | Immutability trigger raises exception on any DML modification |
| Superuser bypassing trigger via DDL | RLS with FORCE, which applies even to the table owner for DML |
| Silent record insertion/deletion between records | Hash chain: any gap or reorder breaks the chain |
| Application bug writing empty/corrupt hashes | CHECK constraints reject empty record_hash / previous_hash |

Database Design

Table: agent_context_requests

Primary audit table. Records every context request — successful or rejected.

| Column | Type | Notes |
|---|---|---|
| request_id | TEXT PK | UUID per request |
| agent_id | TEXT FK | Requesting agent |
| task_type | TEXT | Intent requested |
| request_params | JSONB | Scoping parameters |
| intent_valid | BOOLEAN | Whether intent was allowed |
| validation_errors | JSONB | Populated if intent rejected |
| context_bundle_ids | JSONB | Bundles included in response |
| context_scd_ids | JSONB | SCDs included in response |
| processing_time_ms | INTEGER | Latency |
| requested_at | TIMESTAMPTZ | Wall clock at insert |
| chain_sequence | BIGSERIAL | Strict insertion order for chain traversal |
| previous_hash | TEXT | Hash of prior record (or genesis hash) |
| record_hash | TEXT | SHA256 of this record's content + previous_hash |

Constraints:
- CHECK (previous_hash != ''): empty string not permitted
- CHECK (record_hash != ''): empty string not permitted
- Unique index on chain_sequence
- Records that predate hash chain implementation carry sentinel value LEGACY_PRE_AUDIT

Table: audit_chain_genesis

Single row per deployment. Stores the genesis hash — the previous_hash value for the first audit record inserted after deployment initialization.

| Column | Type | Notes |
|---|---|---|
| id | SERIAL PK | |
| genesis_hash | TEXT | SHA256("scp-genesis:" + deployment_id + initialized_at) |
| initialized_at | TIMESTAMPTZ | Set once at deployment init |

Database Roles

| Role | Permissions | Used by |
|---|---|---|
| scp_app | INSERT on agent_context_requests; full access to agents, agent_api_keys | Application service |
| scp_audit | SELECT on agent_context_requests, audit_chain_genesis | Audit/verify endpoint, read-only queries |

Production note: The application must connect to PostgreSQL as scp_app, not postgres. The postgres superuser role bypasses FORCE ROW LEVEL SECURITY for non-trigger operations.

Immutability Enforcement

Two independent layers:

Layer 1 — Trigger (audit_immutable)

BEFORE UPDATE OR DELETE ON agent_context_requests
 RAISE EXCEPTION 'Audit records are immutable'
Fires for all DML including from superuser sessions.

Layer 2 — Row Level Security

ALTER TABLE agent_context_requests FORCE ROW LEVEL SECURITY;
-- Policies: SELECT (all), INSERT (all)
-- No UPDATE or DELETE policy → denied
FORCE causes RLS to apply even to the table owner role. It is bypassed only by superusers and by roles with the BYPASSRLS attribute, which is not granted to scp_app.

Both layers must be independently defeated for a modification to succeed.

Hash Chain Algorithm

Every new audit record is inserted with:

record_content = chain_sequence
              || request_id
              || agent_id
              || task_type
              || str(intent_valid)
              || requested_at.isoformat()

record_hash = SHA256(record_content + previous_hash)

Where previous_hash is:
- The record_hash of the record with chain_sequence = N-1 for all N > 1
- The genesis hash (from audit_chain_genesis) for the first record
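The hashing rule above can be sketched in Python as follows. This is a minimal illustration, not the contents of shared/audit.py: the function names, UTF-8 encoding, hex digests, and the ISO-8601 rendering of initialized_at are all assumptions.

```python
import hashlib
from datetime import datetime, timezone


def sha256_hex(text: str) -> str:
    """Hex-encoded SHA256 of a UTF-8 string (encoding is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def genesis_hash(deployment_id: str, initialized_at: datetime) -> str:
    # SHA256("scp-genesis:" + deployment_id + initialized_at);
    # rendering the timestamp with isoformat() is an assumption.
    return sha256_hex("scp-genesis:" + deployment_id + initialized_at.isoformat())


def compute_record_hash(
    chain_sequence: int,
    request_id: str,
    agent_id: str,
    task_type: str,
    intent_valid: bool,
    requested_at: datetime,
    previous_hash: str,
) -> str:
    # Fields are concatenated in the documented order with no separator.
    record_content = (
        str(chain_sequence)
        + request_id
        + agent_id
        + task_type
        + str(intent_valid)
        + requested_at.isoformat()
    )
    return sha256_hex(record_content + previous_hash)
```

The first record is hashed with the genesis hash as previous_hash; each later record uses the record_hash of its predecessor, so changing any hashed field changes every hash after it.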

Race condition prevention: Inserts acquire a PostgreSQL transaction-level advisory lock (pg_advisory_xact_lock) keyed to the audit table before reading the previous hash and inserting. This serializes concurrent writes without requiring SERIALIZABLE isolation on the entire connection.

Chain Verification (GET /audit/verify)

The verify endpoint walks the entire chain and recomputes every hash:

GET /audit/verify
Authorization: requires scp_audit role or admin API key

Response:
{
  "status": "ok" | "tampered",
  "record_count": 1042,
  "legacy_record_count": 15,   // records with LEGACY_PRE_AUDIT sentinel
  "first_chained_sequence": 16,
  "last_verified_at": "2026-04-18T14:00:00Z",
  "tampered_at_sequence": null  // populated if status == "tampered"
}

Algorithm:
1. Load genesis hash from audit_chain_genesis
2. Select all records with record_hash != 'LEGACY_PRE_AUDIT' ordered by chain_sequence ASC
3. For each record: recompute record_hash, compare with stored value
4. Return first sequence number where mismatch is detected, or "ok" if chain is intact
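A rough in-memory sketch of this walk is shown below. The AuditRecord dataclass and function names are hypothetical, requested_at is assumed to be already ISO-8601 formatted, and the explicit check that each previous_hash matches its predecessor is an assumption added to reflect the threat model (gaps and reorders break the chain); the listed steps only name the recompute.

```python
import hashlib
from dataclasses import dataclass

LEGACY_SENTINEL = "LEGACY_PRE_AUDIT"


@dataclass
class AuditRecord:
    """Hypothetical in-memory view of one agent_context_requests row."""
    chain_sequence: int
    request_id: str
    agent_id: str
    task_type: str
    intent_valid: bool
    requested_at: str  # assumed to be ISO-8601 text already
    previous_hash: str
    record_hash: str


def expected_hash(rec: AuditRecord) -> str:
    content = (
        f"{rec.chain_sequence}{rec.request_id}{rec.agent_id}"
        f"{rec.task_type}{rec.intent_valid}{rec.requested_at}"
    )
    return hashlib.sha256((content + rec.previous_hash).encode("utf-8")).hexdigest()


def verify_chain(records: list, genesis: str) -> dict:
    # Step 2: skip legacy rows, walk the rest in chain order.
    chained = sorted(
        (r for r in records if r.record_hash != LEGACY_SENTINEL),
        key=lambda r: r.chain_sequence,
    )
    expected_previous = genesis  # Step 1: the chain starts at the genesis hash.
    tampered_at = None
    for rec in chained:
        # Step 3: recompute and compare; the linkage check is an assumption.
        if rec.previous_hash != expected_previous or expected_hash(rec) != rec.record_hash:
            tampered_at = rec.chain_sequence
            break
        expected_previous = rec.record_hash
    # Step 4: report the first failing sequence, or "ok".
    return {
        "status": "ok" if tampered_at is None else "tampered",
        "record_count": len(records),
        "legacy_record_count": len(records) - len(chained),
        "first_chained_sequence": chained[0].chain_sequence if chained else None,
        "tampered_at_sequence": tampered_at,
    }
```

Verification cost grows linearly with the number of chained records, since every hash is recomputed on each call.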

Files

| File | Purpose |
|---|---|
| control-plane/database/schemas/agent_registry.sql | Table definition, trigger, RLS, roles |
| control-plane/database/migrations/001_audit_integrity.sql | Migration for existing databases |
| control-plane/api_server/routers/audit.py | GET /audit/verify endpoint |
| control-plane/shared/audit.py | Hash computation, chain write, genesis init, chain verify |
| control-plane/api_server/main.py | Calls initialize_genesis() at startup |
| control-plane/context_service/orchestrator.py | Calls write_audit_record() on every context request |

2. Agent API Key Lifecycle

Purpose

Every agent authenticates with an API key. A compromised key with no revocation path is an unacceptable risk in a regulated deployment. Keys must expire, must be revocable immediately, and the authoritative status must be enforced on every request.

Threat Model

| Threat | Mitigation |
|---|---|
| Compromised key used indefinitely | Configurable expiry (default: 90 days); enforced on every auth check |
| No path to invalidate a leaked key | Explicit revocation: DELETE /api/agents/{id}/api-keys/{key_id} |
| Expired key still accepted if expiry not re-checked | Auth query filters on status = 'active'; expiry also checked in-process |
| Keys orphaned when agent is revoked | Agent revocation cascades to all active keys (status = 'revoked') |

Schema Changes

agent_api_keys additions (Migration 002):

| Column | Type | Notes |
|---|---|---|
| status | TEXT | active \| revoked \| expired; authoritative, enforced by a CHECK constraint |
| revoked_by | TEXT | Actor who performed revocation, or system:expiry-job / system:agent-revoked |

is_active is retained for backward compatibility. status is authoritative — auth queries filter on status = 'active'.

Authentication Query

Every context request passes through the orchestrator's _authenticate method, which queries:

SELECT k.key_hash, k.expires_at, k.agent_id, a.status as agent_status
FROM agent_api_keys k
JOIN agents a ON k.agent_id = a.agent_id
WHERE k.key_prefix = $1 AND k.status = 'active'

After the DB check, expires_at is also validated in-process as a defense-in-depth layer against any DB-level inconsistency.
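The in-process check could look roughly like the sketch below. The function name is hypothetical, and treating any agent_status other than 'active' as a rejection is an assumption; the source only states that agent_status is selected and that expires_at is re-validated.

```python
from datetime import datetime, timezone
from typing import Optional


def key_is_usable(
    expires_at: Optional[datetime],
    agent_status: str,
    now: Optional[datetime] = None,
) -> bool:
    """Second-layer check applied to a row the auth query already returned."""
    now = now or datetime.now(timezone.utc)
    # Assumption: only agents whose status is 'active' may authenticate.
    if agent_status != "active":
        return False
    # A NULL expires_at means the key has no expiry, matching the expiry job's
    # "expires_at IS NOT NULL AND expires_at < NOW()" condition.
    if expires_at is not None and expires_at < now:
        return False
    return True
```

This catches a key whose expires_at has passed but whose status has not yet been flipped by the hourly expiry job.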

Key Lifecycle Events

| Event | Trigger | status after | revoked_by |
|---|---|---|---|
| Issue | POST /api/agents/{id}/api-keys | active | |
| Revoke | DELETE /api/agents/{id}/api-keys/{key_id} | revoked | caller-supplied revoked_by param |
| Agent revoked | DELETE /api/agents/{id} | revoked | system:agent-revoked |
| Expiry | Hourly background job | expired | system:expiry-job |

The revoked_at timestamp and revoked_by columns on agent_api_keys are the audit record for key lifecycle events. Extension of the hash chain to cover key events is a v0.4 item.

Background Expiry Job

Started as an asyncio task at API server startup. Runs every 3600 seconds:

UPDATE agent_api_keys
SET is_active = FALSE, status = 'expired', revoked_by = 'system:expiry-job'
WHERE status = 'active'
  AND expires_at IS NOT NULL
  AND expires_at < NOW()

Logs count of keys expired per run.
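One way to structure such a job is sketched below. The names run_expiry_job and expire_keys are hypothetical: expire_keys stands in for a coroutine that executes the UPDATE above and returns the number of affected rows, and max_runs exists only so the loop can be exercised in tests.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional

logger = logging.getLogger("scp.expiry")


async def run_expiry_job(
    expire_keys: Callable[[], Awaitable[int]],
    interval_seconds: float = 3600,
    max_runs: Optional[int] = None,
) -> None:
    """Call expire_keys() repeatedly, sleeping interval_seconds between runs."""
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            expired = await expire_keys()
            logger.info("expiry job: %d key(s) expired", expired)
        except Exception:
            # A failed run is logged and retried on the next interval;
            # task cancellation (CancelledError) is not swallowed here.
            logger.exception("expiry job run failed")
        runs += 1
        if max_runs is None or runs < max_runs:
            await asyncio.sleep(interval_seconds)
```

At server startup this would typically be scheduled with asyncio.create_task(run_expiry_job(expire_keys)) and cancelled on shutdown.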

Files

| File | Purpose |
|---|---|
| control-plane/database/schemas/agent_registry.sql | status, revoked_by columns; status index |
| control-plane/database/migrations/002_api_key_lifecycle.sql | Migration for existing databases |
| control-plane/shared/models/agents.py | APIKey model updated with status, revoked_by, revoked_at |
| control-plane/api_server/routers/agents.py | Key create/revoke/list; agent revoke cascade |
| control-plane/context_service/orchestrator.py | Auth query updated to status = 'active' |
| control-plane/api_server/main.py | Hourly expiry background job |

3. Secrets Management

Purpose

SCP requires a database password and a JWT signing secret. These must not be stored in source control, config files, or Docker images. The application must fail loudly at startup if secrets are missing or weak — silent degradation to an insecure default is not acceptable.

Threat Model

| Threat | Mitigation |
|---|---|
| Insecure default accepted silently | jwt_secret validator rejects known-weak strings and enforces minimum 32 chars |
| Empty password accepted | db_password validator rejects empty string |
| Weak DB password in production | Validator emits warning if db_password == 'postgres' |
| JWT_SECRET missing from container environment | docker-compose uses :? syntax, so compose startup fails with a clear error if not set |
| Secrets in source control | .env in .gitignore; verified never committed |

Secret Requirements

| Secret | Minimum | How to generate |
|---|---|---|
| JWT_SECRET | 32 chars, not a known-weak value | python -c "import secrets; print(secrets.token_hex(32))" |
| DB_PASSWORD | Non-empty; 24+ chars recommended for production | python -c "import secrets; print(secrets.token_urlsafe(24))" |
| DEPLOYMENT_ID | UUID, stable per deployment | python -c "import uuid; print(uuid.uuid4())" |

Startup Validation

settings.py uses Pydantic field_validator decorators:

  • jwt_secret_strength: rejects known-weak strings (change-me-in-production, etc.) and values under 32 characters. Startup raises ValidationError with a generation hint.
  • db_password_production_check: rejects empty string; emits warnings.warn if value is 'postgres'.

Both fields have no default, so pydantic-settings raises ValidationError at startup if they are absent from the environment.
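The validation rules can be sketched as plain functions, shown here without Pydantic so the logic stands alone; in settings.py they would sit behind field_validator decorators. The function names and the exact contents of the weak-value list are illustrative assumptions.

```python
import warnings

# Illustrative list only; the real validator's list is not reproduced here.
KNOWN_WEAK_JWT_SECRETS = {
    "change-me-in-production",
    "change-me-in-production-please-use-strong-secret",
}
MIN_JWT_SECRET_LENGTH = 32
GENERATION_HINT = 'python -c "import secrets; print(secrets.token_hex(32))"'


def check_jwt_secret(value: str) -> str:
    """Reject known-weak or short JWT secrets, with a generation hint."""
    if value in KNOWN_WEAK_JWT_SECRETS:
        raise ValueError(f"JWT_SECRET is a known-weak value; generate one with: {GENERATION_HINT}")
    if len(value) < MIN_JWT_SECRET_LENGTH:
        raise ValueError(
            f"JWT_SECRET must be at least {MIN_JWT_SECRET_LENGTH} characters; "
            f"generate one with: {GENERATION_HINT}"
        )
    return value


def check_db_password(value: str) -> str:
    """Reject an empty password; warn (but accept) the default 'postgres'."""
    if value == "":
        raise ValueError("DB_PASSWORD must not be empty")
    if value == "postgres":
        warnings.warn("DB_PASSWORD is 'postgres'; use a strong password in production", stacklevel=2)
    return value
```

Raising ValueError inside a Pydantic validator surfaces as a ValidationError at settings load, which is what makes startup fail loudly.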

Docker Compose

All services use the :? expansion syntax for required secrets:

DB_PASSWORD: ${DB_PASSWORD:?DB_PASSWORD must be set in .env}
JWT_SECRET:  ${JWT_SECRET:?JWT_SECRET must be set in .env}

docker compose up fails immediately with a clear message if these are not set in the shell environment or .env file.

Production Secret Injection

| Environment | Mechanism |
|---|---|
| Azure | Azure Key Vault via managed identity |
| AWS (Marketplace) | SSM Parameter Store (documented in docs/deployment/aws-setup.md) |
| Local / dev | .env file, explicitly excluded from git |

Local .env Update Required

After this change, the existing local .env value JWT_SECRET=change-me-in-production-please-use-strong-secret will be rejected at startup. Generate a new value:

python -c "import secrets; print(secrets.token_hex(32))"

Update control-plane/.env with the output before starting the server.

Files

| File | Purpose |
|---|---|
| control-plane/shared/config/settings.py | Validators, no-default fields |
| control-plane/.env.example | Updated with generation hints, removed stale vars |
| control-plane/docker-compose.yaml | :? required-env syntax on all services |

4. Transport Security

Purpose

All traffic between agents and SCP must be encrypted in transit. Plaintext connections to any external-facing endpoint are not acceptable in a regulated deployment.

Threat Model

| Threat | Mitigation |
|---|---|
| API traffic intercepted in transit | Caddy terminates TLS on port 443; all agent traffic goes HTTPS |
| Context webhook traffic intercepted | Caddy proxies context:8003 on port 8443 with TLS |
| Direct plaintext access to app ports | Firewall/security group blocks 8000–8004 from external networks in production |
| Postgres connection unencrypted | DB_SSL_MODE=require enforced via asyncpg ssl=True |

Architecture

Agent → HTTPS :443 → Caddy → HTTP api:8000 (internal Docker network)
Agent → HTTPS :8443 → Caddy → HTTP context:8003 (internal Docker network)

TLS termination is at Caddy. Internal service-to-service traffic stays on the scp-network Docker bridge and does not leave the host.

gRPC (port 8002) is proxied at the load balancer layer in production (AWS ALB with TLS termination, as configured in the CloudFormation stack). It is not routed through Caddy in the current setup.

Caddy Configuration

Dev (CADDY_DOMAIN=localhost):
- tls internal: Caddy generates a self-signed CA and cert on first run
- Run caddy trust once to install the CA in the system trust store
- Cert is stored in the caddy_data Docker volume (persists across restarts)

Production (CADDY_DOMAIN=scp.your-domain.com):
- Remove tls internal from Caddyfile; Caddy then provisions a Let's Encrypt cert via ACME automatically
- Requires port 80 accessible for HTTP-01 challenge (or configure DNS-01 for internal deployments)
- Cert renewal is automatic

Database SSL

DB_SSL_MODE setting controls asyncpg's ssl parameter:

| DB_SSL_MODE | asyncpg ssl | Use case |
|---|---|---|
| disable | False | Local dev / docker-compose without server certs |
| require | True | Any network-exposed deployment |

For production, the PostgreSQL server must have SSL enabled (configured at the infrastructure level — RDS, or manually on EC2 with ssl = on in postgresql.conf).
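The mapping in the table above amounts to a small lookup, sketched below. The function name is hypothetical, and rejecting any value other than the two documented modes is an assumption about how connection.py behaves.

```python
def asyncpg_ssl_argument(db_ssl_mode: str) -> bool:
    """Translate DB_SSL_MODE into the value passed as asyncpg's ssl parameter."""
    mapping = {"disable": False, "require": True}
    try:
        return mapping[db_ssl_mode]
    except KeyError:
        # Assumption: unknown modes fail fast instead of falling back to plaintext.
        raise ValueError(
            f"Unsupported DB_SSL_MODE {db_ssl_mode!r}; expected one of {sorted(mapping)}"
        ) from None
```

Failing on an unrecognized value keeps a typo such as "required" from silently downgrading the connection.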

Redis

Redis TLS is not enabled in the docker-compose setup. For production:
- Use a Redis instance with TLS enabled (ElastiCache with TLS, or Redis with tls-cert-file configured)
- Update REDIS_HOST and REDIS_PORT to point to the TLS endpoint
- asyncio-redis connections will need the ssl=True parameter (context service change required)

This is documented as a production-deployment step rather than a docker-compose change.

Environment Variables

| Variable | Default | Production |
|---|---|---|
| DB_SSL_MODE | disable | require |
| CADDY_DOMAIN | localhost | your real hostname |
| CADDY_CONTEXT_PORT | 8443 | 8443 (or as appropriate) |

Files

| File | Purpose |
|---|---|
| control-plane/Caddyfile | Caddy config: TLS termination and reverse proxy |
| control-plane/docker-compose.yaml | Caddy service, caddy_data/caddy_config volumes |
| control-plane/shared/config/settings.py | db_ssl_mode setting |
| control-plane/database/connection.py | Passes ssl=True/False to asyncpg based on db_ssl_mode |
| control-plane/.env.example | DB_SSL_MODE, CADDY_DOMAIN, CADDY_CONTEXT_PORT documented |

5. Telemetry Beacon Security

Purpose

SCP phones home daily to report agent counts for usage-based billing. The beacon payload must be:
- Minimal: no PII, no agent names, no SCD content
- Fixed: the exact set of transmitted fields is defined in code and inspectable by customers
- Authenticated: the vendor can verify the payload came from a legitimate deployment

Threat Model

| Threat | Mitigation |
|---|---|
| Telemetry transmitting PII or sensitive data | BeaconPayload is a fixed Pydantic model with only 6 fields, all non-PII |
| Customer unable to audit what is sent | GET /telemetry/log exposes the exact payload for every report |
| Vendor receives spoofed or tampered beacons | HMAC-SHA256 signature on canonical payload JSON (X-SCP-Signature header) |
| License key transmitted in plaintext | Only SHA256(license_key) is sent, not the key itself |

Beacon Payload

The BeaconPayload model defines the complete set of transmitted fields. Nothing else is sent.

| Field | Type | Notes |
|---|---|---|
| deployment_id | string | UUID; no PII |
| license_key_hash | string | SHA256(license_key), not the key |
| agent_count | int | Total registered agents |
| active_agent_count | int | Active agents |
| reported_at | string | ISO-8601 UTC timestamp |
| scp_version | string | SCP version string |

Payload Signing

If TELEMETRY_SIGNING_SECRET is configured:

  1. Serialize BeaconPayload to canonical JSON (sort_keys=True, compact separators)
  2. Compute HMAC-SHA256(signing_secret, canonical_json)
  3. Attach as X-SCP-Signature: sha256=<hex> request header

The signing secret must match the key registered with Ohana at license activation. Without a configured secret, reports are sent unsigned (beacon is still sent; vendor accepts but cannot verify authenticity).
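The three signing steps can be sketched with the standard library as shown below. The function names are hypothetical (they are not the reporter's _sign_payload), the payload is treated as a plain dict, and UTF-8 encoding of both the secret and the JSON is an assumption.

```python
import hashlib
import hmac
import json


def canonical_json(payload: dict) -> bytes:
    # Step 1: sorted keys and compact separators give a stable byte string.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_payload(payload: dict, signing_secret: str) -> str:
    # Step 2: HMAC-SHA256 over the canonical JSON, keyed by the signing secret.
    digest = hmac.new(
        signing_secret.encode("utf-8"), canonical_json(payload), hashlib.sha256
    ).hexdigest()
    # Step 3: the value carried in the X-SCP-Signature header.
    return f"sha256={digest}"


def verify_signature(payload: dict, signing_secret: str, header_value: str) -> bool:
    """Receiver-side check: recompute and compare in constant time."""
    expected = sign_payload(payload, signing_secret).encode("utf-8")
    return hmac.compare_digest(expected, header_value.encode("utf-8"))
```

Canonicalization matters because the receiver must rebuild byte-identical JSON; without sorted keys and fixed separators, two serializations of the same payload would produce different signatures.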

Customer Inspection (GET /telemetry/log)

GET /telemetry/log?limit=30

Returns the last N beacon log entries, one per daily report, showing:
- The exact BeaconPayload that was (or will be) transmitted
- Whether the report was signed
- Transmission status and any error messages

Customers can verify at any time that no unexpected data leaves the deployment.

Files

| File | Purpose |
|---|---|
| control-plane/shared/models/telemetry.py | BeaconPayload, TelemetryLogEntry models |
| control-plane/shared/config/settings.py | telemetry_signing_secret, license_key settings |
| control-plane/telemetry_service/reporter.py | _build_beacon(), _sign_payload(), signed HTTP send |
| control-plane/telemetry_service/main.py | GET /telemetry/log endpoint |
| control-plane/.env.example | TELEMETRY_SIGNING_SECRET, LICENSE_KEY documented |

6. License Key Enforcement

Purpose

Agent count limits are only meaningful if they can't be bypassed. License keys encode the customer's entitlements and SCP enforces them — at startup and on every agent registration attempt.

Threat Model

| Threat | Mitigation |
|---|---|
| Customer exceeds licensed agent count | Cap enforced on POST /api/agents; returns HTTP 402 at cap |
| License key forged or tampered | RS256 signature verified against embedded Ohana public key |
| SCP started with expired license | Startup aborted with clear error message |
| Key rotation without redeployment | LICENSE_PUBLIC_KEY setting overrides embedded key |

License Key Format

License keys are RS256-signed JWTs issued by Ohana at customer activation. SCP holds the Ohana public key (embedded at build time) to verify signatures — the private key never leaves Ohana.

JWT Claims:

| Claim | Type | Description |
|---|---|---|
| sub | string | Customer identifier |
| deployment_id | string | Licensed deployment UUID |
| agent_cap | int | Max non-revoked agents (0 = unlimited) |
| tier | string | "standard" \| "regulated" |
| iss | string | Must be "ohana-scp" |
| iat | int | Issued-at (Unix timestamp) |
| exp | int | Expiry (Unix timestamp) |

Startup Validation

On every API server startup:

  1. If LICENSE_KEY is not set: log warning, continue in uncapped mode (dev only)
  2. If set: validate RS256 signature against LICENSE_PUBLIC_KEY or embedded key
  3. Verify iss == "ohana-scp" and required claims are present
  4. If expired: abort startup with a clear error message
  5. If expiring in < 30 days: log warning each startup, continue

Validated claims are stored in a process-level singleton (get_license_claims()).
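Steps 3 to 5 of the startup sequence can be sketched as a check over claims whose RS256 signature has already been verified; signature verification itself needs a JWT library and is left out. The names LicenseError and check_verified_claims are hypothetical, and treating all seven listed claims as required is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: every claim in the table above is required.
REQUIRED_CLAIMS = ("sub", "deployment_id", "agent_cap", "tier", "iss", "iat", "exp")
EXPIRY_WARNING_WINDOW = timedelta(days=30)


class LicenseError(Exception):
    """Raised when startup must abort."""


def check_verified_claims(claims: dict, now: Optional[datetime] = None) -> list:
    """Return warning messages to log; raise LicenseError to abort startup."""
    now = now or datetime.now(timezone.utc)
    missing = [name for name in REQUIRED_CLAIMS if name not in claims]
    if missing:
        raise LicenseError(f"license is missing required claims: {missing}")
    if claims["iss"] != "ohana-scp":
        raise LicenseError(f"unexpected issuer: {claims['iss']!r}")
    expires_at = datetime.fromtimestamp(claims["exp"], tz=timezone.utc)
    if expires_at <= now:
        raise LicenseError(f"license expired at {expires_at.isoformat()}")
    messages = []
    remaining = expires_at - now
    if remaining < EXPIRY_WARNING_WINDOW:
        messages.append(f"license expires in {remaining.days} day(s)")
    return messages
```

Passing now explicitly keeps the check deterministic in tests; at startup it defaults to the current UTC time.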

Agent Cap Enforcement

Every POST /api/agents request:

  1. Reads agent_cap from the cached LicenseClaims (zero overhead — no DB query for the license itself)
  2. If agent_cap > 0: counts current non-revoked agents
  3. If current_count >= agent_cap: returns HTTP 402 with a clear upgrade message

Revoked agents do not count toward the cap — agents that are decommissioned free up capacity.
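The cap decision reduces to the small check below. The function name is hypothetical, and the agent count is passed as a callable (standing in for the DB query) so that an unlimited license never triggers the count.

```python
from http import HTTPStatus
from typing import Callable, Optional


def check_agent_cap(
    agent_cap: int,
    count_non_revoked_agents: Callable[[], int],
) -> Optional[HTTPStatus]:
    """Return 402 Payment Required when at or over the cap, otherwise None."""
    # agent_cap == 0 means unlimited, so the count is never queried.
    if agent_cap <= 0:
        return None
    if count_non_revoked_agents() >= agent_cap:
        return HTTPStatus.PAYMENT_REQUIRED
    return None
```

A caller in create_agent would translate a non-None result into an HTTP 402 response carrying the upgrade message.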

Files

| File | Purpose |
|---|---|
| control-plane/shared/license.py | validate_license(), check_license_at_startup(), LicenseClaims, cap singleton |
| control-plane/shared/config/settings.py | license_key, license_public_key settings |
| control-plane/api_server/main.py | License validation at startup, set_license_claims() |
| control-plane/api_server/routers/agents.py | Agent cap enforcement in create_agent |
| control-plane/.env.example | LICENSE_KEY, LICENSE_PUBLIC_KEY documented |

7. Admin API Authentication

Purpose

Bundle management, agent registration, and audit admin reads must be authenticated. An unauthenticated admin API allows any network-adjacent actor to register rogue agents, publish arbitrary bundles, revoke legitimate credentials, or read the full audit log.

Threat Model

| Threat | Mitigation |
|---|---|
| Unauthenticated agent registration (rogue agent injection) | All POST /api/agents requests require valid X-Admin-Key |
| Unauthenticated bundle publish (malicious governance context) | All bundle registry writes require valid X-Admin-Key |
| Audit log exfiltration via stream or verify endpoints | /audit/requests, /audit/stream, /audit/verify, /audit/chain/{id} require X-Admin-Key |
| Missing or unconfigured admin key silently open | Server returns HTTP 503 if ADMIN_API_KEY is not set; there is no silent open mode |
| Timing attack on key comparison | secrets.compare_digest used for all comparisons |

Scope

Admin-gated endpoints (require X-Admin-Key):
- All bundle registry operations (/api/bundles/*)
- All agent registry operations (/api/agents/*, including API key management)
- Audit admin reads: GET /api/audit/verify, /api/audit/requests, /api/audit/stream, /api/audit/chain/{id}

Agent-gated endpoints (require X-API-Key, unchanged):
- Context requests: POST /api/context
- Output logging: POST /api/audit/output, GET /api/audit/output/{id}

Implementation

The require_admin_key FastAPI dependency (api_server/dependencies.py) is applied at the router level for agents.py and bundles.py, and as an explicit dependency on the four audit admin endpoints. Every request to a gated endpoint triggers the dependency before any handler logic runs.

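import secrets
from typing import Annotated

from fastapi import Header, HTTPException

# get_settings is the project's settings accessor (import omitted here).
async def require_admin_key(
    x_admin_key: Annotated[str, Header()] = "",
) -> None:
    settings = get_settings()
    # Fail closed: an unconfigured key disables the admin API instead of opening it.
    if not settings.admin_api_key:
        raise HTTPException(503, "Admin API not configured")
    # Constant-time comparison on bytes; compare_digest raises TypeError on non-ASCII str.
    if not secrets.compare_digest(x_admin_key.encode(), settings.admin_api_key.encode()):
        raise HTTPException(401, "Invalid admin API key")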

Key Requirements

| Setting | Requirement | How to generate |
|---|---|---|
| ADMIN_API_KEY | Non-empty; 24+ random bytes recommended | python -c "import secrets; print('admin_' + secrets.token_hex(24))" |

Production Secret Injection

| Environment | Mechanism |
|---|---|
| Azure | Azure Key Vault via managed identity |
| AWS (Marketplace) | SSM Parameter Store |
| Local / dev | .env file, excluded from git |

Files

| File | Purpose |
|---|---|
| control-plane/api_server/dependencies.py | require_admin_key dependency |
| control-plane/api_server/routers/agents.py | router = APIRouter(dependencies=[Depends(require_admin_key)]) |
| control-plane/api_server/routers/bundles.py | Same |
| control-plane/api_server/routers/audit.py | Explicit dependency on four admin endpoints |
| control-plane/shared/config/settings.py | admin_api_key setting |
| control-plane/.env.pilot.template | ADMIN_API_KEY documented as [MUST CHANGE] |

Appendix: Migration History

| Migration | Description | Date |
|---|---|---|
| 001_audit_integrity.sql | Hash chain, immutability trigger, RLS, roles | 2026-04-18 |
| 002_api_key_lifecycle.sql | API key status column (active/revoked/expired), revoked_by | 2026-04-18 |
| 003_output_logging.sql | agent_output_logs table: output-side audit trail linked to context requests | 2026-04-18 |
| 004_chain_correlation.sql | chain_id and parent_request_id on agent_context_requests: multi-agent chain tracing | 2026-04-18 |