SCP Security Architecture¶

Version: 0.1 (in progress; updated as each domain is implemented)
Owner: Ohana Consulting LLC
Audience: Internal engineering reference; basis for customer security review

Overview¶

SCP governs AI agent behavior in regulated environments. Its security model has seven domains, implemented in order of deployment criticality:

#	Domain	Status	Section
1	Audit Log Integrity	✅ Complete	§1
2	Agent API Key Lifecycle	✅ Complete	§2
3	Secrets Management	✅ Complete	§3
4	Transport Security (TLS)	✅ Complete	§4
5	Telemetry Beacon Security	✅ Complete	§5
6	License Key Enforcement	✅ Complete	§6
7	Admin API Authentication	✅ Complete	§7

1. Audit Log Integrity¶

Purpose¶

The audit log is the foundation of the SCP governance claim. Every agent context request is recorded. If the log can be altered, the compliance story collapses. The log must be tamper-evident and independently verifiable.

Threat Model¶

Threat	Mitigation
Direct SQL UPDATE/DELETE against audit table	Immutability trigger raises exception on any DML modification
Table owner disabling the trigger via DDL	RLS with FORCE applies to the table owner for DML; superusers bypass RLS, so the application connects as `scp_app`
Silent record insertion/deletion between records	Hash chain — any gap or reorder breaks the chain
Application bug writing empty/corrupt hashes	CHECK constraints reject empty `record_hash` / `previous_hash`

Database Design¶

Table: agent_context_requests

Primary audit table. Records every context request — successful or rejected.

Column	Type	Notes
`request_id`	TEXT PK	UUID per request
`agent_id`	TEXT FK	Requesting agent
`task_type`	TEXT	Intent requested
`request_params`	JSONB	Scoping parameters
`intent_valid`	BOOLEAN	Whether intent was allowed
`validation_errors`	JSONB	Populated if intent rejected
`context_bundle_ids`	JSONB	Bundles included in response
`context_scd_ids`	JSONB	SCDs included in response
`processing_time_ms`	INTEGER	Latency
`requested_at`	TIMESTAMPTZ	Wall clock at insert
`chain_sequence`	BIGSERIAL	Strict insertion order for chain traversal
`previous_hash`	TEXT	Hash of prior record (or genesis hash)
`record_hash`	TEXT	SHA256 of this record's content + previous_hash

Constraints:
- CHECK (previous_hash != ''): empty string not permitted
- CHECK (record_hash != ''): empty string not permitted
- Unique index on chain_sequence
- Records that predate hash chain implementation carry sentinel value LEGACY_PRE_AUDIT

Table: audit_chain_genesis

Single row per deployment. Stores the genesis hash — the previous_hash value for the first audit record inserted after deployment initialization.

Column	Type	Notes
`id`	SERIAL PK
`genesis_hash`	TEXT	`SHA256("scp-genesis:" + deployment_id + initialized_at)`
`initialized_at`	TIMESTAMPTZ	Set once at deployment init
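A minimal sketch of the genesis hash formula from the table above. The function name `compute_genesis_hash` is hypothetical, and serializing `initialized_at` with `isoformat()` is an assumption; the actual serialization in `control-plane/shared/audit.py` may differ.

```python
import hashlib
from datetime import datetime, timezone


def compute_genesis_hash(deployment_id: str, initialized_at: datetime) -> str:
    """Return SHA256("scp-genesis:" + deployment_id + initialized_at) as hex."""
    # Assumption: the timestamp is rendered with isoformat() before hashing.
    material = "scp-genesis:" + deployment_id + initialized_at.isoformat()
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# Example values only; a real deployment uses its own DEPLOYMENT_ID.
example_ts = datetime(2026, 4, 18, 14, 0, tzinfo=timezone.utc)
example_hash = compute_genesis_hash("00000000-0000-0000-0000-000000000000", example_ts)
```

Because the hash depends on both the deployment ID and the initialization timestamp, two deployments do not share a genesis value unless both inputs match.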

Database Roles¶

Role	Permissions	Used by
`scp_app`	INSERT on `agent_context_requests`; full access to `agents`, `agent_api_keys`	Application service
`scp_audit`	SELECT on `agent_context_requests`, `audit_chain_genesis`	Audit/verify endpoint, read-only queries

Production note: The application must connect to PostgreSQL as scp_app, not postgres. Superusers bypass row level security even when FORCE ROW LEVEL SECURITY is set, so for a superuser session the trigger is the only remaining layer.

Immutability Enforcement¶

Two independent layers:

Layer 1 — Trigger (audit_immutable)

BEFORE UPDATE OR DELETE ON agent_context_requests
→ RAISE EXCEPTION 'Audit records are immutable'

Fires for all DML including from superuser sessions.

Layer 2 — Row Level Security

ALTER TABLE agent_context_requests FORCE ROW LEVEL SECURITY;
-- Policies: SELECT (all), INSERT (all)
-- No UPDATE or DELETE policy → denied

FORCE causes RLS to apply even to the table owner role. RLS is bypassed only by superusers and by roles with the BYPASSRLS attribute (not granted to scp_app).

Both layers must be independently defeated for a modification to succeed.

Hash Chain Algorithm¶

Every new audit record is inserted with:

record_content = chain_sequence
              || request_id
              || agent_id
              || task_type
              || str(intent_valid)
              || requested_at.isoformat()

record_hash = SHA256(record_content + previous_hash)

Where previous_hash is:
- The record_hash of the record with chain_sequence = N-1 for all N > 1
- The genesis hash (from audit_chain_genesis) for the first record
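The pseudocode above could be written in Python roughly as follows. This is a sketch, not the code in `shared/audit.py`: the function name `compute_record_hash` is hypothetical, and joining the fields with no separator and encoding as UTF-8 are assumptions taken from a literal reading of the pseudocode.

```python
import hashlib
from datetime import datetime, timezone


def compute_record_hash(
    chain_sequence: int,
    request_id: str,
    agent_id: str,
    task_type: str,
    intent_valid: bool,
    requested_at: datetime,
    previous_hash: str,
) -> str:
    """Hash the record fields in the documented order, chained to previous_hash."""
    # Assumption: fields are concatenated without a delimiter, as in the pseudocode.
    record_content = (
        str(chain_sequence)
        + request_id
        + agent_id
        + task_type
        + str(intent_valid)
        + requested_at.isoformat()
    )
    return hashlib.sha256((record_content + previous_hash).encode("utf-8")).hexdigest()


# Chaining two records: the second uses the first record's hash as previous_hash.
ts = datetime(2026, 4, 18, 14, 0, tzinfo=timezone.utc)
genesis = "0" * 64  # placeholder; the real value comes from audit_chain_genesis
first = compute_record_hash(1, "req-1", "agent-a", "summarize", True, ts, genesis)
second = compute_record_hash(2, "req-2", "agent-a", "summarize", False, ts, first)
```

Plain concatenation leaves field boundaries ambiguous (for example, where `request_id` ends and `agent_id` begins). A delimiter or length-prefixed encoding would remove that ambiguity if the format is ever revised.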

Race condition prevention: Inserts acquire a PostgreSQL transaction-level advisory lock (pg_advisory_xact_lock) keyed to the audit table before reading the previous hash and inserting. This serializes concurrent writes without requiring SERIALIZABLE isolation on the entire connection.

Chain Verification (`GET /audit/verify`)¶

The verify endpoint walks the entire chain and recomputes every hash:

GET /audit/verify
Authorization: requires scp_audit role or admin API key

Response:
{
  "status": "ok" | "tampered",
  "record_count": 1042,
  "legacy_record_count": 15,   // records with LEGACY_PRE_AUDIT sentinel
  "first_chained_sequence": 16,
  "last_verified_at": "2026-04-18T14:00:00Z",
  "tampered_at_sequence": null  // populated if status == "tampered"
}

Algorithm:
1. Load genesis hash from audit_chain_genesis
2. Select all records with record_hash != 'LEGACY_PRE_AUDIT' ordered by chain_sequence ASC
3. For each record: recompute record_hash, compare with stored value
4. Return first sequence number where mismatch is detected, or "ok" if chain is intact
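The verification walk can be sketched over in-memory records as below. `AuditRecord`, its `content` field (the already-concatenated hashed fields), and `verify_chain` are hypothetical names for illustration. The sketch also checks that each `previous_hash` links to the prior record, which the hash chain design implies but the step list does not spell out.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, Optional

LEGACY_SENTINEL = "LEGACY_PRE_AUDIT"


@dataclass
class AuditRecord:
    chain_sequence: int
    content: str  # assumption: the hashed fields, already concatenated
    previous_hash: str
    record_hash: str


def hash_record(content: str, previous_hash: str) -> str:
    return hashlib.sha256((content + previous_hash).encode("utf-8")).hexdigest()


def verify_chain(genesis_hash: str, records: Iterable[AuditRecord]) -> Optional[int]:
    """Return the first tampered chain_sequence, or None if the chain is intact."""
    expected_previous = genesis_hash
    # Legacy records carry the sentinel and are excluded from the walk.
    chained = [r for r in records if r.record_hash != LEGACY_SENTINEL]
    for record in sorted(chained, key=lambda r: r.chain_sequence):
        # The link must point at the prior record (or the genesis hash).
        if record.previous_hash != expected_previous:
            return record.chain_sequence
        # The stored hash must match a fresh recomputation.
        if hash_record(record.content, record.previous_hash) != record.record_hash:
            return record.chain_sequence
        expected_previous = record.record_hash
    return None
```

A `None` result would map to `"status": "ok"` in the response; an integer would populate `tampered_at_sequence`.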

Files¶

File	Purpose
`control-plane/database/schemas/agent_registry.sql`	Table definition, trigger, RLS, roles
`control-plane/database/migrations/001_audit_integrity.sql`	Migration for existing databases
`control-plane/api_server/routers/audit.py`	`GET /audit/verify` endpoint
`control-plane/shared/audit.py`	Hash computation, chain write, genesis init, chain verify
`control-plane/api_server/main.py`	Calls `initialize_genesis()` at startup
`control-plane/context_service/orchestrator.py`	Calls `write_audit_record()` on every context request

2. Agent API Key Lifecycle¶

Purpose¶

Every agent authenticates with an API key. A compromised key with no revocation path is an unacceptable risk in a regulated deployment. Keys must expire, must be revocable immediately, and the authoritative status must be enforced on every request.

Threat Model¶

Threat	Mitigation
Compromised key used indefinitely	Configurable expiry (default: 90 days); enforced on every auth check
No path to invalidate a leaked key	Explicit revocation: `DELETE /api/agents/{id}/api-keys/{key_id}`
Expired key still accepted if expiry not re-checked	Auth query filters on `status = 'active'`; expiry also checked in-process
Keys orphaned when agent is revoked	Agent revocation cascades to all active keys (`status = 'revoked'`)

Schema Changes¶

agent_api_keys additions (Migration 002):

Column	Type	Notes
`status`	TEXT	`active | revoked | expired`; authoritative, enforced by a CHECK constraint
`revoked_by`	TEXT	Actor who performed revocation, or `system:expiry-job` / `system:agent-revoked`

is_active is retained for backward compatibility. status is authoritative — auth queries filter on status = 'active'.

Authentication Query¶

Every context request passes through the orchestrator's _authenticate method, which queries:

SELECT k.key_hash, k.expires_at, k.agent_id, a.status as agent_status
FROM agent_api_keys k
JOIN agents a ON k.agent_id = a.agent_id
WHERE k.key_prefix = $1 AND k.status = 'active'

After the DB check, expires_at is also validated in-process as a defense-in-depth layer against any DB-level inconsistency.
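The in-process check might look like the sketch below. The function name `key_is_usable` is hypothetical, and treating `'active'` as the only acceptable agent status is an assumption. Comparing the presented key against `key_hash` is a separate step and is not shown.

```python
from datetime import datetime, timezone
from typing import Optional


def key_is_usable(
    key_status: str,
    agent_status: str,
    expires_at: Optional[datetime],
    now: Optional[datetime] = None,
) -> bool:
    """Re-check the row returned by the auth query before trusting it."""
    now = now or datetime.now(timezone.utc)
    # The query already filters on status = 'active'; repeating it is cheap.
    if key_status != "active":
        return False
    # Assumption: a usable agent has status 'active'.
    if agent_status != "active":
        return False
    # Mirrors the expiry job's condition: expires_at < NOW() means expired.
    if expires_at is not None and expires_at < now:
        return False
    return True
```

This covers the window between a key passing its `expires_at` and the hourly job flipping its `status` to `expired`.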

Key Lifecycle Events¶

Event	Trigger	`status` after	`revoked_by`
Issue	`POST /api/agents/{id}/api-keys`	`active`	—
Revoke	`DELETE /api/agents/{id}/api-keys/{key_id}`	`revoked`	caller-supplied `revoked_by` param
Agent revoked	`DELETE /api/agents/{id}`	`revoked`	`system:agent-revoked`
Expiry	Hourly background job	`expired`	`system:expiry-job`

The revoked_at timestamp and revoked_by columns on agent_api_keys are the audit record for key lifecycle events. Extension of the hash chain to cover key events is a v0.4 item.

Background Expiry Job¶

Started as an asyncio task at API server startup. Runs every 3600 seconds:

UPDATE agent_api_keys
SET is_active = FALSE, status = 'expired', revoked_by = 'system:expiry-job'
WHERE status = 'active'
  AND expires_at IS NOT NULL
  AND expires_at < NOW()

Logs count of keys expired per run.

Files¶

File	Purpose
`control-plane/database/schemas/agent_registry.sql`	`status`, `revoked_by` columns; status index
`control-plane/database/migrations/002_api_key_lifecycle.sql`	Migration for existing databases
`control-plane/shared/models/agents.py`	`APIKey` model updated with `status`, `revoked_by`, `revoked_at`
`control-plane/api_server/routers/agents.py`	Key create/revoke/list; agent revoke cascade
`control-plane/context_service/orchestrator.py`	Auth query updated to `status = 'active'`
`control-plane/api_server/main.py`	Hourly expiry background job

3. Secrets Management¶

Purpose¶

SCP requires a database password and a JWT signing secret. These must not be stored in source control, config files, or Docker images. The application must fail loudly at startup if secrets are missing or weak — silent degradation to an insecure default is not acceptable.

Threat Model¶

Threat	Mitigation
Insecure default accepted silently	`jwt_secret` validator rejects known-weak strings and enforces minimum 32 chars
Empty password accepted	`db_password` validator rejects empty string
Weak DB password in production	Validator emits warning if `db_password == 'postgres'`
JWT_SECRET missing from container environment	docker-compose uses `:?` syntax — compose startup fails with a clear error if not set
Secrets in source control	`.env` in `.gitignore`; verified never committed

Secret Requirements¶

Secret	Minimum	How to generate
`JWT_SECRET`	32 chars, not a known-weak value	`python -c "import secrets; print(secrets.token_hex(32))"`
`DB_PASSWORD`	Non-empty; 24+ chars recommended for production	`python -c "import secrets; print(secrets.token_urlsafe(24))"`
`DEPLOYMENT_ID`	UUID, stable per deployment	`python -c "import uuid; print(uuid.uuid4())"`

Startup Validation¶

settings.py uses Pydantic field_validator decorators:

jwt_secret_strength: rejects known-weak strings (change-me-in-production, etc.) and values under 32 characters. Startup raises ValidationError with a generation hint.
db_password_production_check: rejects empty string; emits warnings.warn if value is 'postgres'.

Both fields have no default — pydantic-settings raises ValidationError at startup if they are absent from the environment.
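The validation rules can be sketched as plain functions, shown here without Pydantic so the logic stands alone; in `settings.py` the same checks would sit inside `field_validator` methods. The names `check_jwt_secret` and `check_db_password` are hypothetical, and substring matching against the weak list is an assumption (the document names only `change-me-in-production`).

```python
import warnings

# Assumption: only this entry is named in the document; the real list is longer.
KNOWN_WEAK_SECRETS = ("change-me-in-production",)
MIN_JWT_SECRET_LENGTH = 32
GENERATION_HINT = 'python -c "import secrets; print(secrets.token_hex(32))"'


def check_jwt_secret(value: str) -> str:
    """Reject short or known-weak JWT secrets; return the value if acceptable."""
    if len(value) < MIN_JWT_SECRET_LENGTH:
        raise ValueError(
            f"JWT_SECRET must be at least {MIN_JWT_SECRET_LENGTH} characters. "
            f"Generate one with: {GENERATION_HINT}"
        )
    lowered = value.lower()
    if any(weak in lowered for weak in KNOWN_WEAK_SECRETS):
        raise ValueError(
            f"JWT_SECRET is a known-weak value. Generate one with: {GENERATION_HINT}"
        )
    return value


def check_db_password(value: str) -> str:
    """Reject an empty password; warn (but allow) the default 'postgres'."""
    if value == "":
        raise ValueError("DB_PASSWORD must not be empty")
    if value == "postgres":
        warnings.warn("DB_PASSWORD is 'postgres'; do not use this in production")
    return value
```

With substring matching, the old placeholder `change-me-in-production-please-use-strong-secret` is rejected even though it is longer than 32 characters.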

Docker Compose¶

All services use the :? expansion syntax for required secrets:

DB_PASSWORD: ${DB_PASSWORD:?DB_PASSWORD must be set in .env}
JWT_SECRET:  ${JWT_SECRET:?JWT_SECRET must be set in .env}

docker compose up fails immediately with a clear message if these are not set in the shell environment or .env file.

Production Secret Injection¶

Environment	Mechanism
Azure	Azure Key Vault via managed identity
AWS (Marketplace)	SSM Parameter Store (documented in `docs/deployment/aws-setup.md`)
Local / dev	`.env` file — explicitly excluded from git

Local `.env` Update Required¶

After this change, the existing local .env value JWT_SECRET=change-me-in-production-please-use-strong-secret will be rejected at startup. Generate a new value:

python -c "import secrets; print(secrets.token_hex(32))"

Update control-plane/.env with the output before starting the server.

Files¶

File	Purpose
`control-plane/shared/config/settings.py`	Validators, no-default fields
`control-plane/.env.example`	Updated with generation hints, removed stale vars
`control-plane/docker-compose.yaml`	`:?` required-env syntax on all services

4. Transport Security¶

Purpose¶

All traffic between agents and SCP must be encrypted in transit. Plaintext connections to any external-facing endpoint are not acceptable in a regulated deployment.

Threat Model¶

Threat	Mitigation
API traffic intercepted in transit	Caddy terminates TLS on port 443; all agent traffic goes HTTPS
Context webhook traffic intercepted	Caddy proxies context:8003 on port 8443 with TLS
Direct plaintext access to app ports	Firewall/security group blocks 8000–8004 from external networks in production
Postgres connection unencrypted	`DB_SSL_MODE=require` enforced via asyncpg `ssl=True`

Architecture¶

Agent → HTTPS :443 → Caddy → HTTP api:8000 (internal Docker network)
Agent → HTTPS :8443 → Caddy → HTTP context:8003 (internal Docker network)

TLS termination is at Caddy. Internal service-to-service traffic stays on the scp-network Docker bridge and does not leave the host.

gRPC (port 8002) is proxied at the load balancer layer in production (AWS ALB with TLS termination, as configured in the CloudFormation stack). It is not routed through Caddy in the current setup.

Caddy Configuration¶

Dev (CADDY_DOMAIN=localhost):
- tls internal: Caddy generates a self-signed CA and cert on first run
- Run caddy trust once to install the CA in the system trust store
- Cert is stored in the caddy_data Docker volume (persists across restarts)

Production (CADDY_DOMAIN=scp.your-domain.com):
- Remove tls internal from the Caddyfile; Caddy then provisions a Let's Encrypt cert via ACME automatically
- Requires port 80 accessible for the HTTP-01 challenge (or configure DNS-01 for internal deployments)
- Cert renewal is automatic

Database SSL¶

DB_SSL_MODE setting controls asyncpg's ssl parameter:

`DB_SSL_MODE`	asyncpg `ssl`	Use case
`disable`	`False`	Local dev / docker-compose without server certs
`require`	`True`	Any network-exposed deployment

For production, the PostgreSQL server must have SSL enabled (configured at the infrastructure level — RDS, or manually on EC2 with ssl = on in postgresql.conf).
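The mapping in the table could be implemented as a small helper like the one below. The function name `asyncpg_ssl_param` is hypothetical, and rejecting unknown modes with an error (rather than falling back to plaintext) is an assumption about the desired behavior.

```python
def asyncpg_ssl_param(db_ssl_mode: str) -> bool:
    """Translate DB_SSL_MODE into the value passed as asyncpg's ssl argument."""
    mapping = {"disable": False, "require": True}
    mode = db_ssl_mode.strip().lower()
    if mode not in mapping:
        # Fail loudly: an unrecognized mode must not silently mean "no TLS".
        raise ValueError(
            f"Unsupported DB_SSL_MODE {db_ssl_mode!r}; expected one of {sorted(mapping)}"
        )
    return mapping[mode]


# Intended use in connection.py (not executed here; needs a running database):
#   pool = await asyncpg.create_pool(dsn, ssl=asyncpg_ssl_param(settings.db_ssl_mode))
```

As far as the asyncpg documentation describes it, `ssl=True` uses a default SSL context that also verifies the server certificate, so the server certificate must chain to a CA the client trusts.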

Redis¶

Redis TLS is not enabled in the docker-compose setup. For production:
- Use a Redis instance with TLS enabled (ElastiCache with TLS, or Redis with tls-cert-file configured)
- Update REDIS_HOST and REDIS_PORT to point to the TLS endpoint
- asyncio-redis connections will need the ssl=True parameter (context service change required)

This is documented as a production-deployment step rather than a docker-compose change.

Environment Variables¶

Variable	Default	Production
`DB_SSL_MODE`	`disable`	`require`
`CADDY_DOMAIN`	`localhost`	your real hostname
`CADDY_CONTEXT_PORT`	`8443`	`8443` (or as appropriate)

Files¶

File	Purpose
`control-plane/Caddyfile`	Caddy config — TLS termination and reverse proxy
`control-plane/docker-compose.yaml`	Caddy service, caddy_data/caddy_config volumes
`control-plane/shared/config/settings.py`	`db_ssl_mode` setting
`control-plane/database/connection.py`	Passes `ssl=True/False` to asyncpg based on `db_ssl_mode`
`control-plane/.env.example`	`DB_SSL_MODE`, `CADDY_DOMAIN`, `CADDY_CONTEXT_PORT` documented

5. Telemetry Beacon Security¶

Purpose¶

SCP phones home daily to report agent counts for usage-based billing. The beacon payload must be:
- Minimal: no PII, no agent names, no SCD content
- Fixed: the exact set of transmitted fields is defined in code and inspectable by customers
- Authenticated: the vendor can verify the payload came from a legitimate deployment

Threat Model¶

Threat	Mitigation
Telemetry transmitting PII or sensitive data	`BeaconPayload` is a fixed Pydantic model — only 6 fields, all non-PII
Customer unable to audit what is sent	`GET /telemetry/log` exposes the exact payload for every report
Vendor receives spoofed or tampered beacons	HMAC-SHA256 signature on canonical payload JSON (`X-SCP-Signature` header)
License key transmitted in plaintext	Only `SHA256(license_key)` is sent — not the key itself

Beacon Payload¶

The BeaconPayload model defines the complete set of transmitted fields. Nothing else is sent.

Field	Type	Notes
`deployment_id`	string	UUID — no PII
`license_key_hash`	string	`SHA256(license_key)` — not the key
`agent_count`	int	Total registered agents
`active_agent_count`	int	Active agents
`reported_at`	string	ISO-8601 UTC timestamp
`scp_version`	string	SCP version string

Payload Signing¶

If TELEMETRY_SIGNING_SECRET is configured:

Serialize BeaconPayload to canonical JSON (sort_keys=True, compact separators)
Compute HMAC-SHA256(signing_secret, canonical_json)
Attach as X-SCP-Signature: sha256=<hex> request header

The signing secret must match the key registered with Ohana at license activation. Without a configured secret, reports are sent unsigned (beacon is still sent; vendor accepts but cannot verify authenticity).
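The signing steps above can be sketched with the standard library as follows. The function names are hypothetical (the document's own helper is `_sign_payload()` in `reporter.py`, whose exact signature is not shown), and UTF-8 encoding of the secret and payload is an assumption. `verify_signature` illustrates the check the vendor side would perform.

```python
import hashlib
import hmac
import json


def canonical_json(payload: dict) -> bytes:
    """Serialize with sorted keys and compact separators so bytes are stable."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_payload(payload: dict, signing_secret: str) -> str:
    """Return the X-SCP-Signature header value: sha256=<hex HMAC>."""
    digest = hmac.new(
        signing_secret.encode("utf-8"), canonical_json(payload), hashlib.sha256
    ).hexdigest()
    return "sha256=" + digest


def verify_signature(payload: dict, signing_secret: str, header_value: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_payload(payload, signing_secret).encode("utf-8")
    return hmac.compare_digest(expected, header_value.encode("utf-8"))
```

Canonical serialization matters because the receiver recomputes the HMAC over bytes it builds itself; any difference in key order or whitespace would change the digest.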

Customer Inspection (`GET /telemetry/log`)¶

GET /telemetry/log?limit=30

Returns the last N beacon log entries, one per daily report, showing:
- The exact BeaconPayload that was (or will be) transmitted
- Whether the report was signed
- Transmission status and any error messages

Customers can verify at any time that no unexpected data leaves the deployment.

Files¶

File	Purpose
`control-plane/shared/models/telemetry.py`	`BeaconPayload`, `TelemetryLogEntry` models
`control-plane/shared/config/settings.py`	`telemetry_signing_secret`, `license_key` settings
`control-plane/telemetry_service/reporter.py`	`_build_beacon()`, `_sign_payload()`, signed HTTP send
`control-plane/telemetry_service/main.py`	`GET /telemetry/log` endpoint
`control-plane/.env.example`	`TELEMETRY_SIGNING_SECRET`, `LICENSE_KEY` documented

6. License Key Enforcement¶

Purpose¶

Agent count limits are only meaningful if they can't be bypassed. License keys encode the customer's entitlements and SCP enforces them — at startup and on every agent registration attempt.

Threat Model¶

Threat	Mitigation
Customer exceeds licensed agent count	Cap enforced on `POST /api/agents` — returns HTTP 402 at cap
License key forged or tampered	RS256 signature verified against embedded Ohana public key
SCP started with expired license	Startup aborted with clear error message
Public key rotation requiring a rebuild	`LICENSE_PUBLIC_KEY` setting overrides embedded key

License Key Format¶

License keys are RS256-signed JWTs issued by Ohana at customer activation. SCP holds the Ohana public key (embedded at build time) to verify signatures — the private key never leaves Ohana.

JWT Claims:

Claim	Type	Description
`sub`	string	Customer identifier
`deployment_id`	string	Licensed deployment UUID
`agent_cap`	int	Max non-revoked agents (0 = unlimited)
`tier`	string	`"standard"` | `"regulated"`
`iss`	string	Must be `"ohana-scp"`
`iat`	int	Issued-at (Unix timestamp)
`exp`	int	Expiry (Unix timestamp)

Startup Validation¶

On every API server startup:

If LICENSE_KEY is not set: log warning, continue in uncapped mode (dev only)
If set: validate RS256 signature against LICENSE_PUBLIC_KEY or embedded key
Verify iss == "ohana-scp" and required claims are present
If expired: abort startup with a clear error message
If expiring in < 30 days: log warning each startup, continue

Validated claims are stored in a process-level singleton (get_license_claims()).
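The claim checks that follow signature verification can be sketched as below. RS256 verification itself needs a JWT library and is not shown; the sketch starts from an already-decoded claims dictionary. `check_license_claims` and `LicenseError` are hypothetical names, and the required-claims list is taken from the claims table above.

```python
import time
from typing import Optional

REQUIRED_CLAIMS = ("sub", "deployment_id", "agent_cap", "tier", "iss", "iat", "exp")
EXPECTED_ISSUER = "ohana-scp"
EXPIRY_WARNING_SECONDS = 30 * 24 * 3600  # warn when fewer than 30 days remain


class LicenseError(Exception):
    """Raised when the license must block startup."""


def check_license_claims(claims: dict, now: Optional[float] = None) -> Optional[str]:
    """Return a warning message, or None; raise LicenseError to abort startup."""
    now = time.time() if now is None else now
    missing = [name for name in REQUIRED_CLAIMS if name not in claims]
    if missing:
        raise LicenseError(f"License is missing required claims: {missing}")
    if claims["iss"] != EXPECTED_ISSUER:
        raise LicenseError(f"Unexpected license issuer: {claims['iss']!r}")
    remaining = claims["exp"] - now
    if remaining <= 0:
        raise LicenseError("License has expired")
    if remaining < EXPIRY_WARNING_SECONDS:
        return f"License expires in {int(remaining // 86400)} days"
    return None
```

The caller would log a returned warning and continue, and would treat `LicenseError` as a reason to stop startup.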

Agent Cap Enforcement¶

Every POST /api/agents request:

Reads agent_cap from the cached LicenseClaims (zero overhead — no DB query for the license itself)
If agent_cap > 0: counts current non-revoked agents
If current_count >= agent_cap: returns HTTP 402 with a clear upgrade message

Revoked agents do not count toward the cap — agents that are decommissioned free up capacity.
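The cap decision reduces to a small predicate, sketched here with the hypothetical name `agent_cap_reached`. It shows the `0 = unlimited` convention from the claims table; the count of non-revoked agents is assumed to come from a database query that is not shown.

```python
def agent_cap_reached(agent_cap: int, current_non_revoked: int) -> bool:
    """True when a new registration should be refused with HTTP 402."""
    # agent_cap == 0 means unlimited, so the count is only checked when cap > 0.
    return agent_cap > 0 and current_non_revoked >= agent_cap
```

With a cap of 5, a deployment holding 4 non-revoked agents can register one more; the next attempt after that is refused until an agent is revoked.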

Files¶

File	Purpose
`control-plane/shared/license.py`	`validate_license()`, `check_license_at_startup()`, `LicenseClaims`, cap singleton
`control-plane/shared/config/settings.py`	`license_key`, `license_public_key` settings
`control-plane/api_server/main.py`	License validation at startup, `set_license_claims()`
`control-plane/api_server/routers/agents.py`	Agent cap enforcement in `create_agent`
`control-plane/.env.example`	`LICENSE_KEY`, `LICENSE_PUBLIC_KEY` documented

7. Admin API Authentication¶

Purpose¶

Bundle management, agent registration, and audit admin reads must be authenticated. An unauthenticated admin API allows any network-adjacent actor to register rogue agents, publish arbitrary bundles, revoke legitimate credentials, or read the full audit log.

Threat Model¶

Threat	Mitigation
Unauthenticated agent registration (rogue agent injection)	All `POST /api/agents` requests require valid `X-Admin-Key`
Unauthenticated bundle publish (malicious governance context)	All bundle registry writes require valid `X-Admin-Key`
Audit log exfiltration via stream or verify endpoints	`/audit/requests`, `/audit/stream`, `/audit/verify`, `/audit/chain/{id}` require `X-Admin-Key`
Missing or unconfigured admin key silently open	Server returns HTTP 503 if `ADMIN_API_KEY` is not set — no silent open mode
Timing attack on key comparison	`secrets.compare_digest` used for all comparisons

Scope¶

Admin-gated endpoints (require X-Admin-Key):
- All bundle registry operations (/api/bundles/*)
- All agent registry operations (/api/agents/*, including API key management)
- Audit admin reads: GET /api/audit/verify, /api/audit/requests, /api/audit/stream, /api/audit/chain/{id}

Agent-gated endpoints (require X-API-Key, unchanged):
- Context requests: POST /api/context
- Output logging: POST /api/audit/output, GET /api/audit/output/{id}

Implementation¶

The require_admin_key FastAPI dependency (api_server/dependencies.py) is applied at the router level for agents.py and bundles.py, and as an explicit dependency on the four audit admin endpoints. Every request to a gated endpoint triggers the dependency before any handler logic runs.

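import secrets
from typing import Annotated
from fastapi import Header, HTTPException

async def require_admin_key(
    x_admin_key: Annotated[str, Header()] = "",
) -> None:
    settings = get_settings()  # project settings accessor
    if not settings.admin_api_key:
        raise HTTPException(503, "Admin API not configured")
    # Compare as bytes: compare_digest raises TypeError on non-ASCII str input.
    supplied = x_admin_key.encode("utf-8")
    expected = settings.admin_api_key.encode("utf-8")
    if not secrets.compare_digest(supplied, expected):
        raise HTTPException(401, "Invalid admin API key")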

Key Requirements¶

Setting	Requirement	How to generate
`ADMIN_API_KEY`	Non-empty; 24+ random bytes recommended	`python -c "import secrets; print('admin_' + secrets.token_hex(24))"`

Production Secret Injection¶

Environment	Mechanism
Azure	Azure Key Vault via managed identity
AWS (Marketplace)	SSM Parameter Store
Local / dev	`.env` file — excluded from git

Files¶

File	Purpose
`control-plane/api_server/dependencies.py`	`require_admin_key` dependency
`control-plane/api_server/routers/agents.py`	`router = APIRouter(dependencies=[Depends(require_admin_key)])`
`control-plane/api_server/routers/bundles.py`	Same
`control-plane/api_server/routers/audit.py`	Explicit dependency on four admin endpoints
`control-plane/shared/config/settings.py`	`admin_api_key` setting
`control-plane/.env.pilot.template`	`ADMIN_API_KEY` documented as `[MUST CHANGE]`

Appendix: Migration History¶

Migration	Description	Date
`001_audit_integrity.sql`	Hash chain, immutability trigger, RLS, roles	2026-04-18
`002_api_key_lifecycle.sql`	API key status column (`active`/`revoked`/`expired`), `revoked_by`	2026-04-18
`003_output_logging.sql`	`agent_output_logs` table — output-side audit trail linked to context requests	2026-04-18
`004_chain_correlation.sql`	`chain_id` and `parent_request_id` on `agent_context_requests` — multi-agent chain tracing	2026-04-18
