AWS Deployment Guide

This guide walks through deploying SCP on AWS using CloudFormation.

Prerequisites

Before starting, ensure you have:

  1. AWS Account with permissions to create:
     • VPC, Subnets, Internet Gateway, NAT Gateway
     • EC2 instances, Security Groups
     • Application Load Balancer
     • SSM Parameter Store entries
     • S3 buckets
     • IAM roles and policies
     • CloudWatch resources

  2. Domain and SSL Certificate:
     • Domain name for SCP (e.g., scp.example.com)
     • ACM certificate for the domain (in the deployment region)

  3. EC2 Key Pair:
     • Create a key pair in your target region for SSH access

  4. AWS CLI configured with appropriate credentials

Quick Start

1. Clone the Repository

git clone https://github.com/tim-mccrimmon/supervisory-control-plane.git
cd supervisory-control-plane/deployment/aws

2. Prepare Parameters

Create a parameters file:

cat > params.json <<EOF
[
  {"ParameterKey": "Environment", "ParameterValue": "production"},
  {"ParameterKey": "InstanceType", "ParameterValue": "t3.large"},
  {"ParameterKey": "KeyName", "ParameterValue": "your-key-pair-name"},
  {"ParameterKey": "SCPVersion", "ParameterValue": "0.3.0"},
  {"ParameterKey": "DomainName", "ParameterValue": "scp.example.com"},
  {"ParameterKey": "CertificateArn", "ParameterValue": "arn:aws:acm:us-east-1:123456789012:certificate/xxx"},
  {"ParameterKey": "TelemetryEndpoint", "ParameterValue": "https://telemetry.scp.example.com/v1/report"},
  {"ParameterKey": "DBPassword", "ParameterValue": "your-secure-db-password-here"},
  {"ParameterKey": "JWTSecret", "ParameterValue": "your-secure-jwt-secret-here-min-32-chars"}
]
EOF

Important: Use strong, unique values for DBPassword and JWTSecret. Generate them with:

# DB Password
python3 -c "import secrets; print(secrets.token_urlsafe(24))"

# JWT Secret
python3 -c "import secrets; print(secrets.token_hex(32))"
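
As an alternative to pasting the generated values by hand, the parameters file and the two secrets can be produced in one step. The following is a minimal sketch, not part of the repository: the function name `build_params` is hypothetical, and the parameter keys and example values are copied from the params.json shown above.

```python
import json
import secrets


def build_params(key_name: str, domain: str, certificate_arn: str) -> list[dict]:
    """Return CloudFormation parameters with freshly generated secrets."""
    values = {
        "Environment": "production",
        "InstanceType": "t3.large",
        "KeyName": key_name,
        "SCPVersion": "0.3.0",
        "DomainName": domain,
        "CertificateArn": certificate_arn,
        "TelemetryEndpoint": "https://telemetry.scp.example.com/v1/report",
        "DBPassword": secrets.token_urlsafe(24),  # 24 random bytes, 32 URL-safe characters
        "JWTSecret": secrets.token_hex(32),       # 32 random bytes, 64 hex characters
    }
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]


if __name__ == "__main__":
    params = build_params(
        "your-key-pair-name",
        "scp.example.com",
        "arn:aws:acm:us-east-1:123456789012:certificate/xxx",
    )
    # Print to stdout so the caller decides where the secrets are written.
    print(json.dumps(params, indent=2))
```

Redirecting the output to params.json yields a file in the same shape as the hand-written one. Because the file then contains live secrets, it should be kept out of version control.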

3. Deploy the Stack

aws cloudformation create-stack \
  --stack-name scp-production \
  --template-body file://cloudformation/scp-stack.yaml \
  --parameters file://params.json \
  --capabilities CAPABILITY_NAMED_IAM \
  --region us-east-1

4. Wait for Completion

aws cloudformation wait stack-create-complete \
  --stack-name scp-production \
  --region us-east-1

# Get outputs
aws cloudformation describe-stacks \
  --stack-name scp-production \
  --query 'Stacks[0].Outputs' \
  --region us-east-1
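
The outputs are easier to reuse in later steps as a key/value mapping. The sketch below assumes the full describe-stacks JSON response (that is, the command run without --query). The sample data is hypothetical: `InstanceID` is the key used later in this guide, while `ALBDNSName` is an assumed name, so check the template for the real output keys.

```python
import json


def outputs_to_dict(describe_stacks_json: str) -> dict[str, str]:
    """Map OutputKey -> OutputValue for the first stack in the response."""
    stacks = json.loads(describe_stacks_json)["Stacks"]
    return {o["OutputKey"]: o["OutputValue"] for o in stacks[0].get("Outputs", [])}


# Hypothetical sample response; real keys and values come from your stack.
sample = json.dumps({
    "Stacks": [{
        "StackName": "scp-production",
        "Outputs": [
            {"OutputKey": "ALBDNSName",
             "OutputValue": "scp-production-alb-xxx.us-east-1.elb.amazonaws.com"},
            {"OutputKey": "InstanceID", "OutputValue": "i-0123456789abcdef0"},
        ],
    }]
})

outputs = outputs_to_dict(sample)
print(outputs["ALBDNSName"])
```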

5. Configure DNS

Point your domain to the ALB:

  1. Get the ALB DNS name from the stack outputs
  2. Create a CNAME or Alias record in your DNS provider:
     • CNAME: scp.example.com → scp-production-alb-xxx.us-east-1.elb.amazonaws.com
     • Route53 Alias: Use the ALB's hosted zone ID from outputs
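
For the Route53 Alias option, the record can be described as a change batch document. The sketch below builds one; the function name `alias_change_batch` and the zone ID `ZEXAMPLE123` are placeholders, and the ALB DNS name and hosted zone ID should be taken from your own stack outputs.

```python
import json


def alias_change_batch(domain: str, alb_dns_name: str, alb_zone_id: str) -> dict:
    """Build a Route 53 change batch that upserts an A-record alias to the ALB."""
    return {
        "Comment": f"Point {domain} at the SCP load balancer",
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": domain,
                "Type": "A",
                "AliasTarget": {
                    # This is the ALB's hosted zone ID, not the zone of your domain.
                    "HostedZoneId": alb_zone_id,
                    "DNSName": alb_dns_name,
                    "EvaluateTargetHealth": False,
                },
            },
        }],
    }


batch = alias_change_batch(
    "scp.example.com",
    "scp-production-alb-xxx.us-east-1.elb.amazonaws.com",
    "ZEXAMPLE123",  # placeholder value
)
print(json.dumps(batch, indent=2))
```

Saved to a file such as change.json, the document can be applied with `aws route53 change-resource-record-sets --hosted-zone-id <your-zone-id> --change-batch file://change.json`, where `<your-zone-id>` is the hosted zone of your domain.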

6. Verify Deployment

# Health check
curl https://scp.example.com/health

# Expected response:
# {"status": "healthy", "service": "api", "version": "0.3.0"}
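
When the check runs from a script, it helps to compare the response against the expected fields rather than eyeballing it. A minimal sketch, assuming the response body has the shape shown above; the helper name `is_healthy` is hypothetical.

```python
import json


def is_healthy(body: str, expected_version: str = "0.3.0") -> bool:
    """True if the /health body reports a healthy service at the expected version."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # not JSON at all, e.g. an HTML error page from the ALB
    if not isinstance(payload, dict):
        return False
    return (payload.get("status") == "healthy"
            and payload.get("version") == expected_version)


print(is_healthy('{"status": "healthy", "service": "api", "version": "0.3.0"}'))  # True
```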

Architecture Overview

The CloudFormation template creates:

┌─────────────────────────────────────────────────────────────┐
│                         VPC                                  │
│  ┌─────────────────┐           ┌─────────────────┐         │
│  │  Public Subnet  │           │  Public Subnet  │         │
│  │   (us-east-1a)  │           │   (us-east-1b)  │         │
│  │                 │           │                 │         │
│  │  ┌───────────┐  │           │                 │         │
│  │  │    ALB    │  │           │                 │         │
│  │  └─────┬─────┘  │           │                 │         │
│  │        │        │           │                 │         │
│  │  ┌─────┴─────┐  │           │                 │         │
│  │  │    NAT    │  │           │                 │         │
│  │  └───────────┘  │           │                 │         │
│  └─────────────────┘           └─────────────────┘         │
│                                                             │
│  ┌─────────────────┐           ┌─────────────────┐         │
│  │ Private Subnet  │           │ Private Subnet  │         │
│  │   (us-east-1a)  │           │   (us-east-1b)  │         │
│  │                 │           │                 │         │
│  │  ┌───────────┐  │           │                 │         │
│  │  │    EC2    │  │           │                 │         │
│  │  │   (SCP)   │  │           │                 │         │
│  │  └───────────┘  │           │                 │         │
│  └─────────────────┘           └─────────────────┘         │
└─────────────────────────────────────────────────────────────┘

Service Ports

Service          Port  Protocol  Description
API Server       8000  HTTP      Bundle Registry, Agent Registry
Rules Engine     8001  HTTP      Business rules management
Context Service  8002  gRPC      Agent context streaming
Context Service  8003  HTTP      Webhook endpoint
Telemetry        8004  HTTP      Usage reporting

Security Configuration

Security Groups

The template creates two security groups:

  1. ALB Security Group: Allows inbound 80 and 443 from anywhere
  2. EC2 Security Group: Allows inbound from ALB only, plus SSH from within VPC

Secrets Management

Secrets are stored in SSM Parameter Store:

  • /scp/production/db-password (SecureString)
  • /scp/production/jwt-secret (SecureString)
  • /scp/production/deployment-id (String)

The EC2 instance has an IAM role that can read these parameters.
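
To confirm that the role can actually read a parameter, the AWS CLI's `ssm get-parameter` command can be run on the instance. The sketch below wraps that call; the helper names are hypothetical, and the region default assumes the us-east-1 deployment used throughout this guide.

```python
import subprocess


def get_parameter_command(name: str, region: str = "us-east-1") -> list[str]:
    """Build the AWS CLI call that returns one decrypted parameter value."""
    return [
        "aws", "ssm", "get-parameter",
        "--name", name,
        "--with-decryption",          # needed for SecureString parameters
        "--query", "Parameter.Value",
        "--output", "text",
        "--region", region,
    ]


def read_secret(name: str) -> str:
    """Run the CLI and return the value (requires AWS credentials or an instance role)."""
    result = subprocess.run(
        get_parameter_command(name), check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


# Show the command without executing it.
print(" ".join(get_parameter_command("/scp/production/jwt-secret")))
```

Only `get_parameter_command` runs here; `read_secret` is meant to be called on the instance, and its return value should not be logged.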

Encryption

  • EBS volumes are encrypted at rest
  • S3 buckets use AES-256 encryption
  • ALB terminates TLS with ACM certificate

Monitoring

CloudWatch Alarms

The template creates two alarms:

  1. High CPU: Triggers when CPU > 80% for 5 minutes
  2. Health Check: Triggers when ALB health checks fail

Logs

Logs are sent to CloudWatch Logs:

  • /scp/api - API Server logs
  • /scp/rules - Rules Engine logs
  • /scp/context - Context Service logs
  • /scp/telemetry - Telemetry Service logs
  • /scp/system - System logs

Troubleshooting

SSH Access

# Get instance ID
INSTANCE_ID=$(aws cloudformation describe-stacks \
  --stack-name scp-production \
  --query 'Stacks[0].Outputs[?OutputKey==`InstanceID`].OutputValue' \
  --output text \
  --region us-east-1)

# Connect via SSM (recommended - no SSH key needed)
aws ssm start-session --target "$INSTANCE_ID" --region us-east-1

# Or via bastion (requires SSH key)
ssh -i your-key.pem ubuntu@<bastion-ip>
ssh ubuntu@<private-ip>

Service Logs

# On the instance
cd /opt/scp
docker compose logs -f api
docker compose logs -f rules
docker compose logs -f context
docker compose logs -f telemetry

Health Checks

# Check each service
curl http://localhost:8000/health  # API
curl http://localhost:8001/health  # Rules
curl http://localhost:8003/health  # Context
curl http://localhost:8004/health  # Telemetry
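
The same four checks can be scripted so that a failing service stands out. This is a minimal sketch using only the standard library; the names `HEALTH_PORTS`, `health_url`, and `check` are hypothetical, and the ports are the ones used in the curl commands above.

```python
import http.client
import urllib.error
import urllib.request

# Context is checked on 8003 (HTTP); port 8002 is gRPC and has no HTTP health route here.
HEALTH_PORTS = {"api": 8000, "rules": 8001, "context": 8003, "telemetry": 8004}


def health_url(service: str, host: str = "localhost") -> str:
    """Return the health endpoint URL for a named service."""
    return f"http://{host}:{HEALTH_PORTS[service]}/health"


def check(service: str, timeout: float = 3.0) -> bool:
    """Return True if the service answers its health endpoint with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(service), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, malformed reply, or a non-2xx status.
        return False
```

Calling `check(name)` for each key in `HEALTH_PORTS` on the instance reproduces the curl commands and returns a boolean per service.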

Common Issues

Services not starting:

# Check container status
docker compose ps

# Check logs for errors
docker compose logs --tail=100

Database connection issues:

# Verify PostgreSQL is healthy
docker compose exec postgres pg_isready -U postgres

ALB health checks failing:

  1. Verify security groups allow ALB → EC2
  2. Check that services are running on correct ports
  3. Review service logs for errors

Costs

Estimated monthly costs (us-east-1, production):

Resource      Specification          Est. Cost
EC2           t3.large (on-demand)   ~$60
ALB           + data processing      ~$20
NAT Gateway   + data processing      ~$35
EBS           100GB gp3              ~$8
S3            Backups (~10GB)        ~$0.25
CloudWatch    Logs + metrics         ~$5
Total                                ~$130/month

Tips to reduce costs:

  • Use Reserved Instances for EC2 (up to 72% savings)
  • Use Spot Instances for non-production
  • Remove NAT Gateway if instances don't need outbound internet
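
The arithmetic behind the estimate can be reproduced with a few lines, which is useful when adjusting line items for your own workload. The figures below are the estimates listed above. The Reserved Instance calculation is a hypothetical best case that applies the full 72% discount to EC2 only.

```python
MONTHLY_COSTS_USD = {
    "EC2 t3.large (on-demand)": 60.00,
    "ALB + data processing": 20.00,
    "NAT Gateway + data processing": 35.00,
    "EBS 100GB gp3": 8.00,
    "S3 backups (~10GB)": 0.25,
    "CloudWatch logs + metrics": 5.00,
}

total = sum(MONTHLY_COSTS_USD.values())
print(f"Line items sum to ${total:.2f}/month")  # Line items sum to $128.25/month

# Best case only: the actual discount depends on term and payment option.
ec2_reserved = MONTHLY_COSTS_USD["EC2 t3.large (on-demand)"] * (1 - 0.72)
print(f"EC2 at the maximum RI discount: ${ec2_reserved:.2f}/month")
```

The line items sum to $128.25, which the guide rounds to roughly $130 per month.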

Next Steps

After deployment:

  1. Register your first agent: See control-plane/docs/pilot/ONBOARDING.md in the repository, or use the Getting Started guide
  2. Load bundles: Use the load_bundle.py script
  3. Set up backups: Verify daily backup cron is running
  4. Configure alerts: Add SNS topics for CloudWatch alarms