SCP Update Guide

This guide covers updating SCP to new versions.

Overview

SCP uses a self-service update model:

  • Customers control when updates happen
  • Updates include automatic rollback on failure
  • Pre-update backups are created by default
  • Zero downtime is the goal; in practice, expect a brief service restart

Update Process

1. Check Current Version

# Via API (-s suppresses curl's progress meter when piping to jq)
curl -s http://localhost:8000/health | jq .version

# Via telemetry service
curl -s http://localhost:8004/health | jq .version

# Via Docker
docker compose exec api python -c "from shared.config.settings import get_settings; print(get_settings().scp_version)"

2. Check for Updates

# Check if update available
/opt/scp/update.sh --check

# Or check release page
# https://github.com/tim-mccrimmon/supervisory-control-plane/releases

3. Review Release Notes

Before updating, review the release notes for:

  • Breaking changes
  • New features
  • Bug fixes
  • Migration requirements

4. Create Backup

# Backup is automatic, but you can force one
/opt/scp/backup.sh

5. Perform Update

# Update to specific version
/opt/scp/update.sh 0.3.1

# Skip backup (not recommended)
/opt/scp/update.sh 0.3.1 --no-backup

6. Verify Update

# Check version
curl -s http://localhost:8000/health | jq .version

# Check all services
docker compose ps

# Run health checks
curl http://localhost:8000/health
curl http://localhost:8001/health
curl http://localhost:8003/health
curl http://localhost:8004/health
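
The four health checks above can be wrapped in a small loop so that a single command reports which endpoint failed. The following is a minimal sketch in POSIX sh, not part of the SCP tooling: the helper names probe and check_services are hypothetical, and the port list is simply copied from the commands above.

```shell
# Sketch: probe each SCP health endpoint once and report failures.
# probe() and check_services() are hypothetical helper names.

SCP_PORTS="8000 8001 8003 8004"   # ports taken from the commands above

# Return 0 if the endpoint answers without an HTTP error within 5 seconds.
probe() {
    curl -fsS --max-time 5 "$1" >/dev/null 2>&1
}

# Probe every port, print one line per endpoint,
# and return non-zero if any endpoint failed.
check_services() {
    failed=0
    for port in $SCP_PORTS; do
        url="http://localhost:${port}/health"
        if probe "$url"; then
            printf 'OK   %s\n' "$url"
        else
            printf 'FAIL %s\n' "$url"
            failed=1
        fi
    done
    return "$failed"
}
```

After sourcing these functions, running check_services prints one status line per endpoint; its exit status can gate a follow-up step such as a rollback.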

Update Script Details

The update script (/opt/scp/update.sh) performs the following steps:

  1. Pre-flight checks
     • Validates version argument
     • Checks if already on target version
  2. Backup (unless --no-backup)
     • Creates full backup to local storage and S3
  3. Pull new images
     • Downloads from GHCR
     • Fails fast if image not found
  4. Update configuration
     • Updates SCP_IMAGE in .env
     • Updates SCP_VERSION in .env
     • Backs up old .env
  5. Rolling update
     • Stops services gracefully (30s timeout)
     • Starts services with new images
  6. Health checks
     • Waits for services to start
     • Checks each service endpoint
     • Rolls back on failure
  7. Cleanup
     • Removes old Docker images
     • Removes backup .env
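
To make the pre-flight step concrete, here is a minimal sketch of how such checks could look. It is an illustration only, since the actual contents of update.sh are not shown in this guide. The function names are hypothetical, and the sketch assumes that SCP_VERSION is stored unquoted in .env and that a valid argument is either an exact MAJOR.MINOR.PATCH version or the literal tag latest.

```shell
# Sketch of the pre-flight checks; the real update.sh may differ.
# is_valid_version(), current_version() and preflight() are hypothetical names.

# Accept an exact MAJOR.MINOR.PATCH version (e.g. 0.3.1) or "latest".
is_valid_version() {
    [ "$1" = "latest" ] && return 0
    printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# Print the SCP_VERSION value from the .env file given as $1.
# Assumes an unquoted KEY=value line.
current_version() {
    sed -n 's/^SCP_VERSION=//p' "$1" | head -n 1
}

# Return 0 to proceed, 1 for an invalid argument, 2 if already on target.
preflight() {
    target="$1"
    env_file="$2"
    if ! is_valid_version "$target"; then
        echo "invalid version: $target" >&2
        return 1
    fi
    if [ "$(current_version "$env_file")" = "$target" ]; then
        echo "already on version $target" >&2
        return 2
    fi
    return 0
}
```

A caller could run preflight 0.3.1 /opt/scp/.env and continue to the backup step only when it returns 0.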

Rollback

Automatic Rollback

If health checks fail during an update, the script rolls back automatically:

  1. Restores previous .env
  2. Restarts services with old images
  3. Exits with error
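
The control flow behind these steps can be sketched as a retry loop followed by a rollback branch. This is a hedged illustration rather than the script's actual code: the function names, the retry defaults (12 attempts, 5 seconds apart), and passing the saved .env path as an argument are all assumptions.

```shell
# Sketch of a health gate with rollback. Function names, retry defaults,
# and the backup-file argument are assumptions, not update.sh internals.

HEALTH_ATTEMPTS="${HEALTH_ATTEMPTS:-12}"   # how many times to probe
HEALTH_DELAY="${HEALTH_DELAY:-5}"          # seconds between probes

# Succeed when the API health endpoint answers without an HTTP error.
probe() {
    curl -fsS --max-time 5 http://localhost:8000/health >/dev/null 2>&1
}

# Probe up to $1 times, sleeping $2 seconds between failed attempts.
wait_until_healthy() {
    attempts="$1"
    delay="$2"
    i=1
    while [ "$i" -le "$attempts" ]; do
        if probe; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Restore the saved .env ($2) over the live one ($1), then restart services.
rollback() {
    cp "$2" "$1" && docker compose up -d
}

# Return 0 if services become healthy; otherwise roll back and return 1.
finish_update() {
    if wait_until_healthy "$HEALTH_ATTEMPTS" "$HEALTH_DELAY"; then
        return 0
    fi
    echo "health checks failed, rolling back" >&2
    rollback "$1" "$2"
    return 1
}
```

Bounding the wait with a fixed number of attempts keeps a failed update from hanging indefinitely, and returning non-zero after the rollback lets the caller exit with an error.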

Manual Rollback

If you need to roll back after a successful update:

# Restore from backup
/opt/scp/backup.sh --restore /opt/scp/backups/scp-backup-YYYYMMDD_HHMMSS.tar.gz

# Or manually revert
cp /opt/scp/.env.backup.0.3.0 /opt/scp/.env
docker compose up -d

Rollback to Specific Version

# Update to older version
/opt/scp/update.sh 0.3.0

Version Channels

SCP supports multiple version channels:

Channel   Example   Description
Exact     0.3.1     Specific version
Latest    latest    Latest stable (not recommended)

Recommendation: Always use exact versions in production.
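
One way to enforce this recommendation is a small guard in front of the update command. The sketch below is an assumption-laden example: is_floating_tag and require_pinned are hypothetical helpers, and the environment name passed as the second argument is a convention invented for this illustration, not an SCP setting.

```shell
# Sketch: refuse floating tags when updating production.
# is_floating_tag() and require_pinned() are hypothetical helpers.

# Return 0 if the argument is a floating tag (or empty), 1 otherwise.
is_floating_tag() {
    case "$1" in
        latest|"") return 0 ;;
        *) return 1 ;;
    esac
}

# $1 = requested version, $2 = environment name (defaults to production).
require_pinned() {
    version="$1"
    environment="${2:-production}"
    if [ "$environment" = "production" ] && is_floating_tag "$version"; then
        echo "refusing '$version' in production; pin an exact version such as 0.3.1" >&2
        return 1
    fi
    return 0
}
```

A wrapper could then call require_pinned "$1" production && /opt/scp/update.sh "$1", so that an accidental update to latest stops before any images are pulled.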

Scheduling Updates

Maintenance Windows

Plan updates during low-traffic periods:

  • Check usage patterns in telemetry
  • Notify stakeholders in advance
  • Have rollback plan ready

Update Notifications

The telemetry service receives update notifications:

# Check for update notifications
curl -s http://localhost:8004/health | jq .update_available

Database Migrations

Some updates include database migrations:

Automatic Migrations

Most migrations run automatically via schema files:

  • New tables are created
  • New columns are added
  • Indexes are updated

Manual Migrations

Occasionally, manual intervention is needed:

# Check for migration notes in release
cat /opt/scp/MIGRATION.md

# Run manual migration
docker compose exec postgres psql -U postgres -d control_plane -f /migrations/0.3.1.sql

Troubleshooting Updates

Image Pull Fails

Symptoms: ERROR: Failed to pull image

Solutions:

  1. Check internet connectivity
  2. Verify GHCR is accessible
  3. Check image tag exists: docker pull ghcr.io/yourcompany/scp-services:0.3.1

Services Don't Start

Symptoms: Health checks fail after update

Solutions:

  1. Check logs: docker compose logs --tail=100
  2. Verify .env configuration
  3. Check database connectivity
  4. Roll back if needed

Database Migration Fails

Symptoms: Service errors about missing columns/tables

Solutions:

  1. Check that the migration scripts ran
  2. Run migrations manually
  3. Contact support with error details

Partial Update

Symptoms: Some services on old version

Solutions:

# Force recreate all services
docker compose down
docker compose pull
docker compose up -d
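
Before forcing a full recreate, it can help to confirm which services are actually behind. The sketch below assumes you can produce one "SERVICE IMAGE" pair per line, for example by extracting the service and image fields from docker compose ps --format json. Field names can vary between Compose releases, so treat the input format as an assumption. find_stale_services is a hypothetical helper.

```shell
# Sketch: list services whose image tag differs from the expected version.
# Input on stdin: one "SERVICE IMAGE" pair per line (format is an assumption).
# find_stale_services() is a hypothetical helper name.
find_stale_services() {
    expected="$1"
    while read -r service image; do
        # Skip blank lines.
        [ -n "$service" ] || continue
        # Everything after the last colon is treated as the tag.
        tag="${image##*:}"
        if [ "$tag" != "$expected" ]; then
            printf '%s %s\n' "$service" "$tag"
        fi
    done
}
```

Empty output means every listed service already runs the expected tag. Images referenced without a tag, or by digest, are reported as mismatches by this simple comparison.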

Best Practices

  1. Test in staging first
     • Deploy to staging environment
     • Run full test suite
     • Verify no regressions
  2. Read release notes
     • Check for breaking changes
     • Understand new features
     • Review security updates
  3. Schedule appropriately
     • Avoid peak usage times
     • Coordinate with team
     • Have support available
  4. Monitor after update
     • Watch logs for errors
     • Check metrics for anomalies
     • Verify telemetry is reporting
  5. Document the update
     • Record version change
     • Note any issues
     • Update runbooks if needed

Version Support Policy

Version   Support Status
0.3.x     Current (full support)
0.2.x     Maintenance (security only)
0.1.x     End of life

  • Current: Full support, all updates
  • Maintenance: Security fixes only
  • End of life: No support, upgrade required
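
For scripted fleet checks, the table above can be encoded as a simple lookup. This sketch hard-codes the version lines as listed in this guide, so it needs updating whenever the policy changes. support_status is a hypothetical helper name.

```shell
# Sketch: map an installed version to its support status, per the table above.
# support_status() is a hypothetical helper; the mapping mirrors this guide only.
support_status() {
    case "$1" in
        0.3.*) echo "current" ;;
        0.2.*) echo "maintenance" ;;
        0.1.*) echo "end-of-life" ;;
        *)     echo "unknown" ;;
    esac
}
```

A monitoring job could call support_status with the version reported by the health endpoint and raise an alert for anything other than current.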

Getting Help

If you encounter issues during updates:

  1. Check logs: docker compose logs
  2. Review this guide
  3. Check GitHub Issues
  4. Contact support with:
     • Current version
     • Target version
     • Error messages
     • Logs