SCP Update Guide¶
This guide covers updating SCP to new versions.
Overview¶
SCP uses a self-service update model:
- Customers control when updates happen
- Updates include automatic rollback on failure
- Pre-update backups are created by default
- Zero downtime is the goal; in practice, expect a brief service restart
Update Process¶
1. Check Current Version¶
# Via API (-s hides curl's progress meter when piping to jq)
curl -s http://localhost:8000/health | jq .version
# Via telemetry service
curl -s http://localhost:8004/health | jq .version
# Via Docker
docker compose exec api python -c "from shared.config.settings import get_settings; print(get_settings().scp_version)"
2. Check for Updates¶
# Check if update available
/opt/scp/update.sh --check
# Or check release page
# https://github.com/tim-mccrimmon/supervisory-control-plane/releases
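When comparing a version from the release page against the installed one by hand, dotted versions need a numeric comparison rather than a string comparison (0.10.0 is newer than 0.9.0). A minimal sketch, assuming plain MAJOR.MINOR.PATCH versions as used in this guide; `version_lt` is a hypothetical helper and not part of update.sh:

```shell
# version_lt A B: succeed (exit 0) if version A is strictly older than B.
# Assumes both arguments are numeric MAJOR.MINOR.PATCH strings such as 0.3.1.
version_lt() {
  local a_major a_minor a_patch b_major b_minor b_patch
  IFS=. read -r a_major a_minor a_patch <<EOF
$1
EOF
  IFS=. read -r b_major b_minor b_patch <<EOF
$2
EOF
  # Compare field by field; the first field that differs decides the result.
  if [ "$a_major" -ne "$b_major" ]; then
    [ "$a_major" -lt "$b_major" ]
    return $?
  fi
  if [ "$a_minor" -ne "$b_minor" ]; then
    [ "$a_minor" -lt "$b_minor" ]
    return $?
  fi
  [ "$a_patch" -lt "$b_patch" ]
}

# Usage (version numbers are illustrative):
#   if version_lt 0.3.0 0.3.1; then echo "update available"; fi
```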
3. Review Release Notes¶
Before updating, review the release notes:
- Breaking changes
- New features
- Bug fixes
- Migration requirements
4. Create Backup¶
# Backup is automatic, but you can force one
/opt/scp/backup.sh
5. Perform Update¶
# Update to specific version
/opt/scp/update.sh 0.3.1
# Skip backup (not recommended)
/opt/scp/update.sh 0.3.1 --no-backup
6. Verify Update¶
# Check version
curl -s http://localhost:8000/health | jq .version
# Check all services
docker compose ps
# Run health checks
curl http://localhost:8000/health
curl http://localhost:8001/health
curl http://localhost:8003/health
curl http://localhost:8004/health
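The four health checks above can be wrapped in a loop that prints one line per service and returns a non-zero status if any of them is unhealthy. This is a sketch under the assumption that the ports listed above are the ones in use; `probe` and `check_services` are hypothetical helper names.

```shell
# probe URL: succeed if the endpoint answers without an HTTP error
# within 5 seconds (-f makes curl fail on HTTP status >= 400).
probe() {
  curl -fsS --max-time 5 "$1" >/dev/null 2>&1
}

# check_services PORT...: probe /health on each port, print one line per
# service, and return non-zero if any probe failed.
check_services() {
  local port failed=0
  for port in "$@"; do
    if probe "http://localhost:${port}/health"; then
      echo "ok ${port}"
    else
      echo "FAIL ${port}"
      failed=1
    fi
  done
  return "$failed"
}

# Usage (ports taken from the checks above):
#   check_services 8000 8001 8003 8004
```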
Update Script Details¶
The update script (/opt/scp/update.sh) performs:
- Pre-flight checks
  - Validates version argument
  - Checks if already on target version
- Backup (unless --no-backup)
  - Creates full backup to local + S3
- Pull new images
  - Downloads from GHCR
  - Fails fast if image not found
- Update configuration
  - Updates SCP_IMAGE in .env
  - Updates SCP_VERSION in .env
  - Backs up old .env
- Rolling update
  - Stops services gracefully (30s timeout)
  - Starts services with new images
- Health checks
  - Waits for services to start
  - Checks each service endpoint
  - Rolls back on failure
- Cleanup
  - Removes old Docker images
  - Removes backup .env
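The "Update configuration" step can be pictured as a key rewrite in .env with a copy kept for rollback. The following is a sketch of that idea only, not the contents of update.sh; `set_env_var` is a hypothetical helper, and the backup file name in the usage comment follows the `.env.backup.<version>` pattern that appears in the manual rollback example.

```shell
# set_env_var FILE KEY VALUE: set KEY=VALUE in FILE, replacing any existing
# assignment of KEY. The new assignment is written at the end of the file.
set_env_var() {
  local file="$1" key="$2" value="$3" tmp
  tmp=$(mktemp) || return 1
  # Keep every line except an existing KEY=... assignment (grep exits 1 when
  # it prints nothing, which is acceptable here).
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back through the original path so ownership and mode are preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Usage (paths and versions are illustrative):
#   cp /opt/scp/.env /opt/scp/.env.backup.0.3.0
#   set_env_var /opt/scp/.env SCP_VERSION 0.3.1
```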
Rollback¶
Automatic Rollback¶
If health checks fail during an update, the script rolls back automatically:
- Restores previous .env
- Restarts services with old images
- Exits with error
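The health-check-then-rollback behaviour amounts to a bounded retry loop followed by a restore. The sketch below illustrates that pattern; the retry count, the delay, and the name `wait_healthy` are assumptions, and the real update.sh may work differently.

```shell
# wait_healthy ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most
# ATTEMPTS times, sleeping DELAY seconds between tries.
wait_healthy() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage sketch: probe the API for up to 30 tries, 2 seconds apart, and fall
# back to the previous .env if it never becomes healthy.
#   if ! wait_healthy 30 2 curl -fsS http://localhost:8000/health; then
#     cp /opt/scp/.env.backup.0.3.0 /opt/scp/.env
#     docker compose up -d
#     exit 1
#   fi
```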
Manual Rollback¶
If you need to roll back after a successful update:
# Restore from backup
/opt/scp/backup.sh --restore /opt/scp/backups/scp-backup-YYYYMMDD_HHMMSS.tar.gz
# Or manually revert, if the previous .env backup is still present
cp /opt/scp/.env.backup.0.3.0 /opt/scp/.env
docker compose up -d
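Because the archive names embed a YYYYMMDD_HHMMSS timestamp, the newest backup is the last name in sorted order. A sketch, assuming all archives sit in one directory and follow the `scp-backup-*.tar.gz` pattern shown above; `latest_backup` is a hypothetical helper.

```shell
# latest_backup DIR: print the path of the newest scp-backup-*.tar.gz in DIR,
# or return non-zero if there is none.
latest_backup() {
  local dir="$1" f latest=""
  # Glob results are sorted, so the last existing match has the newest stamp.
  for f in "$dir"/scp-backup-*.tar.gz; do
    if [ -e "$f" ]; then
      latest="$f"
    fi
  done
  if [ -z "$latest" ]; then
    return 1
  fi
  printf '%s\n' "$latest"
}

# Usage:
#   /opt/scp/backup.sh --restore "$(latest_backup /opt/scp/backups)"
```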
Rollback to Specific Version¶
# Update to older version
/opt/scp/update.sh 0.3.0
Version Channels¶
SCP supports multiple version channels:
| Channel | Example | Description |
|---|---|---|
| Exact | 0.3.1 | Specific version |
| Latest | latest | Latest stable (not recommended) |
Recommendation: Always use exact versions in production.
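One way to enforce this recommendation in automation is to reject anything that is not a pinned MAJOR.MINOR.PATCH string before calling the update script. This is a minimal sketch; `is_exact_version` is a hypothetical helper, and it assumes exact versions are purely numeric, as in the examples in this guide.

```shell
# is_exact_version V: succeed only if V looks like a pinned version (e.g. 0.3.1).
is_exact_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# Usage:
#   target="0.3.1"
#   if is_exact_version "$target"; then
#     /opt/scp/update.sh "$target"
#   else
#     echo "refusing non-pinned version: $target" >&2
#   fi
```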
Scheduling Updates¶
Maintenance Windows¶
Plan updates during low-traffic periods:
- Check usage patterns in telemetry
- Notify stakeholders in advance
- Have rollback plan ready
Update Notifications¶
The telemetry service receives update notifications:
# Check for update notifications
curl -s http://localhost:8004/health | jq .update_available
Database Migrations¶
Some updates include database migrations.
Automatic Migrations¶
Most migrations run automatically via schema files:
- New tables are created
- New columns are added
- Indexes are updated
Manual Migrations¶
Occasionally, manual intervention is needed:
# Check for migration notes in release
cat /opt/scp/MIGRATION.md
# Run manual migration
docker compose exec postgres psql -U postgres -d control_plane -f /migrations/0.3.1.sql
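By default, psql continues past a failed statement in a script and still exits with status 0, so a half-applied migration can go unnoticed. A hedged variant of the command above adds the standard psql setting `-v ON_ERROR_STOP=1` so the run stops at the first error, plus `-T` so `docker compose exec` works without a terminal. The wrapper name `run_migration` and the `DRY_RUN` switch are illustrative assumptions.

```shell
# run_migration VERSION: apply /migrations/VERSION.sql inside the postgres
# container. Set DRY_RUN=1 to print the command instead of running it.
run_migration() {
  local version="$1"
  # Rebuild the positional parameters as the command to execute.
  set -- docker compose exec -T postgres \
    psql -v ON_ERROR_STOP=1 \
    -U postgres -d control_plane -f "/migrations/${version}.sql"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage:
#   DRY_RUN=1 run_migration 0.3.1   # show what would run
#   run_migration 0.3.1             # apply the migration
```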
Troubleshooting Updates¶
Image Pull Fails¶
Symptoms: ERROR: Failed to pull image
Solutions:
1. Check internet connectivity
2. Verify GHCR is accessible
3. Check image tag exists: docker pull ghcr.io/yourcompany/scp-services:0.3.1
Services Don't Start¶
Symptoms: Health checks fail after update
Solutions:
1. Check logs: docker compose logs --tail=100
2. Verify .env configuration
3. Check database connectivity
4. Roll back if needed
Database Migration Fails¶
Symptoms: Service errors about missing columns/tables
Solutions:
1. Check migration scripts ran
2. Run migrations manually
3. Contact support with error details
Partial Update¶
Symptoms: Some services on old version
Solutions:
# Force recreate all services
docker compose down
docker compose pull
docker compose up -d
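Before recreating everything, it can help to see which containers are still on an old tag. The sketch below filters `NAME IMAGE` pairs such as those printed by `docker ps --format '{{.Names}} {{.Image}}'`. It assumes all SCP services share one image repository; `stale_containers` is a hypothetical helper, and the repository name in the usage line is the placeholder used elsewhere in this guide.

```shell
# stale_containers REPO TAG: read "NAME IMAGE" lines on stdin and print the
# NAME of every container that runs REPO with a tag other than TAG.
stale_containers() {
  local repo="$1" tag="$2" name image
  while read -r name image; do
    case "$image" in
      "${repo}:${tag}") ;;                   # already on the expected tag
      "${repo}:"*) printf '%s\n' "$name" ;;  # same repository, different tag
      *) ;;                                  # unrelated image, ignore
    esac
  done
}

# Usage:
#   docker ps --format '{{.Names}} {{.Image}}' \
#     | stale_containers ghcr.io/yourcompany/scp-services 0.3.1
```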
Best Practices¶
- Test in staging first
  - Deploy to staging environment
  - Run full test suite
  - Verify no regressions
- Read release notes
  - Check for breaking changes
  - Understand new features
  - Review security updates
- Schedule appropriately
  - Avoid peak usage times
  - Coordinate with team
  - Have support available
- Monitor after update
  - Watch logs for errors
  - Check metrics for anomalies
  - Verify telemetry is reporting
- Document the update
  - Record version change
  - Note any issues
  - Update runbooks if needed
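For the "Document the update" practice, one lightweight option is an append-only log with one line per update. This is only a sketch; the helper name `record_update`, the line format, and the log path in the usage comment are assumptions.

```shell
# record_update LOGFILE FROM TO STATUS: append one UTC-timestamped line
# describing a version change.
record_update() {
  local logfile="$1" from="$2" to="$3" status="$4"
  printf '%s from=%s to=%s status=%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$from" "$to" "$status" >> "$logfile"
}

# Usage (log path is illustrative):
#   record_update /opt/scp/update-history.log 0.3.0 0.3.1 ok
```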
Version Support Policy¶
| Version | Support Status |
|---|---|
| 0.3.x | Current (full support) |
| 0.2.x | Maintenance (security only) |
| 0.1.x | End of life |
- Current: Full support, all updates
- Maintenance: Security fixes only
- End of life: No support, upgrade required
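The policy table can be encoded as a small lookup, for example to warn in a script when the installed version is out of support. A sketch that mirrors the table as it stands in this guide; `support_status` is a hypothetical helper and would need updating whenever the policy changes.

```shell
# support_status VERSION: print the support tier for a version string,
# following the table above.
support_status() {
  case "$1" in
    0.3.*) echo "current" ;;
    0.2.*) echo "maintenance" ;;
    0.1.*) echo "end-of-life" ;;
    *)     echo "unknown" ;;
  esac
}

# Usage:
#   support_status 0.2.4
```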
Getting Help¶
If you encounter issues during updates:
- Check logs: docker compose logs
- Review this guide
- Check GitHub Issues
- Contact support with:
  - Current version
  - Target version
  - Error messages
  - Logs
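The items support asks for can be gathered into one directory before opening a ticket. A sketch, assuming it is run from the directory that holds the compose file; `collect_support_info` and the output file names are hypothetical.

```shell
# collect_support_info OUT_DIR CURRENT TARGET: write version details, recent
# logs, and service state into OUT_DIR.
collect_support_info() {
  local out_dir="$1" current="$2" target="$3"
  mkdir -p "$out_dir" || return 1
  printf 'current_version=%s\ntarget_version=%s\n' "$current" "$target" \
    > "$out_dir/versions.txt"
  # Keep going if a command fails; its error message lands in the output file.
  docker compose logs --tail=500 > "$out_dir/compose-logs.txt" 2>&1 || true
  docker compose ps > "$out_dir/compose-ps.txt" 2>&1 || true
}

# Usage (output path and versions are illustrative):
#   collect_support_info /tmp/scp-support 0.3.0 0.3.1
#   tar -czf scp-support.tar.gz -C /tmp scp-support
```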