Are workloads ephemeral and frequently rebuilt (vs long-lived snowflakes)?
Demonstrate that production workloads are designed for frequent replacement from immutable artifacts rather than long-term persistence with in-place modification.
Description
What this control does
Ephemeral workloads are compute instances, containers, or serverless functions that are frequently destroyed and recreated from immutable images or infrastructure-as-code templates rather than maintained as persistent, manually configured systems. This approach prevents configuration drift from accumulating, shrinks the attack surface created by accumulated changes, and ensures every deployment reflects the current approved baseline. Long-lived 'snowflake' servers accumulate undocumented patches, configurations, and vulnerabilities that persist across security events. Ephemeral infrastructure, by contrast, is rebuilt on a defined cadence (daily, per deployment, or per event), so each instance starts from a known-good state.
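The rebuild cadence described above can be checked mechanically against an inventory export. The following is a minimal sketch, not a definitive implementation: the `Workload` record, the `find_stale_workloads` helper, the sample workload names, and the 30-day threshold are all hypothetical and would need to be mapped to the organization's own inventory format and rebuild policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Workload:
    name: str             # hypothetical identifier taken from the inventory export
    created_at: datetime  # timezone-aware creation or last-rebuild timestamp


def find_stale_workloads(workloads, max_age, now=None):
    """Return (name, age) pairs for workloads older than max_age, oldest first."""
    now = now or datetime.now(timezone.utc)
    stale = [
        (w.name, now - w.created_at)
        for w in workloads
        if now - w.created_at > max_age
    ]
    return sorted(stale, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Illustrative data only; a real check would load the inventory export.
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    inventory = [
        Workload("web-7f9c", now - timedelta(hours=6)),
        Workload("batch-01", now - timedelta(days=3)),
        Workload("legacy-db-proxy", now - timedelta(days=412)),
    ]
    # Assumed policy: anything not rebuilt within 30 days is a snowflake candidate.
    for name, age in find_stale_workloads(inventory, timedelta(days=30), now=now):
        print(f"{name}: {age.days} days since last rebuild")
```

A workload flagged this way is only a candidate for review. Some long-lived instances may be legitimately exempt under a documented exception.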
Control objective
What auditing this proves
Demonstrate that production workloads are designed for frequent replacement from immutable artifacts rather than long-term persistence with in-place modification.
Associated risks
Risks this control addresses
- Undetected malware or backdoors persist across incident response efforts when compromised hosts are patched rather than replaced
- Configuration drift accumulates over time as manual changes bypass change control and infrastructure-as-code definitions diverge from reality
- Privilege escalation artifacts and persistence mechanisms remain embedded in the filesystems and Windows registries of long-lived instances
- Forensic investigation is hindered by the lack of a baseline for comparison when systems have evolved over months or years without a rebuild
- Patch management failures compound as dependencies and configurations become fragile on aging systems that resist updates
- Compliance gaps emerge when audit-time configurations differ from deployment-time baselines due to undocumented changes
- Recovery time objectives are missed because bespoke configurations cannot be rapidly reproduced from code or images
Testing procedure
How an auditor verifies this control
- Obtain an inventory of all production compute workloads, including virtual machines, containers, Kubernetes pods, and serverless functions, with creation timestamps and last rebuild dates.
- Review infrastructure-as-code repositories (Terraform, CloudFormation, Kubernetes manifests) to identify which workloads are defined declaratively versus provisioned manually.
- Select a sample of 10-15 workloads spanning critical applications and examine their deployment history to determine typical lifespan before replacement.
- Interview DevOps and platform engineering teams to document rebuild frequency policies, automated replacement schedules, and criteria triggering workload recreation.
- Review CI/CD pipeline configurations to verify automated mechanisms that trigger workload replacement on image updates, configuration changes, or scheduled intervals.
- Examine change management records and access logs for the past 90 days to identify any SSH/RDP sessions or manual configuration changes made to running production instances.
- Test immutability by attempting to modify a running production workload's filesystem or configuration and verify whether such changes persist or trigger alerts and replacement.
- Validate that container images and VM templates are versioned, stored in registries, and include cryptographic signatures or digests ensuring provenance and immutability.
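Part of the final step, confirming that workloads reference images by digest rather than by mutable tag, lends itself to a simple automated check. The sketch below assumes the auditor has already extracted a list of image reference strings from manifests or a registry export. The helper names `is_digest_pinned` and `unpinned_images` and the sample references are hypothetical. It tests only whether a reference is pinned with a `sha256` digest. It does not verify signatures or provenance, which require separate tooling.

```python
import re

# A digest-pinned image reference ends in "@sha256:" followed by 64 hex characters.
_DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """Return True if the image reference is pinned to a sha256 digest."""
    return bool(_DIGEST_RE.search(image_ref.strip()))


def unpinned_images(image_refs):
    """Return the references that rely on a tag (or no tag) instead of a digest."""
    return [ref for ref in image_refs if not is_digest_pinned(ref)]


if __name__ == "__main__":
    # Illustrative references only; real input would come from deployed manifests.
    sample = [
        "registry.example.com/payments/api@sha256:" + "a" * 64,
        "registry.example.com/payments/worker:latest",
        "nginx:1.25",
    ]
    for ref in unpinned_images(sample):
        print(f"mutable tag in use: {ref}")
```

References reported by this check can be rebuilt from different content without any change to the manifest, so they are reasonable candidates for the sampled deployment-history review.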