CM-3 / CM-2 / SA-10 NIST SP 800-53 Rev 5

Are workloads ephemeral and frequently rebuilt (vs long-lived snowflakes)?

Demonstrate that production workloads are designed for frequent replacement from immutable artifacts rather than long-term persistence with in-place modification.

Description

What this control does

Ephemeral workloads are compute instances, containers, or serverless functions that are frequently destroyed and recreated from immutable images or infrastructure-as-code templates rather than maintained as persistent, manually configured systems. This approach limits configuration drift to the lifetime of a single instance, shrinks the attack surface created by accumulated changes, and ensures every deployment reflects the current approved baseline. Long-lived 'snowflake' servers accumulate undocumented patches, configurations, and vulnerabilities that persist across security events, whereas ephemeral infrastructure is rebuilt on a defined cadence (daily, per deployment, or per event), which returns each workload to a known-good state.
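
The rebuild decision described above can be sketched as a small check: a workload is due for replacement when it runs an image other than the approved baseline or when it has outlived the rebuild cadence. This is a minimal illustration, not a reference implementation; the `Workload` fields, the `replacement_reasons` helper, the digests, and the 7-day cadence are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Workload:
    name: str
    image_digest: str     # digest of the image this instance is running
    created_at: datetime  # when this instance was last built


def replacement_reasons(workload, approved_digest, max_age, now):
    """Return the reasons (possibly none) why a workload should be rebuilt."""
    reasons = []
    if workload.image_digest != approved_digest:
        reasons.append("image differs from approved baseline")
    if now - workload.created_at > max_age:
        reasons.append("older than rebuild cadence")
    return reasons


# Hypothetical sample data: one fresh workload and one long-lived snowflake.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
web = Workload("web-1", "sha256:aaa", datetime(2024, 5, 31, tzinfo=timezone.utc))
batch = Workload("batch-1", "sha256:old", datetime(2024, 4, 1, tzinfo=timezone.utc))

for w in (web, batch):
    print(w.name, replacement_reasons(w, "sha256:aaa", timedelta(days=7), now))
```

In practice an orchestrator or scheduled pipeline would act on these reasons by replacing the instance rather than patching it in place.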

Control objective

What auditing this proves

Demonstrate that production workloads are designed for frequent replacement from immutable artifacts rather than long-term persistence with in-place modification.

Associated risks

Risks this control addresses

  • Undetected malware or backdoors persist across incident response efforts when compromised hosts are patched rather than replaced
  • Configuration drift accumulates over time as manual changes bypass change control and infrastructure-as-code definitions diverge from reality
  • Privilege escalation artifacts and persistence mechanisms remain embedded in filesystems and registries of long-lived instances
  • Forensic investigation is hindered by lack of baseline comparison when systems have evolved over months or years without rebuild
  • Patch management failures compound as dependencies and configurations on aging systems become fragile and resist updates
  • Compliance gaps emerge when audit-time configurations differ from deployment-time baselines due to undocumented changes
  • Recovery time objectives are missed because bespoke configurations cannot be rapidly reproduced from code or images

Testing procedure

How an auditor verifies this control

  1. Obtain inventory of all production compute workloads including virtual machines, containers, Kubernetes pods, and serverless functions with creation timestamps and last rebuild dates.
  2. Review infrastructure-as-code repositories (Terraform, CloudFormation, Kubernetes manifests) to identify which workloads are defined declaratively versus provisioned manually.
  3. Select a sample of 10-15 workloads spanning critical applications and examine their deployment history to determine typical lifespan before replacement.
  4. Interview DevOps and platform engineering teams to document rebuild frequency policies, automated replacement schedules, and criteria triggering workload recreation.
  5. Review CI/CD pipeline configurations to verify automated mechanisms that trigger workload replacement on image updates, configuration changes, or scheduled intervals.
  6. Examine change management records and SSH/RDP audit logs for the past 90 days to identify any interactive sessions or manual configuration changes on running production instances.
  7. Test immutability by attempting to modify a running production workload's filesystem or configuration and verify whether such changes persist or trigger alerts and replacement.
  8. Validate that container images and VM templates are versioned, stored in registries, and include cryptographic signatures or digests ensuring provenance and immutability.
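
Part of step 8 can be automated. A mutable tag such as `latest` can be repointed to different content, while a reference pinned by digest is content-addressed. The sketch below flags image references that are not pinned to a `sha256` digest. It assumes the references have already been extracted from manifests as strings; the `unpinned_images` helper and the registry names are hypothetical.

```python
import re

# A digest-pinned reference ends with "@sha256:" followed by 64 hex characters.
DIGEST_PINNED = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_images(image_refs):
    """Return the image references that are not pinned to a sha256 digest."""
    return [ref for ref in image_refs if not DIGEST_PINNED.search(ref)]


# Hypothetical references as they might appear in deployment manifests.
refs = [
    "registry.example.com/app/web@sha256:" + "a" * 64,
    "registry.example.com/app/worker:latest",
    "registry.example.com/app/api:1.4.2",
]
print(unpinned_images(refs))
```

A version tag such as `1.4.2` is still reported here because tags can be overwritten; whether versioned tags are acceptable evidence is a policy decision for the auditor.
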

Evidence required

  • Infrastructure-as-code repository exports with commit history
  • CI/CD pipeline configuration files showing automated deployment triggers
  • Inventory reports with workload creation and rebuild timestamps
  • Container registry or AMI catalog with image versioning and tagging policies
  • Change management tickets and SSH/RDP audit logs for the past 90 days
  • Interview notes documenting rebuild frequency and immutability enforcement
  • Screenshots of orchestration platforms (Kubernetes, ECS, autoscaling groups) showing replacement policies and schedules

Pass criteria

At least 90% of production workloads have been rebuilt from immutable artifacts within the preceding 30 days, no unauthorized manual modifications persist on production instances, and documented policies enforce automated replacement triggered by code changes or scheduled intervals.
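
The quantitative part of the pass criteria can be evaluated directly from the inventory report. The sketch below assumes the inventory has been reduced to a mapping of workload name to last rebuild timestamp; the `meets_rebuild_threshold` helper and the sample data are hypothetical, and treating an empty inventory as a failure is an assumption rather than something this control specifies.

```python
from datetime import datetime, timedelta, timezone


def meets_rebuild_threshold(last_rebuilt, now,
                            max_age=timedelta(days=30), required_ratio=0.90):
    """Return (passed, ratio) for the share of workloads rebuilt within max_age."""
    if not last_rebuilt:
        # Assumption: no inventory means the criterion cannot be demonstrated.
        return False, 0.0
    fresh = sum(1 for ts in last_rebuilt.values() if now - ts <= max_age)
    ratio = fresh / len(last_rebuilt)
    return ratio >= required_ratio, ratio


# Hypothetical inventory: nine recently rebuilt services and one snowflake.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
inventory = {f"svc-{i}": now - timedelta(days=5) for i in range(9)}
inventory["legacy-db"] = now - timedelta(days=120)

passed, ratio = meets_rebuild_threshold(inventory, now)
print(f"{ratio:.0%} rebuilt within 30 days, pass={passed}")
```

This covers only the 90% and 30-day thresholds. The remaining criteria (no persisting unauthorized modifications, documented replacement policies) still depend on the audit log review and interview evidence.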