SC-5 / CP-2 NIST SP 800-53 Rev 5

Capacity planning and autoscaling

Demonstrate that infrastructure capacity planning processes and autoscaling mechanisms are configured, tested, and monitored to maintain availability and performance under variable load conditions while preventing resource exhaustion.

Description

What this control does

Capacity planning and autoscaling controls ensure that systems can dynamically adjust compute, storage, and network resources to meet demand without manual intervention, while preventing resource exhaustion attacks and service degradation. This involves setting predictive baselines for normal resource consumption, configuring automatic scaling policies with upper and lower bounds, and establishing monitoring thresholds that trigger scaling events before performance degrades. The control protects availability during traffic spikes, whether legitimate or malicious, and prevents cost overruns from unbounded scaling.
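
As a rough illustration of a scaling policy with upper and lower bounds, the sketch below derives a desired instance count from observed utilization with a simple proportional rule and then clamps it to the configured limits. The function name, its parameters, and the rule itself are assumptions made for illustration, not a specific platform's algorithm.

```python
import math


def desired_capacity(current: int, observed_util: float, target_util: float,
                     min_instances: int, max_instances: int) -> int:
    """Return a bounded instance count for a proportional scaling rule.

    All names here are hypothetical; real platforms expose their own settings.
    """
    if current < 1 or target_util <= 0:
        raise ValueError("current must be >= 1 and target_util must be > 0")
    if min_instances > max_instances:
        raise ValueError("min_instances must not exceed max_instances")

    # Scale the fleet in proportion to how far utilization is from the target.
    raw = math.ceil(current * observed_util / target_util)

    # The lower bound protects availability; the upper bound caps cost.
    return max(min_instances, min(max_instances, raw))
```

With a 60% target and bounds of 2 to 10 instances, a fleet of 4 running at 90% grows to 6. An extreme spike is held at the cap of 10, which is the bound that limits cost exposure during a resource-exhaustion attack.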

Control objective

What auditing this proves

An audit of this control demonstrates that infrastructure capacity planning processes and autoscaling mechanisms are configured, tested, and monitored to maintain availability and performance under variable load while preventing resource exhaustion.

Associated risks

Risks this control addresses

  • Denial of service attacks overwhelming fixed-capacity resources leading to service outages
  • Resource exhaustion from legitimate traffic spikes causing application crashes or timeouts
  • Cost overruns from improperly configured autoscaling policies that scale without upper limits
  • Performance degradation during scaling events due to slow provisioning or insufficient warm-up time
  • Single points of failure where autoscaling is unavailable or misconfigured for critical components
  • Data loss or state corruption when stateful services scale down without proper session draining
  • Lateral movement opportunities created when emergency capacity is provisioned without security hardening

Testing procedure

How an auditor verifies this control

  1. Obtain and review the organization's capacity planning documentation including baseline resource utilization metrics, growth projections, and scaling thresholds for production systems
  2. Collect autoscaling policy configurations from cloud platforms or orchestration tools, including minimum and maximum instance counts, scaling triggers, cooldown periods, and target utilization metrics
  3. Interview infrastructure teams to understand how capacity forecasts are developed, reviewed, and updated in response to business growth or architectural changes
  4. Select a representative sample of critical services and verify that autoscaling policies are enabled and configured with appropriate thresholds and limits
  5. Review monitoring dashboards and alerting rules to confirm that resource utilization, scaling events, and threshold breaches trigger notifications to operations teams
  6. Examine records of recent scaling events including timestamps, trigger conditions, resource provisioning times, and any incidents or performance impacts during scaling
  7. Request evidence of load testing or chaos engineering exercises that validate autoscaling behavior under simulated traffic spikes or resource exhaustion scenarios
  8. Verify that autoscaling configurations include safeguards such as maximum instance caps, cost alerts, and security baseline enforcement for newly provisioned resources
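
The review of scaling event records in step 6 could be partly automated along the following lines. This is a minimal sketch that assumes the events have already been exported from the monitoring platform into records with a trigger time and an in-service time. The `ScalingEvent` type, its field names, and the five-minute limit in the usage note are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class ScalingEvent:
    """One exported scaling event (hypothetical record shape)."""
    service: str
    triggered_at: datetime   # when the scaling trigger fired
    in_service_at: datetime  # when the new capacity began serving traffic


def slow_events(events: List[ScalingEvent], limit: timedelta) -> List[ScalingEvent]:
    """Return events whose provisioning time exceeded the given limit."""
    return [e for e in events if e.in_service_at - e.triggered_at > limit]
```

An auditor might call `slow_events(events, timedelta(minutes=5))` and compare the flagged events against incident records for the same time windows. The limit should come from the organization's own documented provisioning targets.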

Evidence required

Auditors collect:

  • Infrastructure-as-code templates or cloud console screenshots showing autoscaling group configurations with defined min/max limits and scaling policies
  • Capacity planning spreadsheets or reports with resource trends and projections
  • Monitoring platform exports showing historical scaling events with timestamps and resource metrics
  • Load test reports or incident postmortems demonstrating autoscaling activation and performance during demand spikes

Pass criteria

Autoscaling is configured and enabled for all critical services with documented thresholds, tested under load conditions within the past 12 months, bounded by maximum limits to prevent cost overruns, and monitored with alerts for scaling events and capacity thresholds.
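
The pass criteria can be expressed as a simple checklist over per-service evidence. The sketch below is one possible encoding, not an official scoring method. The `ServiceEvidence` fields are hypothetical, and "within the past 12 months" is approximated as 365 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# Assumption: "12 months" is approximated as 365 days.
TEST_WINDOW = timedelta(days=365)


@dataclass
class ServiceEvidence:
    """Evidence gathered for one critical service (hypothetical fields)."""
    name: str
    autoscaling_enabled: bool
    thresholds_documented: bool
    last_load_test: Optional[date]  # None if no load test is on record
    max_instances: Optional[int]    # None if scaling has no upper bound
    alerts_configured: bool


def failed_criteria(ev: ServiceEvidence, as_of: date) -> List[str]:
    """List the pass criteria this service does not meet as of a given date."""
    failures = []
    if not ev.autoscaling_enabled:
        failures.append("autoscaling not enabled")
    if not ev.thresholds_documented:
        failures.append("thresholds not documented")
    if ev.last_load_test is None or as_of - ev.last_load_test > TEST_WINDOW:
        failures.append("no load test within the past 12 months")
    if ev.max_instances is None:
        failures.append("no maximum limit")
    if not ev.alerts_configured:
        failures.append("no alerts for scaling events or capacity thresholds")
    return failures


def control_passes(services: List[ServiceEvidence], as_of: date) -> bool:
    """The control passes only if every critical service meets every criterion."""
    return all(not failed_criteria(s, as_of) for s in services)
```

Returning the list of failed criteria, rather than a single boolean per service, gives the auditor findings that can be cited directly in the report.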