GOVERN-1.3 / MAP-1.1 / MEASURE-2.1 NIST AI Risk Management Framework

Documented model cards / system cards

Demonstrate that the organization maintains current, comprehensive model cards or system cards for AI/ML systems in production that accurately document capabilities, limitations, training data provenance, performance metrics, intended use, and known failure modes.

Description

What this control does

Model cards and system cards are structured documentation artifacts that describe the capabilities, limitations, training data, performance characteristics, intended use cases, and potential biases of AI/ML systems. Model cards focus on individual models, while system cards encompass broader AI-enabled systems including data pipelines, human-in-the-loop components, and deployment context. These cards provide transparency to operators, auditors, and downstream consumers about what the AI system does, how it was developed, where it performs well or poorly, and what risks it carries. They serve as both operational reference material and accountability documentation for governance purposes.
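As a rough illustration of the fields described above, a model card can be kept as a small structured record that is serialized and stored alongside the model. This is a minimal sketch, not a standard schema: the `ModelCard` class, its field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class ModelCard:
    """Hypothetical model card record; the field names are illustrative."""
    model_name: str
    version: str
    intended_use: list[str]
    training_data: str                      # description and provenance
    performance_metrics: dict[str, float]   # metric name -> value
    known_limitations: list[str]
    known_biases: list[str] = field(default_factory=list)


# Made-up example values, not real measurements.
card = ModelCard(
    model_name="example-fraud-classifier",
    version="1.2.0",
    intended_use=["Flag card transactions for manual review"],
    training_data="Internal transaction records (placeholder description)",
    performance_metrics={"precision": 0.91, "recall": 0.84},
    known_limitations=["Not evaluated on non-card payment types"],
)

# Serialize so the card can be stored and versioned alongside the model.
card_json = json.dumps(asdict(card), indent=2)
print(card_json)
```

Keeping the card in a machine-readable form like this makes it easier to version and to check for completeness, but free-text formats can satisfy the control equally well.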

Control objective

What auditing this proves

Associated risks

Risks this control addresses

Deployment of AI models outside their intended operational design domain, causing incorrect or harmful outputs in unanticipated contexts
Inability to trace unexpected model behavior back to training data characteristics, bias sources, or architectural decisions during incident response
Unauthorized or inappropriate reuse of AI models by downstream teams or external partners who lack visibility into model limitations and contraindications
Failure to communicate known algorithmic bias or fairness concerns to operators, resulting in discriminatory outcomes that violate regulatory or ethical standards
Loss of institutional knowledge when AI system developers leave the organization, rendering models unmaintainable black boxes
Regulatory non-compliance in jurisdictions requiring algorithmic transparency, explainability, or impact assessments for automated decision systems
Model drift or degradation going undetected because baseline performance characteristics and expected behavior were never formally documented

Testing procedure

How an auditor verifies this control

Obtain an inventory of all AI/ML models and AI-enabled systems currently deployed in production or pre-production environments.
Request model cards or system cards for a risk-based sample of systems, prioritizing those used for high-impact decisions (e.g., access control, fraud detection, content moderation, customer-facing automation).
Verify each card includes mandatory elements: model architecture, training data description, performance metrics (accuracy, precision, recall, fairness metrics), intended use cases, known limitations, and out-of-scope applications.
Cross-reference card contents against model training logs, validation reports, and data lineage documentation to confirm accuracy and currency of documented characteristics.
Interview model owners and operational users separately to confirm their understanding of documented limitations matches the card contents and that cards are actually consulted during deployment decisions.
Review version control or change management records to verify cards are updated when models are retrained, fine-tuned, or deployed to new contexts.
Examine incident post-mortems or problem tickets related to AI system failures to determine whether inadequate documentation contributed to the issue or delayed resolution.
Assess accessibility and discoverability of cards by attempting to locate them through internal documentation systems, and verify appropriate stakeholders (security, compliance, product management) have read access.
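Two of the checks above, completeness of mandatory elements and currency after model changes, lend themselves to partial automation. The following is a minimal sketch, assuming each card is available as a dictionary and that the dates of the last card update and last model change are known. The key names, function names, and sample data are hypothetical, and the cross-referencing and interview steps still need manual work.

```python
from datetime import date

# Mandatory elements from the verification step above; key names are assumed.
MANDATORY_FIELDS = (
    "model_architecture",
    "training_data",
    "performance_metrics",
    "intended_use",
    "known_limitations",
    "out_of_scope_uses",
)


def missing_fields(card: dict) -> list[str]:
    """Return mandatory fields that are absent or empty in the card."""
    return [name for name in MANDATORY_FIELDS if not card.get(name)]


def is_stale(card_updated: date, last_model_change: date) -> bool:
    """A card is stale if the model changed after the card was last updated."""
    return card_updated < last_model_change


# Made-up sample card with two gaps: one empty field and one missing field.
sample_card = {
    "model_architecture": "gradient-boosted trees",
    "training_data": "placeholder description",
    "performance_metrics": {"precision": 0.91},
    "intended_use": ["fraud triage"],
    "known_limitations": [],
}

print(missing_fields(sample_card))
print(is_stale(date(2024, 1, 10), date(2024, 3, 1)))
```

A check like this only shows that a field is populated, not that its content is accurate, so it supplements the auditor's review of the sampled cards without replacing it.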

Evidence required

Collect copies of model cards or system cards for sampled AI systems, including version history and approval records.
Obtain screenshots of documentation repositories showing card location and access controls.
Gather training logs, data provenance reports, and performance evaluation results referenced in the cards to substantiate documented claims.
Capture interview notes from model developers and operational users confirming card usage and accuracy.
Collect change management tickets showing card updates tied to model retraining or deployment changes.

Pass criteria

All sampled production AI/ML systems have current, comprehensive model cards or system cards that accurately reflect their capabilities, limitations, training data, performance characteristics, and intended use. The cards are version-controlled, updated when systems change, and accessible to relevant stakeholders, including operators, security teams, and compliance personnel.
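The pass criteria amount to an all-or-nothing test across the sample: every sampled system must satisfy every condition. A minimal sketch of that roll-up follows, assuming the auditor records one finding per sampled system. The `CardFinding` class, its field names, and the example systems are hypothetical, and treating an empty sample as a failure is an assumption of this sketch rather than something the control states.

```python
from dataclasses import dataclass


@dataclass
class CardFinding:
    """Hypothetical per-system audit finding, one flag per pass criterion."""
    system: str
    has_current_card: bool
    accurate: bool
    version_controlled: bool
    updated_on_change: bool
    accessible: bool

    def passes(self) -> bool:
        return all((
            self.has_current_card,
            self.accurate,
            self.version_controlled,
            self.updated_on_change,
            self.accessible,
        ))


def failing_systems(findings: list[CardFinding]) -> list[str]:
    """Names of sampled systems that miss at least one criterion."""
    return [f.system for f in findings if not f.passes()]


def control_passes(findings: list[CardFinding]) -> bool:
    # Assumption: an empty sample proves nothing, so it does not pass.
    return bool(findings) and not failing_systems(findings)


# Made-up findings for two sampled systems.
findings = [
    CardFinding("fraud-model", True, True, True, True, True),
    CardFinding("chat-assistant", True, True, True, False, True),
]
print(control_passes(findings), failing_systems(findings))
```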

Where this control is tested

Audit programs including this control

AI Model Risk & Governance

Whether you train, fine-tune, or just consume models, you need governance: inventory, risk classification, human oversight, and evaluation.