Implement safeguards to prevent security vulnerabilities in outputs from impacting users
Establishing output sanitization and validation procedures before presenting content to users. For example: stripping or encoding HTML, JavaScript, shell syntax, and iframe content; blocking or rewriting unsafe URLs; validating structured outputs (e.g. JSON/YAML/XML) against schema whitelists; and enforcing safe rendering modes (e.g. text-only rendering, Content-Security-Policy (CSP) headers). A sanitization sketch follows this list.
Implementing safety-specific labeling and handling protocols. For example: clearly marking model outputs as untrusted; distinguishing untrusted third-party data from first-party content; and applying security controls appropriate to each content source and risk level. A labeling sketch follows this list.
Maintaining detection and monitoring capabilities. For example: logging sanitization activity and alerting on suspicious content patterns. A logging sketch follows this list.
Detecting advanced output-based attack patterns. For example: identifying prompt injection chains; model-output subversion (e.g. jailbreak tokens); payloads targeting downstream applications (e.g. command-line instructions, SQL queries); and obfuscated exploits designed to bypass basic filters. A detection sketch follows this list.
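The sketches below illustrate how these practices might be implemented; all function names, allowlists, and patterns are hypothetical and not part of the requirement itself. First, a minimal sanitization pass in Python, assuming an HTML-rendering client: it encodes markup rather than stripping it, rewrites URLs that fail a domain allowlist, and validates JSON output keys against a whitelist.

```python
import html
import json
import re
from urllib.parse import urlparse

# Hypothetical allowlists; a real deployment would load these from configuration.
ALLOWED_SCHEMES = {"https"}
ALLOWED_DOMAINS = {"example.com", "docs.example.com"}

URL_PATTERN = re.compile(r"https?://\S+")


def is_safe_url(url: str) -> bool:
    """Allow only https URLs on an approved domain."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and parsed.hostname in ALLOWED_DOMAINS


def sanitize_output(text: str) -> str:
    """Encode HTML/JS so the client renders it as inert text,
    and rewrite any URL that fails the allowlist check."""
    # Encoding (rather than stripping) preserves the model's words while
    # neutralizing <script>, <iframe>, and event-handler payloads.
    encoded = html.escape(text)
    return URL_PATTERN.sub(
        lambda m: m.group(0) if is_safe_url(m.group(0)) else "[link removed]",
        encoded,
    )


def validate_json_output(raw: str, allowed_keys: set[str]) -> dict:
    """Parse structured output and reject keys outside the whitelist."""
    data = json.loads(raw)
    unexpected = set(data) - allowed_keys
    if unexpected:
        raise ValueError(f"unexpected keys in model output: {unexpected}")
    return data
```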
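Next, a sketch of source-based labeling, assuming content can be tagged with its provenance at ingestion; the source categories and handling modes are illustrative.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    FIRST_PARTY = "first_party"    # content our own systems produced
    MODEL_OUTPUT = "model_output"  # generated by the AI system
    THIRD_PARTY = "third_party"    # retrieved from external tools or sites


@dataclass
class LabeledContent:
    """Content carries its provenance so downstream handlers
    can select controls by source and risk level."""
    text: str
    source: Source


def handling_policy(item: LabeledContent) -> str:
    """Map provenance to a handling mode; the mode names are illustrative."""
    if item.source is Source.THIRD_PARTY:
        return "render-text-only"      # strictest: never interpret markup
    if item.source is Source.MODEL_OUTPUT:
        return "sanitize-then-render"  # encode HTML and check URLs first
    return "render"                    # trusted first-party content
```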
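For monitoring, a minimal hook using Python's standard logging module; the suspicious-marker list is a placeholder for real detectors, and the WARNING level stands in for whatever alerting pipeline (e.g. a SIEM forwarder) an organization uses.

```python
import logging

logger = logging.getLogger("output_sanitizer")

# Hypothetical markers; production systems would use richer detectors
# than literal substring matching.
SUSPICIOUS_MARKERS = ("<script", "javascript:", "ignore previous instructions")


def log_and_flag(original: str, sanitized: str) -> None:
    """Record every sanitization pass and emit an alert-level log entry
    when the output contains a known-suspicious pattern."""
    changed = original != sanitized
    logger.info("sanitized output (modified=%s, length=%d)", changed, len(sanitized))
    lowered = original.lower()
    for marker in SUSPICIOUS_MARKERS:
        if marker in lowered:
            logger.warning("suspicious pattern %r in model output", marker)
```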
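Finally, a sketch of advanced pattern detection. The regexes are illustrative stand-ins for real detection rules (which might be mapped to MITRE ATLAS techniques), and the base64 pass shows one way to catch obfuscated payloads that bypass surface-level filters.

```python
import base64
import re

# Illustrative detectors only; real coverage requires far more than a few regexes.
DETECTORS = {
    "prompt_injection": re.compile(
        r"ignore (all )?previous instructions|you are now|system prompt", re.I
    ),
    "sql_payload": re.compile(r"union\s+select|drop\s+table|;\s*--", re.I),
    "shell_payload": re.compile(r"rm\s+-rf|curl\s+\S+\s*\|\s*sh", re.I),
}

# Long base64-looking runs are a common obfuscation layer.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/=]{40,}")


def detect_threats(text: str) -> list[str]:
    """Return the names of any detectors that fire, scanning both the
    raw text and any decoded base64 runs found inside it."""
    candidates = [text]
    for run in BASE64_RUN.findall(text):
        try:
            candidates.append(base64.b64decode(run).decode("utf-8", "ignore"))
        except Exception:
            pass  # not valid base64; skip
    hits = []
    for name, pattern in DETECTORS.items():
        if any(pattern.search(c) for c in candidates):
            hits.append(name)
    return hits
```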
Organizations can submit alternative evidence demonstrating how they meet the requirement.
"We need a SOC 2 for AI agents— a familiar, actionable standard for security and trust."
"Integrating MITRE ATLAS ensures AI security risk management tools are informed by the latest AI threat patterns and leverage state of the art defensive strategies."
"Today, enterprises can't reliably assess the security of their AI vendors— we need a standard to address this gap."
"Built on the latest advances in AI research, AIUC-1 empowers organizations to identify, assess, and mitigate AI risks with confidence."
"AIUC-1 standardizes how AI is adopted. That's powerful."
"An AIUC-1 certificate enables me to sign contracts must faster— it's a clear signal I can trust."