AIUC-1
B005

Implement real-time input filtering

Implement real-time input filtering using automated moderation tools

Keywords: Prompt Injection, Jailbreak, Adversarial Input Protection
Application: Optional
Frequency: Every 12 months
Type: Detective
Crosswalks:
LLM01:25 - Prompt Injection
LLM04:25 - Data and Model Poisoning
LLM10:25 - Unbounded Consumption
AML-M0015: Adversarial Input Detection
AML-M0021: Generative AI Guidelines
MEASURE 2.7: Security and resilience

Control activities

Integrating automated moderation tools to scan user inputs for violations of content policies covering categories such as violence, hate, or self-harm. For example, integrating OpenAI’s Moderation API, configuring Claude for content moderation, implementing moderation tools from vendors such as VirtueAI, Hive, or Spectrum Labs, developing custom filters, or combining these approaches.
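
As an illustration of the "custom filters" option, the following is a minimal sketch of an input scanner. The category names, the regular expressions, and the `scan_input` and `ScanResult` names are assumptions made for this example. A production filter would more likely call a vendor moderation API or a trained classifier than rely on keyword patterns.

```python
import re
from dataclasses import dataclass

# Hypothetical category patterns, for illustration only. Keyword matching is
# easy to evade; it stands in here for a vendor API or classifier.
CATEGORY_PATTERNS = {
    "violence": re.compile(r"\b(kill|stab|shoot)\b", re.IGNORECASE),
    "self_harm": re.compile(r"\b(hurt myself|end my life)\b", re.IGNORECASE),
    "prompt_injection": re.compile(
        r"ignore (all )?(previous|prior) instructions", re.IGNORECASE
    ),
}


@dataclass
class ScanResult:
    flagged: bool
    categories: list[str]


def scan_input(text: str) -> ScanResult:
    """Return the policy categories whose pattern matches the input."""
    hits = [
        name
        for name, pattern in CATEGORY_PATTERNS.items()
        if pattern.search(text)
    ]
    return ScanResult(flagged=bool(hits), categories=hits)
```

Returning the matched categories, and not only a yes/no flag, lets later steps apply different handling per category.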

Blocking, redirecting, or modifying flagged inputs before they reach the foundation model.
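
One way to make sure a flagged input is handled before it reaches the foundation model is to route every request through a single gate function. The sketch below assumes four hypothetical action names (`block`, `redirect`, `modify`, `allow`), a hypothetical redaction pattern, and caller-supplied `call_model` and `safe_handler` functions. None of these names come from the standard.

```python
import re
from typing import Callable, Optional

# Hypothetical pattern for a span to redact when the action is "modify".
CARD_LIKE_NUMBER = re.compile(r"\b\d{16}\b")


def gate_input(
    text: str,
    action: str,
    call_model: Callable[[str], str],
    safe_handler: Callable[[str], str],
) -> Optional[str]:
    """Apply the chosen action before the input can reach the model."""
    if action == "block":
        return None  # the model is never called
    if action == "redirect":
        # e.g. a canned response or a human review queue
        return safe_handler(text)
    if action == "modify":
        # strip the flagged span, then forward the rest
        return call_model(CARD_LIKE_NUMBER.sub("[REDACTED]", text))
    return call_model(text)  # "allow": pass the input through unchanged
```

Because the model is only reachable through `call_model` inside the gate, a blocked or redirected input cannot reach it by accident.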

Establishing confidence thresholds or rules for when to block, warn, log, or allow inputs based on risk category and severity.
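
Threshold rules of this kind can be expressed as a small policy table. In the sketch below, every number, category name, and function name is a hypothetical placeholder. Real values would come from the organization's own risk assessment and filter evaluation.

```python
# Hypothetical thresholds: (minimum score, action), listed strictest first.
POLICY = {
    "self_harm": [(0.50, "block"), (0.20, "warn")],
    "violence": [(0.80, "block"), (0.50, "warn"), (0.30, "log")],
    "hate": [(0.70, "block"), (0.40, "log")],
}
# Fallback rules for categories not listed in POLICY.
DEFAULT_RULES = [(0.90, "block"), (0.60, "log")]
SEVERITY = {"allow": 0, "log": 1, "warn": 2, "block": 3}


def action_for(category: str, score: float) -> str:
    """Return the first action whose minimum score is met, else 'allow'."""
    for minimum, action in POLICY.get(category, DEFAULT_RULES):
        if score >= minimum:
            return action
    return "allow"


def decide(scores: dict[str, float]) -> str:
    """Return the most severe action across all category scores."""
    actions = [action_for(c, s) for c, s in scores.items()]
    return max(actions, key=SEVERITY.__getitem__, default="allow")
```

For example, `decide({"violence": 0.35, "self_harm": 0.25})` yields `"warn"`: the violence score only triggers logging, but the lower self-harm threshold escalates the overall decision.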

Documenting the moderation logic and thresholds used, including rationale for chosen tool(s).

Providing feedback to users when inputs are blocked.
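
A blocked request can return a structured response that tells the user what happened. The sketch below is one possible shape. The message wording, the category keys, and the `reference_id` field are all assumptions for illustration.

```python
# Hypothetical user-facing messages keyed by policy category.
USER_MESSAGES = {
    "self_harm": (
        "We can't process this message. If you are in distress, "
        "please reach out to a local support line."
    ),
    "prompt_injection": (
        "This request can't be processed because it conflicts "
        "with our usage policy."
    ),
}
DEFAULT_MESSAGE = (
    "Your message was blocked by our content policy. "
    "Please rephrase and try again."
)


def blocked_response(category: str, reference_id: str) -> dict:
    """Build the response returned to the user when an input is blocked."""
    return {
        "status": "blocked",
        "message": USER_MESSAGES.get(category, DEFAULT_MESSAGE),
        # lets the user quote the incident to support staff
        "reference_id": reference_id,
    }
```
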

Logging flagged prompts for analysis and refinement of filters, while ensuring compliance with privacy obligations. For example, excluding identifying metadata, applying retention limits, and documenting user-facing disclosures or consent mechanisms if required.
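
The two technical safeguards named above, excluding identifying metadata and applying retention limits, might look like the following sketch. The 30-day retention period and the field names in `IDENTIFYING_FIELDS` are hypothetical. Note that the prompt text itself may still contain personal data, which this sketch does not address.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # hypothetical retention limit
# Hypothetical names of metadata fields that identify a user.
IDENTIFYING_FIELDS = {"user_id", "email", "ip_address"}


def build_log_record(
    prompt: str, categories: list[str], metadata: dict, now: datetime
) -> dict:
    """Build a log entry without identifying metadata, with an expiry time."""
    kept = {k: v for k, v in metadata.items() if k not in IDENTIFYING_FIELDS}
    return {
        "prompt": prompt,
        "categories": categories,
        "metadata": kept,
        "logged_at": now.isoformat(),
        "expires_at": (now + RETENTION).isoformat(),
    }


def purge_expired(records: list[dict], now: datetime) -> list[dict]:
    """Keep only the records whose retention period has not yet ended."""
    return [
        r for r in records
        if datetime.fromisoformat(r["expires_at"]) > now
    ]
```

Storing `expires_at` on each record means a scheduled job can enforce the retention limit without knowing when the policy was last changed.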

Periodically evaluating filter performance, for example accuracy, latency, and false positive and false negative rates, and adjusting thresholds accordingly.
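
A periodic evaluation can be run against a labeled sample of inputs. The sketch below assumes a hypothetical `flag` function that returns `True` when the filter would flag a text, and a list of `(text, should_flag)` pairs as ground truth. The latency figure is a rough mean that includes loop overhead.

```python
import time
from typing import Callable


def evaluate_filter(
    flag: Callable[[str], bool], labeled: list[tuple[str, bool]]
) -> dict:
    """Compare filter decisions with labels (True = should be flagged)."""
    tp = fp = tn = fn = 0
    start = time.perf_counter()
    for text, should_flag in labeled:
        flagged = flag(text)
        if flagged and should_flag:
            tp += 1  # correctly flagged
        elif flagged and not should_flag:
            fp += 1  # benign input flagged (false positive)
        elif not flagged and should_flag:
            fn += 1  # harmful input missed (false negative)
        else:
            tn += 1  # correctly allowed
    elapsed = time.perf_counter() - start
    total = len(labeled)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        "mean_latency_ms": 1000 * elapsed / total if total else 0.0,
    }
```

Tracking false positives and false negatives separately matters because raising a threshold typically trades one for the other.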

Organizations can submit alternative evidence demonstrating how they meet the requirement.

AIUC-1 is built with industry leaders

"We need a SOC 2 for AI agents— a familiar, actionable standard for security and trust."

Google Cloud
Phil Venables
Former CISO of Google Cloud

"Integrating MITRE ATLAS ensures AI security risk management tools are informed by the latest AI threat patterns and leverage state of the art defensive strategies."

MITRE
Dr. Christina Liaghati
MITRE ATLAS lead

"Today, enterprises can't reliably assess the security of their AI vendors— we need a standard to address this gap."

Cisco
Hyrum Anderson
Senior Director, Security & AI

"Built on the latest advances in AI research, AIUC-1 empowers organizations to identify, assess, and mitigate AI risks with confidence."

Stanford
Prof. Sanmi Koyejo
Lead for Stanford Trustworthy AI Research

"AIUC-1 standardizes how AI is adopted. That's powerful."

Orrick
John Bautista
Partner at Orrick and creator of the YC SAFE

"An AIUC-1 certificate enables me to sign contracts much faster; it's a clear signal I can trust."

SecurityPal
Lena Smart
Head of Trust for SecurityPal and former CISO of MongoDB
© 2025 Artificial Intelligence Underwriting Company. All rights reserved.