Appoint expert third parties, at least every 3 months, to evaluate tool calls in AI systems for failure modes such as executing unauthorized actions, accessing restricted information, or making decisions beyond their intended scope.
Appointing qualified third-party assessors. For example, selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. For example, performing assessments of tool calls at least every quarter and defining testing scope and methodologies based on risk classifications; a minimal sketch of one such check appears after this list.
Maintaining documentation. For example, recording third-party qualifications, testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
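As one illustration of the quarterly tool-call testing described above, the sketch below checks logged tool calls against a scope policy and flags unauthorized actions, access to restricted resources, and tools outside the agent's intended scope. This is a hedged, minimal example under assumed names (ToolCallRecord, ScopePolicy, and audit_tool_calls are hypothetical and not defined by AIUC-1); a real assessment would use the organization's own tool-call logs, risk classifications, and policy definitions.

```python
# Minimal sketch of an automated tool-call audit an assessor might run.
# All names here are illustrative assumptions, not part of the AIUC-1 standard.
from dataclasses import dataclass, field


@dataclass
class ToolCallRecord:
    """One logged tool call made by an AI agent."""
    tool: str
    action: str
    resources: list[str] = field(default_factory=list)


@dataclass
class ScopePolicy:
    """Authorized actions per tool, plus resources the agent may not touch."""
    allowed_actions: dict[str, set[str]]
    restricted_resources: set[str]

    def findings(self, call: ToolCallRecord) -> list[str]:
        """Return a list of issues for one tool call (empty if compliant)."""
        issues: list[str] = []
        allowed = self.allowed_actions.get(call.tool)
        if allowed is None:
            issues.append(f"tool '{call.tool}' is outside the agent's intended scope")
        elif call.action not in allowed:
            issues.append(f"unauthorized action '{call.action}' on tool '{call.tool}'")
        for res in call.resources:
            if res in self.restricted_resources:
                issues.append(f"access to restricted resource '{res}'")
        return issues


def audit_tool_calls(calls: list[ToolCallRecord], policy: ScopePolicy) -> dict[int, list[str]]:
    """Return findings keyed by call index, for inclusion in assessment records."""
    return {i: f for i, call in enumerate(calls) if (f := policy.findings(call))}


if __name__ == "__main__":
    policy = ScopePolicy(
        allowed_actions={"crm": {"read_contact"}, "email": {"send_draft"}},
        restricted_resources={"payroll_db"},
    )
    calls = [
        ToolCallRecord("crm", "delete_contact", ["customer_42"]),
        ToolCallRecord("hr", "read_record", ["payroll_db"]),
    ]
    for idx, findings in audit_tool_calls(calls, policy).items():
        print(f"call {idx}: " + "; ".join(findings))
```

In practice, an assessor would run checks like this over sampled production logs for each quarterly assessment and attach the flagged findings, scope definitions, and remediation actions to the documentation described above.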
Organizations can submit alternative evidence demonstrating how they meet the requirement.
"We need a SOC 2 for AI agents— a familiar, actionable standard for security and trust."
"Integrating MITRE ATLAS ensures AI security risk management tools are informed by the latest AI threat patterns and leverage state of the art defensive strategies."
"Today, enterprises can't reliably assess the security of their AI vendors— we need a standard to address this gap."
"Built on the latest advances in AI research, AIUC-1 empowers organizations to identify, assess, and mitigate AI risks with confidence."
"AIUC-1 standardizes how AI is adopted. That's powerful."
"An AIUC-1 certificate enables me to sign contracts must faster— it's a clear signal I can trust."