What does Article 9 of the EU AI Act require?

Article 9 requires high-risk AI systems to have a continuous risk-management system covering identification, evaluation, and mitigation of risks across the entire lifecycle. Crucially, it's a runtime obligation — the regulator can ask what risks materialised during operation.

Is my LLM agent in scope for Article 9?

Article 9 applies to high-risk AI systems as defined in Article 6 and Annex III. Most agentic workflows building real business value end up in high-risk territory: credit scoring, employment screening, public-service eligibility, critical infrastructure, healthcare diagnostics.

When does the GPAI Code of Practice become enforceable?

2 August 2026. From that date the AI Office can request evidence from any provider or deployer of a GPAI-derived agent in the EU market. Most agents using a frontier LLM are GPAI-derived.

Cornerstone

EU AI Act Article 9 in code: how to evidence risk management for ADK agents

Article 9 risk management isn't a PDF — it's a continuous runtime obligation. Here's how to evidence it for a Google ADK agent, mapped to specific BasePlugin callbacks and audit envelope fields.

By Dipankar Sarkar · June 1, 2026 · 4 min read

eu-ai-act, article-9, adk, risk-management, audit, gpai

EU AI Act Article 9 is the operational beating heart of the regulation for high-risk AI systems. It requires a continuous risk-management system — identification, evaluation, and mitigation of risks across the lifecycle, with the kicker that this is a runtime obligation. The AI Office can ask: what risks materialised during operation last quarter? Show me the evidence.

For agentic AI on Google ADK, the practical translation is six runtime decisions per tool dispatch, evidenced in your audit trail. This article walks through what those decisions are, where they attach to ADK’s callback SPI, and what the resulting audit envelope needs to contain.

The six Article 9 runtime decisions

Article 9(2)(a) to (d) sets out four steps, and Article 9(5) to (8) adds design, testing, and training requirements. Grouped together they give the seven activities below. Six of them produce or feed runtime evidence; the seventh (training) is a human process outside the agent.

Identification and analysis of known and reasonably foreseeable risks. For LLM agents this is the model-risk classification: every model invocation tagged with the risk category your governance team identified at design time.
Estimation and evaluation of risks under use and misuse. Outcomes monitoring. Every tool dispatch outcome lands in the audit chain; aggregated drift is detected by the eval harness.
Evaluation of other risks possibly arising from analysis of post-market monitoring data. Continuous monitoring is the substrate; the audit chain is the post-market monitoring data.
Adoption of appropriate risk-management measures. Policy plugin DENY decisions are the visible risk mitigations. Each citation maps back to the risk it mitigates.
Elimination of risks through design. Out of scope for runtime, but the design choices feed the model-risk tier assignment, which does land in runtime.
Implementation of risk-management measures by means of testing. The eval harness’s outcomes are evidence of the testing in production.
Provision of training to those involved. Out of scope for the agent itself; this is human-process.
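One way to keep that scoping explicit is a small lookup that records, for each activity, where its evidence comes from and whether it feeds the runtime trail. This is a minimal sketch: the enum name, constant names, and evidence labels are assumptions paraphrased from the list above, not part of ADK or Regulus.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical lookup of the seven activities above; all names are illustrative. */
enum Article9Activity {
    RISK_IDENTIFICATION("model-risk classification on each model invocation", true),
    RISK_ESTIMATION("tool dispatch outcomes in the audit chain", true),
    POST_MARKET_EVALUATION("audit chain read as post-market monitoring data", true),
    RISK_MEASURES("policy plugin DENY decisions with citations", true),
    DESIGN_ELIMINATION("model-risk tier assignment fixed at design time", true),
    TESTING("eval harness outcomes", true),
    TRAINING("human process, outside the agent", false);

    private final String evidenceSource;
    private final boolean feedsRuntimeEvidence;

    Article9Activity(String evidenceSource, boolean feedsRuntimeEvidence) {
        this.evidenceSource = evidenceSource;
        this.feedsRuntimeEvidence = feedsRuntimeEvidence;
    }

    String evidenceSource() { return evidenceSource; }

    boolean feedsRuntimeEvidence() { return feedsRuntimeEvidence; }

    /** Activities whose evidence should appear somewhere in the audit trail. */
    static List<Article9Activity> runtimeActivities() {
        return Arrays.stream(values())
            .filter(Article9Activity::feedsRuntimeEvidence)
            .toList();
    }
}
```

A governance team could use a table like this as a checklist: every entry returned by runtimeActivities() should have at least one matching event type in the audit chain.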

Where each one attaches in ADK

The ADK plugin SPI gives you the seams:

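import java.util.List;
import java.util.Optional;

// policy and audit are collaborators injected elsewhere (not shown).
// Verify the exact callback signature against the ADK release you target.
public class RegulusPolicyPlugin extends BasePlugin {
  @Override
  public Optional<PluginContext> beforeToolCallback(
      ToolCallbackContext context) {
    // Evaluate the policy for this tool, its arguments, and the caller.
    PolicyDecision decision = policy.evaluate(
        context.getToolName(),
        context.getArguments(),
        context.getPrincipal()
    );

    if (decision.isDeny()) {
      // Record the denial with clause-level citations before cancelling.
      audit.emit(RegulusEvent.builder()
          .decision(decision)
          .frameworkCitations(List.of(
              "eu-ai-act:Article-9.2.a",  // risk identification
              "eu-ai-act:Article-9.2.d",  // risk-management measure
              "nist-ai-rmf:MANAGE-2.1"
          ))
          .build());
      // Cancel the dispatch, passing the matched clause as the reason.
      return Optional.of(context.cancel(decision.getClause()));
    }
    // An empty result lets the tool call proceed.
    return Optional.empty();
  }
}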

The key piece is the frameworkCitations list. Each citation is a specific clause-level reference that an auditor can trace back to the regulation text. “We logged a denial” is weak evidence. “We denied this tool call, here’s the clause text we matched against, here’s the framework function this maps to in NIST AI RMF, here’s the resolved jurisdiction” is the evidence the AI Office expects.
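Because the citation strings carry the evidential weight, it can help to reject malformed ones before they reach the audit chain. The sketch below parses the framework:control-id shape used in the listing above. The FrameworkCitation type and its parse method are hypothetical helpers for illustration, not a Regulus or ADK API.

```java
/** Hypothetical value type for a "framework:control-id" citation string. */
record FrameworkCitation(String framework, String controlId) {

    /** Splits on the first colon; both sides must be non-empty. */
    static FrameworkCitation parse(String raw) {
        int sep = raw.indexOf(':');
        if (sep <= 0 || sep == raw.length() - 1) {
            throw new IllegalArgumentException(
                "expected framework:control-id, got: " + raw);
        }
        return new FrameworkCitation(raw.substring(0, sep), raw.substring(sep + 1));
    }
}
```

Parsing "eu-ai-act:Article-9.2.a" yields the framework eu-ai-act and the control id Article-9.2.a, while a string with no separator fails fast instead of landing in the chain as an untraceable reference.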

What the audit envelope needs to carry

For Article 9 evidence, each event’s envelope needs at minimum:

ts — ISO 8601 timestamp.
agent — registered agent name (matches your model inventory).
decision — ALLOW, DENY, or REQUIRE_HITL.
clause — verbatim policy clause text that matched.
framework_citations — array of framework:control-id strings.
jurisdiction — resolved jurisdiction code.
principal — sub claim + tenant + purpose + tier.
model_id + model_tier — registered model + tier.
prev_hash + hash — hash chain link.
risk_category — the Article 9 risk category this decision addresses.

The last one is the one most hand-rolled stacks miss. Regulus tags events with the risk category from the active profile’s risk register. Filter by risk_category to produce the per-category mitigation evidence the AI Office expects.
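To make the envelope concrete, here is a minimal sketch of the fields above as Java records, with a SHA-256 hash chain and a risk_category filter. The record names, the all-string field types, the unit-separator canonical form, and the all-zero genesis hash are assumptions made for illustration; the actual Regulus envelope encoding may differ.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

/** Illustrative event body; field names follow the list above, types are assumed. */
record EventBody(String ts, String agent, String decision, String clause,
                 List<String> frameworkCitations, String jurisdiction,
                 String principal, String modelId, String modelTier,
                 String riskCategory) {

    /** Assumed canonical form: fields joined with the ASCII unit separator. */
    String canonical() {
        return String.join("\u001f", ts, agent, decision, clause,
            String.join(",", frameworkCitations), jurisdiction, principal,
            modelId, modelTier, riskCategory);
    }
}

/** One chain link: hash = SHA-256(prevHash + separator + canonical body). */
record AuditEnvelope(EventBody body, String prevHash, String hash) {

    /** Assumed starting value for the first event's prev_hash. */
    static final String GENESIS = "0".repeat(64);

    static AuditEnvelope append(String prevHash, EventBody body) {
        return new AuditEnvelope(body, prevHash, linkHash(prevHash, body));
    }

    static String linkHash(String prevHash, EventBody body) {
        return sha256Hex(prevHash + "\u001f" + body.canonical());
    }

    static String sha256Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    /** Recomputes every link; false if a hash or back-pointer was altered. */
    static boolean verify(List<AuditEnvelope> chain) {
        String expectedPrev = GENESIS;
        for (AuditEnvelope e : chain) {
            if (!e.prevHash().equals(expectedPrev)) return false;
            if (!e.hash().equals(linkHash(e.prevHash(), e.body()))) return false;
            expectedPrev = e.hash();
        }
        return true;
    }

    /** Returns only the events tagged with the given risk category. */
    static List<AuditEnvelope> byRiskCategory(List<AuditEnvelope> chain, String category) {
        return chain.stream()
            .filter(e -> e.body().riskCategory().equals(category))
            .toList();
    }
}
```

The joined-string canonical form keeps the sketch short. A production chain would want a canonical serialization that cannot be confused by separator characters inside field values.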

The 90-day audit walk-through

A realistic Article 9 walk-through at month three of operation looks like this. The supervisory authority gives you 14 days’ notice and asks for:

All DENY events in Q1 2026. Filter the chain. Export as signed envelope.
Aggregation by risk category. regulus audit aggregate --by risk_category --period 2026-Q1. Produces a CSV.
Three specific DENY events expanded. Pick three at random; show the full envelope, the matched clause, the framework citation, the resolved principal, the chain integrity proof.
Outcomes monitoring for the period. Aggregated outcomes tagged with model_id × tier. Drift events if any.
Post-market monitoring narrative. Your governance team’s report referencing the data above.

The first four are runtime artefacts; Regulus produces them. The fifth is the narrative your governance team writes referencing the runtime artefacts.
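For intuition about the second request, the aggregation amounts to counting DENY events per risk category and writing the counts out as CSV. The sketch below shows that logic under stated assumptions: the DecisionEvent record, the RiskCategoryReport class, and the two-column CSV header are hypothetical, and this is not the output format of the regulus CLI.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Minimal stand-in for an audit event; only the fields this sketch needs. */
record DecisionEvent(String ts, String decision, String riskCategory) {}

final class RiskCategoryReport {

    /** Counts DENY events per risk category, sorted by category name. */
    static Map<String, Long> denyCounts(List<DecisionEvent> events) {
        return events.stream()
            .filter(e -> "DENY".equals(e.decision()))
            .collect(Collectors.groupingBy(
                DecisionEvent::riskCategory, TreeMap::new, Collectors.counting()));
    }

    /** Renders the counts as CSV with an assumed two-column header. */
    static String toCsv(Map<String, Long> counts) {
        StringBuilder csv = new StringBuilder("risk_category,deny_count\n");
        counts.forEach((category, n) ->
            csv.append(category).append(',').append(n).append('\n'));
        return csv.toString();
    }
}
```

Filtering to the reporting period and escaping category names that contain commas are left out to keep the example short.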

What this doesn’t cover

Article 9 obligates the high-risk AI system as a whole, not just the runtime layer. Training-data risk management (Article 10), the technical documentation (Annex IV), and the conformity assessment (by a notified body where one is required) are all governance-level work. The runtime evidence Regulus emits is what those documents reference and the regulator audits.

For the GPAI Code of Practice deadline (2 August 2026), Article 9 is the principal runtime obligation. See the EU AI Act profile page for the 12-section breakdown of what the regulation requires, and the audit plugin page for the envelope shape in detail.