EU AI Act Annex IV Technical Documentation: Complete Requirements Guide
Everything you need to know about EU AI Act Annex IV technical documentation. All 8 mandatory items explained with practical examples, timelines, and common compliance gaps.
If your AI system is classified as high-risk under the EU AI Act, Annex IV technical documentation is not optional — it is the legal foundation of your entire compliance position. Without it, you cannot complete a conformity assessment, affix CE marking, or register in the EU AI Act database.
This guide explains all 8 mandatory Annex IV documentation items in detail, with practical examples, common gaps, and the key questions regulators will ask.
What is Annex IV Technical Documentation?
Annex IV is a schedule to the EU AI Act that lists the mandatory contents of the technical file every high-risk AI system provider must maintain. Think of it as the compliance dossier that proves — to any market surveillance authority that asks — that your AI system was designed, built, tested, and deployed in full conformity with the Act.
This documentation must be:
- Comprehensive: All 8 items must be covered — there is no optional subset
- Up to date: Updated whenever the system changes in a way that affects compliance
- Accessible: Provided to national competent authorities within the timeframe they specify when requested
- Retained for 10 years: After the system is placed on the market or put into service
Crucially, Annex IV is not a form to be filled in — it is a set of requirements that your documentation must satisfy. The format, length, and structure are for you to decide, but the substance is mandated.
Item 1: General Description of the AI System
Legal basis: Article 11 + Annex IV(1)
This item provides the foundational overview of what your system is and does. It must cover:
- Intended purpose: Precisely what the system is designed to do, in what context, for what users, with what inputs and outputs
- Version and date: The specific version the documentation relates to, with release date
- Hardware and software requirements: The infrastructure on which the system runs; minimum and recommended specifications; cloud or on-premises; third-party dependencies
- Forms of market placement: Whether distributed as a standalone product, embedded in a larger system, provided as a service (SaaS/API), or deployed on the provider’s own infrastructure
- Key design trade-offs: Where choices were made between competing objectives (e.g. accuracy vs. speed, precision vs. recall, explainability vs. performance) — and how those choices were made
Practical example:
“Version 3.1, released 12 January 2026. An API-based resume screening service deployed to EU HR software integrators. Intended purpose: rank job applicants by predicted role-fitness for positions requiring 0–5 years experience in technology roles. Runs on AWS eu-west-1 (GPU-accelerated); CPU fallback available with documented accuracy reduction. Key trade-off: precision optimised over recall — the system minimises false shortlists at the cost of potentially excluding edge-case qualified candidates. Human review is therefore mandatory for all borderline-score candidates (50th–70th percentile).”
Common gaps:
- Vague intended purpose language that doesn’t specify the decision being made
- No documentation of hardware fallback behaviour
- No record of design trade-offs
Item 2: Design Specifications and Development Process
Legal basis: Article 11 + Annex IV(2)
This item documents how the system was built, the technical architecture, and the methodological choices made:
- General logic and key assumptions: The core model architecture; what the system learns from; key assumptions embedded in its design
- Main classification or design choices: Why a particular model type was selected; what alternatives were considered and rejected
- Trade-offs between accuracy, robustness, explainability, and fairness: Where the design consciously accepted limitations in one dimension to achieve gains in another
- Development techniques: Training methodology, validation approach, hyperparameter selection, ablation studies
Practical example:
“Fine-tuned BERT-base classifier with custom classification head. Binary output (shortlist / do not shortlist) with calibrated confidence score. Architecture chosen over alternatives for: (1) superior performance on EU-language CVs; (2) open-source weights enabling full auditability of pre-training data characteristics. SHAP integration added at 8ms inference latency cost — accepted as necessary to satisfy Article 14 explainability requirements.”
Common gaps:
- No explanation of why the chosen architecture was selected
- Missing documentation of what training techniques were applied
- No record of how hyperparameters were chosen
Item 3: Training, Validation, and Testing Data
Legal basis: Article 10 + Annex IV(3)
This is often the most complex item to document. It must cover:
- Training, validation, and test sets: What data was used for each; the split rationale
- Dataset provenance: Where data came from; what permissions or licences apply; geographic and temporal scope
- Data characteristics: Volume, format, structure; how representative the data is of the deployment population
- Collection methodology: How data was collected, selected, and preprocessed
- Annotation methodology: Who annotated data; inter-annotator agreement; dispute resolution procedure
- Bias detection: What demographic groups were analysed; what proxies were examined; what disparities were found
- Bias mitigation: What measures were applied; their documented effectiveness; residual disparities
Practical example:
“Training set: 2.1M anonymised CVs from EU-based recruitment platforms (2018–2025). Validation: 200K held-out. Test set: 50K curated adversarial examples including synthetically generated CVs representing underrepresented groups. Data provenance: licensed from [Platform A] and [Platform B] under GDPR data processing agreements. Annotation: role-fitness labels assigned by 12 certified HR professionals; inter-annotator agreement κ=0.81. Bias audit: gender, age (5 bands), and nationality (EU/non-EU) examined via correlation analysis. Pre-mitigation: 11% shortlist rate gap between male and female applicants. Post-mitigation: 2% gap (re-weighted training + feature removal of gender-correlated features). Documented in Bias Audit Report v1.3.”
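The two headline statistics in the example above — inter-annotator agreement (Cohen's κ) and the shortlist-rate gap — are simple to reproduce and worth generating mechanically rather than by hand. A minimal sketch in Python; the function names are illustrative and not taken from any specific compliance tooling:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Inter-annotator agreement between two annotators (Cohen's kappa)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both annotators labelled at random
    # according to their own label frequencies
    expected = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

def shortlist_rate_gap(decisions):
    """Largest gap in shortlist rate across groups.
    `decisions` is a list of (group, shortlisted: bool) tuples."""
    totals, hits = Counter(), Counter()
    for group, shortlisted in decisions:
        totals[group] += 1
        hits[group] += shortlisted
    rates = [hits[g] / totals[g] for g in totals]
    return max(rates) - min(rates)
```

Cohen's κ compares observed agreement against the agreement expected by chance; the gap function is the demographic-parity difference the bias audit reports before and after mitigation.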
Common gaps:
- No documentation of dataset provenance or licences
- Missing inter-annotator agreement metrics
- Bias audit conducted but not linked to mitigation measures
- GDPR lawful basis for training data not documented
Item 4: Instructions for Use
Legal basis: Article 13 + Annex IV(4)
This item is the technical specification that deployers — the organisations using your system — must receive. It must cover:
- Provider identity and contact: Legal name, address, regulatory contact email
- System capabilities and performance: Under what conditions the documented performance was measured; confidence intervals; known performance ceilings
- Known limitations and failure modes: What inputs degrade performance; what use cases are out-of-scope; what failure modes have been identified
- Input data requirements: Format, quality, and completeness requirements for the system to function as documented
- Deployer oversight responsibilities: What oversight measures the deployer must implement; what information they must provide to affected persons; their incident reporting obligations
- Expected lifetime and maintenance: How long the system is supported; when retraining or updates are planned; how the deployer will be notified of changes affecting compliance
Practical example:
“Instructions for Use v3.1, issued 12 January 2026. Known limitation: accuracy reduces by approximately 18% for CVs in languages other than English, German, and French. Deployers must apply mandatory manual review for all non-English-language CVs regardless of AI score. Human oversight requirement: deployers must implement a confirmation UI where reviewers actively approve or override AI recommendations — passive non-action does not constitute valid oversight. Deployers must retain override logs for 7 years.”
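The review-routing rules quoted in the Item 1 and Item 4 examples (unsupported-language CVs and borderline scores both require human review) translate directly into deployer-side logic. A hedged sketch, with the language set and percentile band taken from those examples rather than from any mandated specification:

```python
SUPPORTED_LANGUAGES = {"en", "de", "fr"}   # languages with documented accuracy
BORDERLINE_PERCENTILES = (50, 70)          # band requiring review (Item 1 example)

def requires_human_review(cv_language: str, score_percentile: float) -> bool:
    """Route a screened CV to mandatory manual review.

    Combines the two rules from the examples above: CVs in unsupported
    languages are always reviewed, as are borderline-score candidates.
    """
    if cv_language not in SUPPORTED_LANGUAGES:
        return True
    low, high = BORDERLINE_PERCENTILES
    return low <= score_percentile <= high
```

Making the routing rule executable also makes it auditable: the thresholds in the instructions for use and the thresholds in production code can be diffed directly.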
Common gaps:
- Generic limitation descriptions not tied to specific performance data
- No specification of what the deployer must technically implement for oversight
- No version control — instructions not updated when system changes
Item 5: Risk Management System Documentation
Legal basis: Article 9 + Annex IV(5)
Documentation of your risk management system (covered fully in Article 9) must include:
- The risk register: all identified risks, with estimates of probability, severity, and the number of persons affected
- Mitigation measures for each risk, with documentation of their tested effectiveness
- Residual risks after mitigation — and how they are communicated to deployers
- Testing procedures against identified risk scenarios
See our full guide to the EU AI Act Risk Management System for a detailed breakdown of this item.
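As an illustration of the structure a risk register entry might take — the field names here are our own, not prescribed by the Act — a sketch that also surfaces residual risks not yet communicated to deployers:

```python
from dataclasses import dataclass

@dataclass
class RiskEntry:
    risk_id: str
    description: str
    probability: float            # estimated likelihood before mitigation (0-1)
    severity: int                 # e.g. 1 (negligible) to 5 (critical)
    mitigation: str
    residual_probability: float   # likelihood after mitigation
    communicated_to_deployers: bool

def residual_risks_to_disclose(register, threshold=0.05):
    """Residual risks above threshold not yet in the instructions for use."""
    return [r for r in register
            if r.residual_probability > threshold
            and not r.communicated_to_deployers]
```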
Item 6: Human Oversight Measures
Legal basis: Article 14 + Annex IV(6)
Document the technical measures that enable effective human oversight:
- What information is provided to oversight persons (confidence scores, explanations, uncertainty indicators)
- How the monitoring dashboard works and what it displays
- The override and disregard mechanism — technically, not just in policy
- The stop function: who can activate it, how, within what timeframe
- What measures address automation bias specifically
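One way to make the override mechanism technical rather than policy-only is to reject passive non-action at the logging layer, so that no oversight record can exist without an explicit decision. A sketch; the schema and field names are illustrative assumptions, not a prescribed format:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

VALID_ACTIONS = {"approve", "override"}  # passive non-action is not an outcome

@dataclass
class OversightRecord:
    case_id: str
    ai_recommendation: str   # e.g. "shortlist"
    reviewer_id: str
    action: str              # explicit "approve" or "override" only
    rationale: str
    timestamp: str

def record_decision(case_id, ai_recommendation, reviewer_id, action, rationale):
    """Append-only oversight log entry; rejects anything but an active decision."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"oversight requires an active decision, got {action!r}")
    rec = OversightRecord(case_id, ai_recommendation, reviewer_id, action,
                          rationale, datetime.now(timezone.utc).isoformat())
    return json.dumps(asdict(rec))  # in practice: append to a tamper-evident store
```

A design like this also addresses automation bias at the interface level: the reviewer cannot let a recommendation pass by simply doing nothing.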
Item 7: Accuracy, Robustness, and Cybersecurity Measures
Legal basis: Article 15 + Annex IV(7)
Document:
- Accuracy metrics, thresholds, and test results — including disaggregated results by relevant demographic group
- Robustness test scenarios and results (erroneous inputs, adversarial inputs, out-of-distribution inputs)
- Cybersecurity threat model and controls (data poisoning, model inversion, model theft, adversarial examples)
- Third-party penetration test results if available
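Disaggregated accuracy reporting can be generated mechanically from labelled evaluation records. A sketch, assuming records of the form (group, predicted, actual); the helper names are ours:

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Per-group accuracy from (group, predicted, actual) evaluation records."""
    correct, total = defaultdict(int), defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        correct[group] += (predicted == actual)
    return {g: correct[g] / total[g] for g in total}

def worst_group_shortfall(records, headline_accuracy):
    """How far the worst-performing group falls below the headline figure."""
    return headline_accuracy - min(accuracy_by_group(records).values())
```

Reporting only the headline figure is a common audit finding; the shortfall of the worst-performing group is often the number a regulator will ask for first.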
Item 8: Quality Management System and Post-Market Monitoring
Legal basis: Articles 17 and 72 + Annex IV(8)
Document:
- Quality management system: Processes, roles, and responsibilities for maintaining compliance throughout the system lifecycle
- Post-market monitoring plan: What performance data is collected post-deployment, at what frequency, with what review triggers
- Automatic logging: What is logged, in what format, with what retention period
- Serious incident reporting: The internal procedure from detection to regulatory notification within the Article 73 deadlines (no later than 15 days as standard, with shorter windows for the most serious incident categories)
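Because the statutory notification window differs by incident class, a deadline helper should take the window as a parameter rather than hard-code one number. A sketch (function names illustrative), deriving the deadline from the detection timestamp recorded in the automatic logs:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def notification_deadline(detected_at: datetime, window_hours: int) -> datetime:
    """Regulatory notification deadline, counted from incident detection.

    `window_hours` depends on the incident class; take the value from
    your incident-response procedure rather than hard-coding one number.
    """
    return detected_at + timedelta(hours=window_hours)

def hours_remaining(detected_at: datetime, window_hours: int,
                    now: Optional[datetime] = None) -> float:
    """Time left before the deadline, in hours (negative if already missed)."""
    now = now or datetime.now(timezone.utc)
    return (notification_deadline(detected_at, window_hours) - now).total_seconds() / 3600
```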
Version Control: A Requirement, Not Best Practice
Every change to your AI system that could affect its compliance status must trigger a documentation update. This includes:
- Model retraining on new or updated data
- Changes to the model architecture or feature set
- Changes to the intended purpose or deployment context
- Changes to the human oversight implementation
Maintain a change log linked to your Annex IV documentation. Each entry should record: what changed, when, who authorised it, and whether it constitutes a “substantial modification” requiring a new conformity assessment.
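The change-log entries described above map naturally onto a small structured record. A sketch with field names of our choosing, including the escalation check for substantial modifications:

```python
from dataclasses import dataclass, field

@dataclass
class ChangeLogEntry:
    date: str
    description: str                 # what changed
    authorised_by: str               # who approved the change
    substantial_modification: bool   # does it trigger a new conformity assessment?
    annex_iv_items_updated: list = field(default_factory=list)

def needs_new_conformity_assessment(log):
    """Entries the quality management process must escalate."""
    return [e for e in log if e.substantial_modification]
```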
How Long Do You Have?
The deadline for full compliance — including completed Annex IV documentation, conformity assessment, and EU database registration — is 2 August 2026.
Given the depth of documentation required, organisations typically need 3–6 months to produce a complete and legally defensible technical file for the first time. If you haven’t started, start now.
Where to Begin
Start with a gap analysis: map what documentation you already have against the 8 Annex IV items. Items 3 (training data) and 5 (risk management) are typically the most underdeveloped in early-stage compliance efforts.
Our free EU AI Act Status Quo Assessment covers the key readiness questions for all 8 items and delivers a personalised gap report to your inbox in minutes.
For a complete, 15-page Annex IV Roadmap with practical examples for every documentation item — including a 90-day action plan for your organisation — see our Technical Documentation Roadmap.
Free Status Quo Assessment
12 questions. Instant Annex III classification + readiness score. Free PDF delivered to your inbox.
Take free assessment →
Annex IV Roadmap — €149
15-page personalised report. All 8 Annex IV items with practical examples. 90-day action plan. Instant PDF.
Get your roadmap →