The Audit Field Was Always Empty

KellerAI White Paper · May 2026 · Regulation & Compliance

On 2 August 2026 the EU AI Act stops being a paper and starts assigning penalties — to the same audit-trail gaps your engineering already has.

The EU AI Act does not invent new requirements for AI governance. It assigns legal penalties to engineering gaps the KellerAI corpus has been naming since 2025: the permanently empty log field, the model swap with no change record, the “observable” system whose telemetry means nothing. From 2 August 2026 the Act's transparency duties and its enforcement machinery go live, and a field that passes schema validation while staying substantively empty becomes a field that fails regulatory review. Teams that built their systems observable by design are already closest to compliant; the rest have a documentation gap with a date on it.

Section 01

A deadline, not a new idea

The European Union's Artificial Intelligence Act, Regulation (EU) 2024/1689, reaches a hard date on 2 August 2026. That date is not “full enforcement.” Under the Digital Omnibus simplification package, on which the co-legislators reached a provisional agreement on 7 May 2026, the heaviest obligations for high-risk systems listed in Annex III are pushed out to 2 December 2027, and the Article 50(2) machine-readable watermarking duty moves to 2 December 2026. But 2 August 2026 remains the date the Article 50(1) transparency duties (tell a person they are talking to an AI, label synthetic media) take effect, the date the penalty regime attaches, and the date national market-surveillance authorities are due to be operational.

What the Act asks for, in the parts that matter to an engineer, is documentation: Annex IV technical files, Article 72 post-market monitoring, Article 50 transparency records. None of these is a novel governance regime. Each describes an artifact a well-instrumented system already emits. The trouble is that most systems do not emit them. One enterprise readiness survey put the share of organizations unprepared for the Act at roughly 78 percent; independent estimates of organizations with mature AI governance hover around 12 percent, and barely a quarter report having started concrete compliance activities.

The gap between those numbers and the deadline is the subject of this paper. It is, almost entirely, a documentation gap: one survey found 78 percent of organizations had no AI system inventory, 74 percent had assigned no accountable owner, and 61 percent had no process for producing the Annex IV technical file. Those are not legal failings. They are the absence of engineering evidence.

Section 02

The empty field is the compliance failure

Consider one field. A production AI system logs each decision, and the schema includes an array called obligations_referenced: the place where the system is supposed to record which governing rules, policies, or duties it consulted when it acted. In the KellerAI corpus, which names this pattern, that array is observed to be permanently empty. Every log line carries the field, every field is [], and the structure passes every schema check because an empty array is a valid array.

Article 72 requires providers of high-risk systems to run a documented post-market monitoring system that actively and systematically collects and analyzes performance data across the system's entire lifetime. The obligations_referenced array is exactly the kind of record that monitoring program is meant to produce. Structurally, it is present. Substantively, it is empty. The logging framework runs, the dashboard is green, the field exists, and there is nothing in it for an auditor to read. NIST's logging guidance sets the bar at “sufficient detail for after-the-fact investigation”; an array that is always empty falls short of that bar.

The thesis: a permanently empty audit field passes schema validation and fails regulatory review. These are the same problem.

This is what an auditor finds first, because it is the cheapest thing to check. The Act does not require you to install a logging framework. It requires that what the logging framework emits actually mean something.
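The difference between the two checks can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed audit method: the DecisionRecord shape and both function names are assumptions made for the example, and only the obligations_referenced field name comes from the text above. The first function is the structural check that an empty array passes; the second asks how often the field actually carries content.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DecisionRecord:
    # Hypothetical shape of one logged decision (assumed for illustration).
    decision_id: str
    obligations_referenced: list[str] = field(default_factory=list)


def is_structurally_valid(record: DecisionRecord) -> bool:
    # Schema-style check: the field is a list of strings.
    # An empty list passes, which is the gap the text describes.
    value = record.obligations_referenced
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def populated_share(records: list[DecisionRecord]) -> float:
    # Substantive check: share of records that name at least one obligation.
    if not records:
        return 0.0
    return sum(1 for r in records if r.obligations_referenced) / len(records)


logs = [DecisionRecord("d-001"), DecisionRecord("d-002"), DecisionRecord("d-003")]
print(all(is_structurally_valid(r) for r in logs))  # True: every record is schema-valid
print(populated_share(logs))                        # 0.0: no record says anything
```

A populated share of zero over any meaningful window is the cheap signal an auditor would look for first. What counts as an acceptable share is a policy decision the sketch deliberately leaves open.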

Section 03

The obligations map to artifacts you already understand

The Act's mandatory artifacts line up, almost one for one, with engineering practices the corpus has argued are simply correct. The mapping below is the short form; the companion in-depth paper develops each row with the exact clause text and the evidence artifact it demands.

| Clause | Obligation | Evidence artifact |
| --- | --- | --- |
| Annex IV §2 (Art. 11) | Description of the development process | A re-runnable evidence chain: the eval trail you can reproduce, not the vendor’s benchmark you adopted. |
| Annex IV §3 | Monitoring, accuracy & human oversight | A behavioral fingerprint plus oversight logs that record what a human actually reviewed, not a checkbox. |
| Annex IV §6 | Pre-determined changes & lifecycle | A model-swap change record: a vendor upgrade is a change to the system, and the change needs evidence. |
| Article 72 | Post-market monitoring across lifetime | Continuous, semantically meaningful logs: the non-empty version of obligations_referenced. |
| Article 25 | Substantial modification | Delta evidence: the before/after eval trail that lets you characterize whether a swap is substantial. |
| Article 50(1) | Transparency to the person interacting | An interaction-flag record (proof the AI disclosed itself) that is emitted and retained, not assumed. |

Read the right-hand column on its own and it describes good engineering: reproducible evaluation, meaningful telemetry, a change record for every consequential change. Annex IV is not a compliance template bolted onto a working system. It is a description of what good engineering evidence looks like — written by lawyers, with penalties attached. The same equivalence holds across the older model-risk regimes the corpus already maps: SR 11-7's “ongoing monitoring” and Article 72's “throughout their lifetime” are the same requirement in two vocabularies.
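One way to picture the model-swap and delta-evidence rows is as a record that cannot be constructed without before-and-after evaluation results. The Python sketch below is illustrative only: ModelSwapRecord, its fields, the metric names, the model ids, and the tolerance value are all hypothetical. Whether a change is a substantial modification under Article 25 is a legal judgment; the code only computes the deltas that judgment would need.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ModelSwapRecord:
    # Hypothetical change record for one vendor model upgrade.
    system_id: str
    old_model_id: str
    new_model_id: str
    changed_on: date
    eval_before: dict[str, float]
    eval_after: dict[str, float]

    def __post_init__(self) -> None:
        # Refuse to record a swap that carries no evidence.
        if not self.eval_before or not self.eval_after:
            raise ValueError("a model swap needs before and after eval results")
        if self.eval_before.keys() != self.eval_after.keys():
            raise ValueError("before and after must cover the same metrics")

    def deltas(self) -> dict[str, float]:
        # Per-metric change: after minus before.
        return {m: self.eval_after[m] - self.eval_before[m] for m in self.eval_before}

    def metrics_beyond(self, tolerance: float) -> list[str]:
        # Metrics whose absolute change exceeds a tolerance chosen by the team.
        return sorted(m for m, d in self.deltas().items() if abs(d) > tolerance)


record = ModelSwapRecord(
    system_id="support-bot",           # illustrative values throughout
    old_model_id="vendor-model-v1",
    new_model_id="vendor-model-v2",
    changed_on=date(2026, 8, 2),
    eval_before={"accuracy": 0.91, "refusal_rate": 0.04},
    eval_after={"accuracy": 0.88, "refusal_rate": 0.05},
)
print(record.metrics_beyond(tolerance=0.02))  # ['accuracy']
```

Making the evidence a constructor requirement means a swap without an eval trail fails at write time instead of surfacing later as a missing document.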

Section 04

The same requirement is appearing everywhere

The EU is the furthest along, not the only one moving. In the United States the picture is less settled and worth stating carefully. Colorado's landmark Senate Bill 24-205, the first comprehensive U.S. state AI law, had its effective date stayed on 27 April 2026, and the legislature passed a narrower successor, Senate Bill 189, on 7–9 May 2026. As of this writing that bill awaits the governor's signature, with a January 2027 effective date and a scope trimmed largely to notice and transparency duties. California's SB 1120, in force since January 2025, requires a licensed physician, rather than an algorithm, to make the final determination on health-insurance coverage decisions.

Treat the U.S. examples as direction of travel, not as a symmetrical mirror of the EU regime — the obligations differ in scope and several are unsettled. The shared signal is what matters: across jurisdictions, the law is converging on the same demand. Show that the system disclosed itself, show what a human reviewed, show what changed and what you tested when it did. That is an evidence demand, and it is the same evidence regardless of which statute names it.

Section 05

Sixteen months is build time, not a reprieve

The easy misreading of the Digital Omnibus delay is relief: the hard Annex III obligations move to December 2027, so the pressure is off. That reading is wrong. The work the Act requires — an inventory of every AI system, an accountable owner for each, a re-runnable evaluation trail, semantically meaningful telemetry, a change record for every model swap — cannot be assembled in the weeks before a deadline. It is engineering, and engineering takes the time it takes. Sixteen months is the build window, and the firms that read it as a reprieve will arrive in December 2027 exactly as unprepared as the 78 percent are today.

Three questions tell you where you stand, today, without a lawyer in the room. Do you have a current inventory of every AI system you operate, each with a named owner? When your system logs a decision, does the record carry semantically meaningful telemetry — or an array that is structurally present and substantively empty? And when a vendor ships a model upgrade, do you produce a change record with before-and-after evidence, or do you swap the model id and move on? If the honest answer to any of these is no, the gap the Act will penalize is already in your codebase.
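The three questions above can be run as a rough self-check against an inventory. The Python sketch below assumes a hypothetical SystemEntry shape with made-up field names; the share of populated logs and the counts of swaps and change records would have to come from your own telemetry and change tooling. It reports gaps and does not assess legal compliance.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemEntry:
    # Hypothetical inventory row; all field names are assumptions.
    name: str
    owner: Optional[str]         # named accountable owner, if any
    populated_log_share: float   # share of decision logs with non-empty content
    model_swaps: int             # vendor model upgrades applied
    change_records: int          # swaps documented with before/after evidence


def readiness_gaps(inventory: list[SystemEntry]) -> list[str]:
    # Question 1: is there an inventory, and does each system have an owner?
    if not inventory:
        return ["no AI system inventory exists"]
    gaps: list[str] = []
    for entry in inventory:
        if not entry.owner:
            gaps.append(f"{entry.name}: no named owner")
        # Question 2: do the decision logs carry any content at all?
        if entry.populated_log_share == 0.0:
            gaps.append(f"{entry.name}: logs are structurally present but empty")
        # Question 3: does every model swap have a change record?
        missing = entry.model_swaps - entry.change_records
        if missing > 0:
            gaps.append(f"{entry.name}: {missing} model swap(s) without a change record")
    return gaps


inventory = [
    SystemEntry("support-bot", "j.doe", 0.0, 2, 0),
    SystemEntry("claims-triage", None, 0.8, 0, 0),
]
for gap in readiness_gaps(inventory):
    print(gap)
```

An empty result means only that these three cheap checks found nothing. It is a starting point for the conversation with counsel, not a substitute for it.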

The Act does not require a logging framework. It requires what the logging framework emits to mean something.

The companion paper, Observable by Design, or Liable by Default, maps each obligation to its specific evidence artifact, walks the Annex IV technical file section by section, and develops the model-upgrade test where Article 25 and Article 72 converge.
