The Fallback Is What Licenses the Autonomy
Start with the aviation doctrine, because it states the principle in its purest form. A twin-engine airliner is not permitted to fly a transoceanic route because a regulator judged it broadly safe. It is permitted because, at every point on the route, an adequate alternate airport stays within a defined diversion budget: the maximum flying time to a runway with a single engine inoperative, in still air. The tier names, ETOPS-120 through ETOPS-370, are exactly those budgets in minutes. The autonomy to fly far from land is sold by the minute, and the price is the guaranteed reachability of the fallback. Take the diversion airport away and the autonomy evaporates with it: the same aircraft, with the same engines and the same crew, is barred from the route the instant no runway is reachable within budget. The fallback does not merely improve the operation. It is the thing that authorizes it.
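The rule above reduces to a check over every point on the route. The sketch below is illustrative only: `RoutePoint`, `route_is_authorized`, and the precomputed `minutes_to_nearest_alternate` field are hypothetical names, and the sample figures are invented rather than taken from any real route or from actual dispatch software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RoutePoint:
    name: str
    # Assumed to be precomputed elsewhere: flying time in minutes to the
    # nearest adequate alternate, one engine inoperative, still air.
    minutes_to_nearest_alternate: float


def route_is_authorized(route: List[RoutePoint], diversion_budget_min: float) -> bool:
    """Authorized only if every point stays within the diversion budget."""
    return all(p.minutes_to_nearest_alternate <= diversion_budget_min for p in route)


def first_violation(route: List[RoutePoint], diversion_budget_min: float) -> Optional[RoutePoint]:
    """Return the first point that breaks the budget, or None if none does."""
    for point in route:
        if point.minutes_to_nearest_alternate > diversion_budget_min:
            return point
    return None


# Invented example route: the mid-ocean point is 170 minutes from a runway.
route = [
    RoutePoint("coast-out", 45.0),
    RoutePoint("mid-ocean", 170.0),
    RoutePoint("coast-in", 60.0),
]
```

With a 180-minute budget this route passes; with a 120-minute budget the same aircraft on the same route fails at the mid-ocean point, which is the sense in which the fallback, not the aircraft, carries the authorization.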
The substitution into medicine is exact. The diversion airport is a reachable clinician — or, more generally, a reachable human escalation path — for any action whose consequence crosses a harm threshold. An autonomous clinical system is licensed to act on its own precisely as far as a qualified human stays reachable when its recommendation reaches the harm tier where a wrong call cannot be cheaply undone. The envelope of unsupervised action is bounded by the reachability of the fallback, exactly as the over-water envelope is bounded by the reachability of the runway.
This reframes the human-in-the-loop debate. The question is not whether a clinician is involved at all — a binary that invites the lazy answer of removing the human as the model improves. The question is the same one aviation asks of the runway: is the fallback reachable when it is needed, and at what consequence tier does it become mandatory? A read-only surfacing of a suggestion the clinician will weigh before acting needs no standing escalation, because the human is already the actor. An irreversible, high-consequence action — a dosing instruction that will be administered, a stop-treatment call — needs the clinician reachable as a hard precondition, because the action crosses the threshold past which the model's error cannot be recalled. Between those poles sits a lattice of consequence, and the reachability requirement scales up it.
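One way to picture the lattice is as a gate that every proposed action passes through. The following is a minimal sketch under stated assumptions: only the label CT4 appears in this brief, so the lower tier names, the `gate` function, and its return values are hypothetical. Mapping an unknown tier to the top tier follows the brief's rule that unknown reversibility is treated as irreversible.

```python
from enum import IntEnum
from typing import Optional


class ConsequenceTier(IntEnum):
    CT1 = 1  # read-only suggestion; the clinician is already the actor
    CT2 = 2  # assumed intermediate tier (not defined in the brief)
    CT3 = 3  # assumed intermediate tier (not defined in the brief)
    CT4 = 4  # irreversible, high-consequence action


# Assumption: human approval becomes mandatory at CT4.
APPROVAL_THRESHOLD = ConsequenceTier.CT4


def gate(tier: Optional[ConsequenceTier], clinician_reachable: bool) -> str:
    """Decide whether an action may proceed without a human.

    Returns "proceed", "escalate" (send to a reachable clinician for
    approval), or "hold" (no fallback is reachable, so do not act).
    """
    # Unknown reversibility is treated as the worst case.
    effective = ConsequenceTier.CT4 if tier is None else tier
    if effective < APPROVAL_THRESHOLD:
        return "proceed"
    return "escalate" if clinician_reachable else "hold"
```

The design choice worth noticing is that the gate never returns "proceed" at the top tier, however confident the model is: accuracy does not appear as an input at all.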
A twin flies far from land only because a runway stays reachable the whole way. A clinical AI acts on its own only as far as a clinician stays reachable when the action crosses the harm tier. The fallback is not the limit on autonomy — it is the license for it.
The Law Already Drew the Boundary
Congress wrote the reachable-clinician principle into law before the current wave of clinical AI existed. The 21st Century Cures Act amended the Federal Food, Drug, and Cosmetic Act to exclude certain clinical decision support software from the definition of a medical device — and therefore from device regulation — provided it meets four criteria. The first three concern what the software consumes and produces. The fourth is the one that matters, and it is the reachability principle in statutory form: non-device CDS must give the healthcare professional sufficient information about the basis of its recommendation that the professional can independently review it, and must not be intended for the professional to rely primarily on the recommendation in making a clinical decision.
Read that as an autonomy boundary. Software that explains itself well enough for a clinician to check the reasoning — and that is positioned as an input the clinician weighs rather than a verdict the clinician defers to — stays assistive and outside device regulation. The instant the software makes the call, or makes its basis unreviewable so the clinician must rely primarily on it, it crosses into being a regulated device. The reviewable basis is the reachability of the human override: a recommendation whose rationale a clinician can independently inspect keeps the clinician in a position to divert — to reject, to modify, to escalate. A recommendation the clinician must take on faith has foreclosed the diversion. The Cures Act drew the device boundary at exactly the line past which the human can no longer reach a different answer.
The shape of the deployed field reflects this boundary. Of the many hundreds of AI-enabled devices the FDA has authorized, the overwhelming majority were cleared as assistive software through the 510(k) pathway rather than as autonomous diagnostic devices — the market has clustered, so far, on the side of the line where the clinician stays the decider. An engineering team building clinical AI can read the four criteria as a design specification, not merely a classification test: keep the basis reviewable, keep the clinician the decider above the harm threshold, and the autonomy you ship is autonomy with a reachable runway built into its interface.
A recommendation a clinician can independently check leaves the clinician able to divert. A recommendation taken on faith has foreclosed the diversion. The Cures Act drew the device boundary at exactly the line past which the human can no longer reach a different answer.
One honest qualification belongs here, because the override has to be real and not nominal. A clinician who is technically able to review a recommendation but in practice reflexively dismisses or rubber-stamps it is a fallback in form but not in function. The literature is blunt: physicians override the large majority of drug-interaction alerts, and acceptance falls further with each additional alert in an encounter. A diversion airport that exists on the chart but is fogged in is not a diversion airport. Designing for genuine reachability means designing against alert fatigue and automation bias as much as for the presence of a review step.
Safety Is a Lifecycle, Not a Launch Gate
The second half of the license is continuous proof, and the device world codified it long before clinical AI arrived. ISO 14971, the international standard for medical-device risk management, does not stop at clearance. It mandates a production and post-production phase: once a device is on the market, the manufacturer must actively collect information from the field (adverse events, performance data, the actual experience of use) and feed it back into the risk-management file, re-evaluating whether the risks estimated before launch still hold and whether new ones have appeared. Risk management under ISO 14971 is a closed loop that runs for the life of the device, not a dossier assembled once and filed. IEC 62304, the companion software-lifecycle standard, supplies the change-control machinery the loop needs: every change to released software is handled under a controlled procedure with documentation, verification, and configuration management, so a fix or an update cannot silently alter the safety posture of the deployed system.
This is the diversion-airport doctrine applied to evidence rather than to escalation. An ETOPS approval is not a certificate earned once; it is contingent on a continuing reporting system that tracks the world fleet's reliability on a rolling basis and can contract the authority if the measured failure rate drifts upward. ISO 14971's post-production loop is the same instrument: a standing obligation to keep watching, with the authority to revise the safety case when the field data diverges from the launch assumptions. A clinical AI whose risk file was completed at clearance and never reopened is the medical equivalent of an airline that flew its ETOPS hours once, framed the certificate, and stopped tracking the fleet rate.
The reframe ISO 14971 forces onto clinical AI is that a model's safety is not a property established at validation and thereafter assumed; it is a measured quantity that decays. The mechanism of decay has a name in the clinical literature — dataset shift, the gradual divergence between the population a model was validated on and the population it now serves, as case mix, documentation practice, and care patterns move underneath it. A model validated at one health system can degrade silently at another, or at the same one a year later, with not a single weight changing. The launch gate proves the system was safe once. The lifecycle proves it still is.
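Dataset shift can be watched for with simple distribution comparisons. The sketch below uses the Population Stability Index (PSI), a technique chosen here for illustration and not one the brief names. The function name, the bin proportions, and the idea of comparing validation-time and current case mix are assumptions. PSI measures drift in inputs only, so it complements outcome monitoring and does not replace it.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], eps: float = 1e-6
) -> float:
    """PSI between two binned distributions given as proportions per bin.

    expected: proportions observed in the validation population.
    actual:   proportions observed in the population now being served.
    """
    if len(expected) != len(actual):
        raise ValueError("both distributions must use the same bins")
    total = 0.0
    for e, a in zip(expected, actual):
        # Floor the proportions so an empty bin does not produce log(0).
        e = max(e, eps)
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


# Invented example: share of patients per age band at validation and today.
validation_mix = [0.25, 0.25, 0.25, 0.25]
current_mix = [0.10, 0.20, 0.30, 0.40]
drift = population_stability_index(validation_mix, current_mix)
```

Identical distributions give a PSI of zero and the value grows as the populations diverge. Any alerting threshold is a local policy choice and would itself belong in the risk-management file.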
ETOPS authority is contingent on a continuing fleet-reliability report that can contract it on drift. ISO 14971 demands the same post-market loop for a device. Safety established at clearance does not stay established — only continuous measurement keeps the proof current.
The Trace Is the Surveillance; the PCCP Is the Envelope
A lifecycle obligation is empty without a mechanism to feed it. The device world built one decades ago: mandatory adverse-event reporting under 21 CFR Part 803, flowing into the public MAUDE database — the standing record that turns scattered field incidents into a queryable account of how deployed devices actually fail. The structural lesson for clinical AI is that post-market surveillance requires a substrate, and that substrate is an engineering artifact, not a compliance afterthought. The assurance framework names exactly such a substrate as a cross-cutting obligation: OBL-TRC-001 — the append-only, tamper-evident decision trace. Every consequential action an agent commits is written to an append-only sink the actor cannot rewrite, gated out-of-process so the system cannot suppress its own record, with the evidence bundled and signed.
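Tamper evidence is commonly achieved by chaining each record to the hash of the one before it. The sketch below illustrates that idea only. The `DecisionTrace` class and its record fields are hypothetical, and an in-memory list is a stand-in: it does not provide the out-of-process gating or the signing that OBL-TRC-001 calls for, both of which require infrastructure outside the acting system.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first entry


def _digest(prev_hash: str, payload: Dict) -> str:
    """Hash the previous entry's hash together with a canonical payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class DecisionTrace:
    """Append-only log in which each entry commits to its predecessor."""

    def __init__(self) -> None:
        self._entries: List[Dict] = []

    def append(self, payload: Dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        entry_hash = _digest(prev, payload)
        # Store a copy so later mutation of the caller's dict has no effect.
        self._entries.append({"prev": prev, "payload": dict(payload), "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = GENESIS_HASH
        for entry in self._entries:
            if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True
```

A chain like this makes edits detectable, not impossible. That is why the text insists the sink sit outside the actor's reach: an actor that can rewrite the whole chain can also recompute every hash.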
The decision trace is, in the clinical setting, the post-market surveillance database for the AI itself. Where MAUDE records a device adverse event only after a human files a report — a passive system with documented reporting biases — the trace captures every action at the moment it is committed, by construction, including the ones no one would have thought to report. That makes the escape rate measurable: the rate at which a wrong, consequential output reached a decision-maker without a warning can be computed from the trace and tracked over time. The trace must be append-only and out of the system's own reach for the same reason MAUDE is an external public database rather than a private vendor log — a surveillance record the watched system can edit is not surveillance; it is a system grading its own field performance.
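Once every action is in the trace, the escape rate becomes a plain computation. The sketch below is a minimal illustration: the `TraceRecord` fields and the choice of denominator (all consequential outputs) are assumptions, and in practice the "later judged wrong" label would come from clinical review, which this code does not model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TraceRecord:
    consequential: bool       # did the output bear on a consequential decision?
    later_judged_wrong: bool  # hypothetical label supplied by later review
    warning_shown: bool       # was the decision-maker warned at the time?


def escape_rate(records: Iterable[TraceRecord]) -> Optional[float]:
    """Share of consequential outputs that were wrong and unwarned.

    Returns None when there are no consequential records, so that an
    empty window is never mistaken for a perfect one.
    """
    consequential = [r for r in records if r.consequential]
    if not consequential:
        return None
    escapes = sum(
        1 for r in consequential if r.later_judged_wrong and not r.warning_shown
    )
    return escapes / len(consequential)


# Invented window of five records: four consequential, one escape.
window = [
    TraceRecord(True, True, False),   # wrong and unwarned: an escape
    TraceRecord(True, True, True),    # wrong, but a warning was shown
    TraceRecord(True, False, False),  # correct
    TraceRecord(True, False, True),   # correct, warned anyway
    TraceRecord(False, True, False),  # not consequential, excluded
]
```

Computed over successive windows, the same function yields the trend that the lifecycle loop needs.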
The hardest problem in governing a learning system is change. A model validated once and frozen is governable by the classical apparatus, but the whole value of a learning system is that it improves — and every improvement invalidates the validation that preceded it. Re-clearing the device on every update is so slow it would either freeze the model or push the changes outside the regulatory perimeter. The FDA's answer, finalized in December 2024, is the Predetermined Change Control Plan: a manufacturer specifies, in advance and as part of the original authorization, the modifications it intends to make after clearance, the protocol by which they will be developed and verified, and an assessment of their impact. Changes inside the authorized envelope ship without a new submission; changes outside it still require one.
Read the structure carefully, because it is the same object as the ETOPS envelope. The authority to change the model is pre-bounded: the envelope of permitted change is drawn before any change is made. It is monitored: the protocol fixes the verification each change must pass and the performance it must maintain. And it contracts: a change that drifts outside the envelope, or a performance signal that breaches the protocol's bounds, forces the change back into full review. The runtime obligation that mirrors it, OBL-AGG-001 (error-correlation bounding), contracts the operating envelope automatically as the measured escape or adverse-event rate rises. Both refuse the two failure modes at the extremes (re-asserting authority on every change, which is unworkable, and granting unbounded authority once, which is unsafe), and both land on the same middle: a pre-specified envelope, continuously monitored, that contracts when the evidence turns.
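The pre-bounded, monitored, contracting structure can be sketched in two small functions: one that checks a proposed change against an envelope fixed in advance, and one that shrinks runtime authority as the measured escape rate rises. Everything here is hypothetical, including the class names, the change-type labels, the performance floors, and the rate bands. A real PCCP defines its own modification descriptions and verification protocol.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple


@dataclass(frozen=True)
class ChangeEnvelope:
    allowed_change_types: FrozenSet[str]  # fixed before any change is made
    min_sensitivity: float                # performance floor set by the protocol
    min_specificity: float


@dataclass(frozen=True)
class ProposedChange:
    change_type: str
    sensitivity: float  # measured under the pre-specified verification protocol
    specificity: float


def disposition(change: ProposedChange, envelope: ChangeEnvelope) -> str:
    """Ship inside the envelope; anything else goes back to full review."""
    if change.change_type not in envelope.allowed_change_types:
        return "full_review"
    if (change.sensitivity < envelope.min_sensitivity
            or change.specificity < envelope.min_specificity):
        return "full_review"
    return "ship_under_pccp"


# Assumed bands: (escape-rate ceiling, highest tier the system may act on alone).
DEFAULT_BANDS: Tuple[Tuple[float, int], ...] = ((0.001, 3), (0.01, 2))


def max_autonomous_tier(
    escape_rate: float, bands: Sequence[Tuple[float, int]] = DEFAULT_BANDS
) -> int:
    """Contract the operating envelope as the measured escape rate climbs."""
    for ceiling, tier in bands:
        if escape_rate <= ceiling:
            return tier
    return 1  # worst case: read-only suggestions only


envelope = ChangeEnvelope(frozenset({"retrain_same_architecture"}), 0.90, 0.85)
```

Neither function re-argues the envelope at decision time. The bounds are inputs fixed beforehand, which is the property the text attributes to both the PCCP and the ETOPS tier.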
The PCCP fixes the envelope of permitted change once, monitors every change against it, and contracts the authority the moment a change or a performance signal leaves the envelope. The right to change the model is pre-authorized and revocable — never re-litigated update by update, never granted without bounds.
The Posture Before Acting
Before deploying an autonomous clinical AI, the operator's posture should be set by the diversion-airport doctrine. It collapses to three commitments, each mapped to the assurance obligations.
Keep the clinician reachable for high-harm actions. Tier every action by consequence and make a qualified human a hard precondition at the irreversible, high-consequence top of the lattice (OBL-HUM-001, human approval at CT4). Treat unknown reversibility as irreversible, abstaining and escalating rather than acting on an unverifiable call (OBL-IRR-001, irreversibility/abstention default). Reachability has to be genuine, not nominal: a saturated clinician buried under low-yield alerts is a fogged-in runway, so design the clinician's attention budget, not just the existence of a review step.
Make the decision trace your post-market surveillance. Write every consequential action to an append-only, tamper-evident record the system cannot rewrite, gated out-of-process (OBL-TRC-001, append-only decision trace), and use it as the substrate the ISO 14971 lifecycle loop consumes, turning the field escape rate into a measured quantity instead of a filed anecdote. The device world built mandatory reporting and MAUDE for exactly this; the runtime analogue is active rather than passive, capturing what would never have been reported. Attribute every action to its model and vendor (OBL-VEN-001, vendor/model attribution) so the field failure rate has a responsible owner, exactly as the device and banking regimes refuse to let the deploying institution outsource the duty to govern.
Contract authority the moment harm signals rise. Pre-bound the envelope of permitted change the way a PCCP does (fixed in advance, monitored against its protocol, revocable on drift) and let the runtime envelope contract automatically as the measured escape rate climbs (OBL-AGG-001, error-correlation bounding). The right to keep operating autonomously is earned continuously and surrendered the moment the evidence turns, the way an ETOPS tier is contracted on a rising fleet rate.
Medicine is the sharpest test of this discipline because it is the domain where being confidently wrong is lethal and irreversible, where the administered drug cannot be unadministered. That is exactly why the autonomy must be licensed by its fallback, not excused by its accuracy. The wrong question is: how good does the model have to be to remove the clinician? The right question has three parts: is the clinician reachable when the action crosses the harm threshold, is the system still proving itself safe in the field, and does the authority contract the moment the evidence turns?
This brief is the short version. The in-depth version, "The Clinician Is the Diversion Airport," carries the full argument: the Cures Act CDS criteria, ISO 14971 and IEC 62304, 21 CFR Part 803 and MAUDE, the December 2024 PCCP read against the ETOPS envelope, the NZ7571 forecast-and-foreclosed-fallback case, the five LAAS obligations mapped to the clinic, the honest limits, and the citations. It is the capstone of a three-part stack on clinical-AI governance. The first, "Intended Use Is the Envelope," establishes that a model's indication is its operating boundary. The second, "Risk Is Measured in Harm, Not Accuracy," establishes the governing variable inside that envelope. This third article explains what licenses the autonomy those two bounded. It is the clinical instantiation of the always-a-runway sibling, "Priced in Failure-Rate Data," which contracts authority on drift in domains where being confidently wrong is expensive rather than lethal.