Always a Runway: Diversion-Airport Doctrine for Autonomous Systems

KellerAI

Section 01

The Load-Bearing Rule: Range Relative to a Safe Harbour

The popular reading of ETOPS (Extended-range Twin-engine Operational Performance Standards) is that it is a permission to fly farther over open water on two engines. That reading mistakes the consequence for the rule. The actual load-bearing rule is narrower and stranger: a twin may fly a route only if, at every point on it, a reachable, adequate alternate airport remains within range in the worst case, that is, with one engine inoperative, in still air, at the approved single-engine cruise speed. 1 9 The range a twin is allowed to fly is not an input to this calculation. It is an output of it. Range is a derived quantity that falls out of where the safe harbours are; it is never asserted absolutely.

The unit of the rule is time, not distance. ETOPS approvals are stated in minutes — ETOPS-120, ETOPS-180, ETOPS-330 — and the minutes refer to diversion time: the maximum one-engine-inoperative flight time, in still air at the approved OEI cruise speed, from any point on the route to an adequate alternate. 9 10 A 180-minute approval means the aircraft may never be more than 180 minutes of single-engine flying from a runway it could actually use. The route is then constructed inward from the runways: draw the diversion-time circles around every adequate alternate, and the legal corridor is the union of those circles. Fly outside it and you are, by definition, beyond the rule — not because the aircraft cannot physically reach the destination, but because at some point on that path no safe harbour was within the worst-case envelope.
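The corridor construction described above can be sketched in a few lines. This is a minimal illustration, not a dispatch tool: the function names, the spherical-Earth haversine distance, and the sample coordinates and speeds are all assumptions introduced here, and real planning accounts for far more than still-air geometry.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def great_circle_nm(a, b):
    """Haversine distance in nautical miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(h))


def diversion_minutes(point, alternate, oei_speed_kt):
    """Still-air flight time from a route point to an alternate at OEI cruise speed."""
    return great_circle_nm(point, alternate) * 60.0 / oei_speed_kt


def uncovered_points(route, alternates, oei_speed_kt, approval_minutes):
    """Route points that fall outside every alternate's diversion-time circle."""
    return [
        p for p in route
        if min((diversion_minutes(p, a, oei_speed_kt) for a in alternates),
               default=math.inf) > approval_minutes
    ]


# Hypothetical equatorial route with an adequate alternate at each end.
route = [(0.0, 0.0), (0.0, 10.0), (0.0, 25.0), (0.0, 40.0), (0.0, 50.0)]
alternates = [(0.0, 0.0), (0.0, 50.0)]
gaps_180 = uncovered_points(route, alternates, oei_speed_kt=400.0, approval_minutes=180.0)
gaps_240 = uncovered_points(route, alternates, oei_speed_kt=400.0, approval_minutes=240.0)
```

With these assumed numbers the midpoint of the route lies outside both 180-minute circles but inside the 240-minute ones, which is the sense in which legal range is an output of where the runways are rather than an input.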

This inverts the intuition that capability sets reach. ETOPS replaced the so-called 60-minute rule, codified in the route-limitation provisions of the U.S. Federal Aviation Regulations, which from the 1950s barred twin-engine airplanes from operating more than sixty minutes’ flying time from an adequate airport. 2 3 That rule kept early twins hugging coastlines and doglegging around oceans regardless of how capable the airframe was. FAA Advisory Circular AC 120-42, issued in 1985 and now maintained as AC 120-42B, opened operations beyond sixty minutes — but it did so by tightening, not loosening, the safe-harbour requirement: the farther you fly from a runway, the more you must prove about the adequacy and reachability of the runways you are relying on. 1 The first ETOPS-120 service, TWA’s Boeing 767 from Boston to Paris on 1 February 1985, did not happen because the 767 had suddenly become able to fly the Atlantic; it had always been able to. It happened because the operator could now demonstrate that an adequate alternate was always within 120 minutes of single-engine flight. 3

Range is defined relative to a guaranteed safe harbour, never asserted absolutely. The twin does not fly as far as it can; it flies as far as it can still turn back toward a runway it is certain it can use.

The inversion

It is worth dwelling on why the rule is computed in the worst case rather than the expected one, because that choice is the whole of the discipline. A twin with both engines running has range to spare; the constraint binds only in the failure case, when one engine is gone and the aircraft is slower, lower, and burning into reserves. ETOPS sizes the envelope to that case — OEI cruise speed, still air, the degraded arrival — precisely because the safe harbour is needed only when things have gone wrong, and a harbour sized to the good day is no harbour at all. 9 The all-engines envelope describes where the aircraft can go; the OEI envelope describes where it can still be caught. Safety lives entirely in the second number, and the rule refuses to let the first one stand in for it.

The reframe for autonomous systems is a single move. Every extended autonomous operation — a multi-step agent loop, a long-horizon plan, a chain of tool calls that mutates external state — is governed by the same question the ETOPS dispatcher asks before every flight: is a reachable, adequate fallback inside the worst-case envelope at this step? “How capable is the agent” is the wrong axis, exactly as “how far can the twin fly” is the wrong axis. The right axis is whether, at every point along the operation, the system still retains a reachable safe harbour — a rollback, a human handoff, a safe-default, a circuit-breaker — computed against the worst case rather than the expected one. Granted autonomy, like legal range, should be a derived quantity: it falls out of where the fallbacks are.

Section 02

Adequate vs. Reachable: Two Words That Carry the Doctrine

Two words do nearly all the work in the diversion rule, and they are not interchangeable. A safe harbour must be both adequate and reachable, and a defect in either voids the guarantee no matter how good the other looks on a chart.

Adequate is a substantive property of the alternate itself. AC 120-42B and the Part‑25 ETOPS airworthiness criteria define an adequate airport as one that, at the expected time of use, has the runway length and strength, the navigation and approach aids, the lighting, the rescue and firefighting capability, and the weather-and-NOTAM standing to actually receive the aircraft in the degraded condition in which it would arrive. 1 4 10 A dot on a map with a paved strip is not automatically adequate. If the only approach aid is unusable in the forecast ceiling, if the runway is too short for a single-engine landing weight, if the field has no firefighting cover for a widebody — then the airport exists but the alternate does not. Adequacy is judged against the state you will be in when you arrive, which is, by construction, a worse state than the one you departed in.

Reachable is a geometric property of the relationship between the aircraft and the alternate. The alternate must lie inside the single-engine-out time envelope — the diversion-time circle computed at OEI cruise speed in still air — not inside the comfortable all-engines envelope. 9 An airport 150 minutes away at normal cruise may be 200 minutes away on one engine, and if the approval is ETOPS-180 that alternate is not reachable in the case that matters. The whole discipline lives in this distinction between the happy-path envelope and the worst-case envelope. Planning to the all-engines envelope is the most common way an alternate that looks reachable turns out not to be.
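The arithmetic behind that example is short enough to check directly. The two cruise speeds below are hypothetical values chosen only so that the numbers match the 150-minute and 200-minute figures in the text; they are not taken from any aircraft's approved data.

```python
def flight_minutes(distance_nm, speed_kt):
    """Still-air flight time in minutes for a distance flown at a constant speed."""
    return distance_nm * 60.0 / speed_kt


ALL_ENGINES_KT = 480.0   # hypothetical all-engines cruise speed
OEI_KT = 360.0           # hypothetical one-engine-inoperative cruise speed
APPROVAL_MINUTES = 180.0

# An alternate that is 150 minutes away at all-engines cruise.
distance_nm = 150.0 / 60.0 * ALL_ENGINES_KT
happy_path_minutes = flight_minutes(distance_nm, ALL_ENGINES_KT)
worst_case_minutes = flight_minutes(distance_nm, OEI_KT)

looks_reachable = happy_path_minutes <= APPROVAL_MINUTES
is_reachable = worst_case_minutes <= APPROVAL_MINUTES
```

The same airport passes the check at all-engines speed and fails it at OEI speed; only the second result counts.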

Crucially, ETOPS does not require a single adequate, reachable alternate. It requires an adequate, reachable alternate at every point on the route — an alternate set whose union of diversion-time circles covers the entire path, with each member independently verified for adequacy at its expected time of use. 1 4 The set is specified against worst-case arrival states, not against the median day. If two of the three planned alternates can be foreclosed by the same regional weather system, the set has a correlated failure and is thinner than its count suggests.
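One way to make "thinner than its count suggests" checkable is to close every alternate exposed to a given shared hazard at once and see which route segments are left with nothing. The sketch below assumes reachability has already been computed per segment against the OEI envelope; the segment names, alternate names, and hazard labels are hypothetical.

```python
def foreclosure_gaps(coverage, exposure):
    """Find shared hazards that would leave some route segment with no alternate.

    coverage: segment -> set of alternates reachable from it in the OEI envelope.
    exposure: alternate -> set of shared hazards that could close it.
    Returns {hazard: [stranded segments]} for every hazard that opens a gap.
    """
    hazards = set().union(*exposure.values())
    gaps = {}
    for hazard in sorted(hazards):
        # Every alternate exposed to this hazard is treated as closed together.
        closed = {alt for alt, hs in exposure.items() if hazard in hs}
        stranded = [seg for seg, alts in coverage.items() if not (alts - closed)]
        if stranded:
            gaps[hazard] = stranded
    return gaps


# Hypothetical plan: three alternates, two of them under the same weather front.
coverage = {"seg1": {"A", "B"}, "seg2": {"B", "C"}, "seg3": {"C"}}
exposure = {"A": {"front_1"}, "B": {"front_1"}, "C": {"volcanic_ash"}}
gaps = foreclosure_gaps(coverage, exposure)
```

Here `seg1` counts two alternates on paper but loses both to one front, and `seg3` depends on a single alternate with no backup at all. A set that passes a simple count can still fail this test.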

The temporal qualifier in “adequacy at its expected time of use” is doing quiet but decisive work. An airport is not adequate in the abstract; it is adequate for the moment the diverting aircraft would arrive, against the weather, the NOTAMs, the available services, and the runway state forecast for that moment. 1 10 This is why an alternate that is plainly adequate at dispatch can cease to be adequate by the time it would actually be needed — and why a planning process that evaluates alternates only at their dispatch-time condition is checking the wrong instant. The whole of the NZ7571 lesson in Section 4 turns on exactly this: an alternate set evaluated against the conditions expected at dispatch, in an environment where those conditions were both volatile and likely to deteriorate, is a set verified at the wrong time. Adequacy is a property of the future arrival state, assessed under uncertainty, and that uncertainty is part of what the worst-case framing must absorb.

A fallback that exists in principle but cannot catch the failure state you will actually be in is not a fallback. An alternate must be adequate for the degraded arrival and reachable in the worst-case envelope — both, at every point, or neither counts.

The two-word test

The mapping to autonomous systems is direct and unforgiving. A “fallback” that is nominal but cannot catch the failure mode the agent will actually be in is an inadequate alternate. A rollback that cannot run because the state it would restore has already been committed; a human handoff with no human actually on call at 3 a.m.; a safe-default that itself depends on the subsystem that just failed; a circuit-breaker that trips only after the irreversible external call has gone out — these are airports on the map that cannot receive the aircraft. And a fallback that exists but lies outside the worst-case step budget — a recovery routine that needs more tokens, more latency, or more healthy dependencies than the agent will have at the moment of failure — is an alternate outside the OEI envelope. The discipline is to specify the fallback set against the degraded arrival state, verify each member for both adequacy and reachability, and check the set for correlated foreclosure — not to count fallbacks on the happy path.

Section 03

Earning the Envelope: How the Tiers Were Granted

The diversion-time tiers were not handed out as round numbers; each was earned against demonstrated reliability data. The ladder runs from ETOPS-120 (TWA’s 767, February 1985) up through ETOPS-180, granted to the Boeing 777 around its 1995 entry into service; the 180-minute tier alone unlocked roughly ninety-five percent of the Earth’s surface for twin-engine operation. 3 5 28 Above 180 the ladder continues: ETOPS-240 on the Airbus A330, the 330-minute approval for the GE-powered 777 family (FAA type-design approval in December 2011), and ETOPS-370 for the Airbus A350. 7 11 Each rung extended the diversion-time circles, and so each rung extended legal range, but only after the gate that justified it had been satisfied.

That gate is the world-fleet in-flight shutdown (IFSD) rate. Before a propulsion system may be approved for extended operations, and to retain that approval, the demonstrated IFSD rate across the world fleet must be shown to be low and stable: the rulemaking record sets indicative targets on the order of 0.02 shutdowns per 1,000 engine-hours for operations up to 180 minutes and 0.01 per 1,000 engine-hours beyond 180 minutes, with continued monitoring “beyond 250,000 engine-hours of fleet operating experience until a stable IFSD rate” is established. 8 The number that licenses the range is not a promise about the future; it is a measured frequency drawn from accumulated operating experience. Higher autonomy — more minutes, more range — is earned by demonstrated failure-rate data, not asserted by ambition or airframe pedigree.

Two features of the gate are worth naming because they transfer cleanly. First, it is a fleet statistic, not a per-airframe one: the rate is computed across all aircraft of the type in service worldwide, so a single operator cannot earn an envelope its airframe has not collectively demonstrated, and a single good year cannot stand in for a stable record. The evidence is pooled and continuous. Second, the gate is retentive, not merely admissive: an approved type that drifts above its IFSD target can have its operations curtailed, because the approval is a standing claim about a measured rate rather than a one-time certificate. 8 The envelope is licensed for as long as the evidence holds and no longer. An autonomy regime built on the same logic would pool failure data across all deployments of an agent, not just one team’s, and would narrow the envelope automatically when the observed rate drifted — treating the grant of autonomy as a renewable lease against measured reliability, not a permanent award.
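The pooled, retentive character of the gate can be sketched as a function that is re-run whenever new fleet data arrives. The targets (0.02 and 0.01 per 1,000 engine-hours) are the indicative figures quoted above; the tier table, the 60-minute baseline as the fallback grant, the record layout, and the sample numbers are assumptions for illustration, and the stability test the rule also requires is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class FleetRecord:
    """One operator's contribution to the pooled world-fleet record (hypothetical layout)."""
    engine_hours: float
    shutdowns: int


# (diversion minutes, maximum IFSD rate per 1,000 engine-hours), widest tier first.
TIERS = [(330, 0.01), (180, 0.02)]
BASELINE_MINUTES = 60  # assumed fallback when no tier's target is met


def ifsd_rate_per_1000(records):
    """Pooled in-flight shutdown rate per 1,000 engine-hours across all records."""
    hours = sum(r.engine_hours for r in records)
    if hours <= 0:
        raise ValueError("no engine-hours recorded")
    return 1000.0 * sum(r.shutdowns for r in records) / hours


def permitted_minutes(records):
    """Widest tier whose target the pooled rate currently meets."""
    rate = ifsd_rate_per_1000(records)
    for minutes, max_rate in TIERS:
        if rate <= max_rate:
            return minutes
    return BASELINE_MINUTES


fleet = [FleetRecord(400_000, 5), FleetRecord(600_000, 3)]
before = permitted_minutes(fleet)                                  # pooled rate 0.008
after = permitted_minutes(fleet + [FleetRecord(100_000, 15)])      # pooled rate ~0.0209
```

Because the grant is recomputed from the pooled record rather than stored, a bad stretch at one operator narrows the envelope for everyone flying the type. That is the lease-not-award behaviour the paragraph describes.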

This paper keeps the IFSD-rate mechanism subordinate, because the reliability-demonstration accounting is the spine of a companion analysis, not this one. Here it serves a single framing purpose: it shows that the range a twin is permitted to fly is licensed by evidence, not by capability claims. The diversion-airport rule says where the runways must be; the IFSD-rate gate says how confident you must be in the engines before you are allowed to lengthen the gaps between them. The two are complementary. You may not extend the envelope merely because the airframe could, in principle, fly farther; you extend it because the world fleet has demonstrated, in service, that the failure you are buying insurance against is rare and stable.

For autonomous systems, this is the bridge between two governance questions that are often conflated. “Can the agent do this task?” is a capability question. “Should the agent be permitted to run this far without a checkpoint?” is an envelope question, and the honest answer depends on a measured failure rate, not on a demo. An organization that widens an agent’s autonomy because a benchmark looked good is extending the envelope on pedigree. An organization that widens it because the agent’s observed failure rate on real traffic is low and stable — and that keeps monitoring until the rate is stable, not merely until it is low once — is extending the envelope the way ETOPS does.

Section 04

The Point of Safe Return: Anatomy of NZ7571

The doctrine’s sharpest lesson is best read not from a success but from a near-miss that ended well for reasons the system did not earn. On 7 October 2013, a Royal New Zealand Air Force (RNZAF) No. 40 Squadron Boeing 757-2K2, flight callsign NZ7571, departed Christchurch for Pegasus Field on the Ross Ice Shelf, Antarctica, with 130 people aboard — 117 passengers and 13 crew. 12 13 Because the 757 lacked the fuel to return to Christchurch without refuelling at Pegasus, a point of safe return had been pre-computed: the furthest point along the route at which the aircraft still had the fuel to turn around and reach its departure field with required reserves. 14 A point of safe return is the fuel-defined boundary of reversibility. Before it, the designed diversion — go home — remains available. After it, that diversion is foreclosed; the only way out is forward, to the destination or to an alternate near it.
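A point of safe return can be computed before departure from fuel and groundspeeds. The sketch below uses the standard textbook relation (distance to the point equals endurance times outbound groundspeed times homebound groundspeed, divided by their sum) under a simplifying assumption of constant fuel burn. All figures are hypothetical and are not the NZ7571 flight's actual numbers.

```python
def point_of_safe_return_nm(fuel_kg, reserve_kg, burn_kg_per_hr, gs_out_kt, gs_home_kt):
    """Distance from departure beyond which turning back no longer preserves reserves.

    Assumes a constant fuel burn rate and constant groundspeeds on each leg.
    """
    usable_kg = fuel_kg - reserve_kg
    if usable_kg <= 0:
        return 0.0
    endurance_hr = usable_kg / burn_kg_per_hr
    # Time out plus time home must equal the endurance:
    #   d / gs_out + d / gs_home = endurance
    return endurance_hr * gs_out_kt * gs_home_kt / (gs_out_kt + gs_home_kt)


def can_still_turn_back(distance_flown_nm, psr_nm):
    """True while the designed fallback (return to departure) is still available."""
    return distance_flown_nm <= psr_nm


# Hypothetical figures: 6 hours of usable endurance, headwind outbound.
psr = point_of_safe_return_nm(
    fuel_kg=30_000, reserve_kg=6_000, burn_kg_per_hr=4_000,
    gs_out_kt=420, gs_home_kt=480,
)
```

With these inputs the point falls 1,344 nm out: 3.2 hours outbound plus 2.8 hours home uses exactly the six hours available. The value is known before take-off, which is what makes crossing it a decision rather than a discovery.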

The forecasters assured the crew that the weather at Pegasus would improve, and the flight was cleared past the safe-return point on that forecast. Roughly twenty minutes later, observations told a different story: a fog bank had enveloped the runway, producing near-whiteout conditions on the only field now reachable. 12 The crew flew three approaches. On the third, at approximately 110 feet, they acquired the approach lights and runway markings and landed — below the published minima for the approach. There was no damage and no injuries. 12 13 Read narrowly, this is a story of professional airmanship recovering a bad situation, and the subsequent inquiry said as much.

Read at the level of doctrine, the failure was upstream of the cockpit and earlier than the fog. The Transport Accident Investigation Commission inquiry AO-2013-009 found the crew’s in-flight decisions appropriate but identified gaps in the original risk assessment — most pointedly, the absence of 757-suitable alternate approach procedures and thin consideration of 757-suitable Antarctic aerodromes, against an early-season weather-deterioration likelihood that was under-weighted. 12 The alternate set for the 757 was, in ETOPS terms, under-specified. The designed fallback (return to Christchurch) was already foreclosed by the fuel-and-range envelope the moment the aircraft passed its point of safe return, and the remaining alternates were not robust to the very weather the early-season Antarctic operation made likely. The aircraft was committed past its rollback horizon on a forecast, into an alternate set too thin to absorb the observed conditions.

The RNZAF changed its procedures following the inquiry. 16 That is the system correcting the part it could correct: not the crew’s judgement under pressure, which performed, but the dispatch-time discipline that had let the aircraft cross an irreversible line with an inadequate set of safe harbours behind it.

The failure was not the landing. It was committing past the point of safe return on a forecast rather than an observation, with the designed fallback already foreclosed and the alternate set already too thin to catch the day’s actual weather.

The real failure

One distinction must be made explicitly, because Antarctica and whiteout invite a false association. NZ7571 is a successful recovery, not a tragedy, and it must not be confused with the 1979 Mount Erebus disaster — Air New Zealand Flight 901, a DC-10 that struck Mount Erebus and killed all 257 aboard after a navigation-coordinate error in whiteout conditions. 15 The two events share Antarctica and whiteout; they share nothing else. NZ7571 was a military flight that landed safely below minima after a dispatch-discipline gap; Flight 901 was a civilian disaster rooted in a navigation-data error. Conflating them obscures the precise lesson NZ7571 teaches, which is about reversibility and the adequacy of an alternate set, not about navigation accuracy.

Section 05

The Forecast/Observation Gap as the Real Failure Mode

Generalize NZ7571 and a clean failure mode emerges, distinct from any single bad decision. The load-bearing error is commitment past an irreversible point on a predicted world-state that subsequently diverged from the observed world-state, while the designed fallback was already foreclosed and the alternate set was under-specified. Each clause is necessary. Had the fallback not been foreclosed, the divergence would have been survivable by turning back. Had the alternate set been adequate, the divergence would have been survivable by diverting. Had the commitment been made on an observation rather than a forecast, the divergence would not have been a surprise. It was the conjunction — irreversible commitment, on a forecast, with a thin set — that turned a weather change into a below-minima landing.

The forecast-versus-observation gap is the part that connects this paper to the broader integrity discipline of the parent series. In that series’ vocabulary, the object of control is the rate of undetected false assertions: not error as such, but error that escapes detection and reaches a decision-maker without a warning flag. A fallback that “should” be reachable according to the forecast, but is not reachable according to the observation, is a safe harbour you believe you have but do not. It is the diversion-doctrine analogue of Hazardously Misleading Information, the failure mode that aviation’s software-assurance and integrity-monitoring standards exist to suppress. 18 The danger is not that the weather was bad; weather is sometimes bad. The danger is that the system held a belief about its safe harbour that the world had quietly invalidated, and acted on the belief past the point where it could check.

The corrective in aviation is structural and conservative: pre-compute the point of safe return, refuse to cross it without a reachable adequate alternate in the worst-case envelope, and treat a divergence between forecast and observation as a reason to divert early, while reversibility still exists, rather than to press on in the hope the forecast was right. This is the operational form of the same instinct the statistical-learning literature formalizes as selective prediction: when the evidence supporting an assertion is below threshold, the sound move is to abstain — to decline to commit — rather than to assert and hope. 17 20 In navigation it is the missed approach; in dispatch it is the divert; in an agent it is the rollback or the escalation. In all three the discipline is the same: do not let a forecast carry you past the last point at which you could still have checked.

Section 06

Mapping to AI Agent Autonomy

The translation to autonomous agents turns on a single defined quantity: the rollback horizon. The rollback horizon is the last step in an operation before an irreversible side effect — a sent message, a money movement, a deleted resource, an external API call that commits state downstream. Before the rollback horizon, the designed fallback (undo, retry, abandon) is available. After it, that fallback is foreclosed, exactly as Christchurch was foreclosed once NZ7571 passed its point of safe return. The rollback horizon is the agent’s point of safe return, and the central discipline is to compute it explicitly, per operation, rather than discover it after the irreversible call has gone out.

The two-word test from Section 2 maps cleanly. A fallback must be adequate — it must actually catch the worst-case failure state the agent will be in — and reachable — it must remain executable from where the agent will be when it fails, inside the worst-case step budget rather than the happy-path one. Enumerate the under-specified-alternate failure patterns, and they are the agent-world versions of an airport that cannot receive the aircraft:

A rollback that cannot run because the state it would restore has already been committed is an alternate foreclosed by the very progression that needed it. A human handoff with no human on call is an airport with no firefighting cover — nominally an alternate, unable to receive the arrival. A safe-default that itself depends on the failed subsystem is an alternate fogged in by the same weather system as the destination: a correlated failure in the alternate set. A circuit-breaker that trips only after the irreversible external call has already gone out is a diversion-time circle drawn at all-engines speed — it looked reachable, but in the case that mattered the aircraft was already past it. Each of these is a fallback that exists on the happy-path diagram and evaporates in the degraded arrival state.

These patterns share a diagnostic signature: each is a fallback that is verified at the wrong time or in the wrong state. The rollback was confirmed to exist at design time but not re-confirmed to be executable at the rollback horizon; the on-call human was staffed on the org chart but not actually reachable in the failure window; the safe-default was tested in isolation but never tested under the joint failure that takes out its dependency. This is the agent-world echo of adequacy-at-expected-time-of-use from Section 2: a fallback’s adequacy is a property of the moment and state in which it would actually be invoked, not of the moment it was specified. A fallback set audited only at design time, against the happy path, is the diversion-doctrine error of evaluating alternates at dispatch-time conditions — the very gap the inquiry into NZ7571 identified. 12 The remedy is to make the audit dynamic: re-evaluate, at or before each rollback horizon, whether each member of the fallback set is still adequate and still reachable from the state the agent is actually in.
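A dynamic audit of this kind can be sketched as a gate placed immediately before each irreversible step. Every name here (`Step`, `Fallback`, `run`, the state keys) is hypothetical and stands in for whatever an actual agent framework provides. The point is only the shape: the fallback set is re-checked against freshly observed state at the horizon, not against the state assumed at design time.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, object]


@dataclass
class Step:
    name: str
    irreversible: bool  # True if the step commits a side effect that cannot be undone


@dataclass
class Fallback:
    name: str
    is_adequate: Callable[[State], bool]   # would it catch the degraded state?
    is_reachable: Callable[[State], bool]  # can it still run from the current state?


def first_irreversible(plan: List[Step]) -> Optional[int]:
    """Index of the first irreversible step; the rollback horizon sits just before it."""
    for i, step in enumerate(plan):
        if step.irreversible:
            return i
    return None


def viable_fallbacks(fallbacks: List[Fallback], state: State) -> List[str]:
    """Fallbacks that pass both halves of the two-word test in the given state."""
    return [f.name for f in fallbacks if f.is_adequate(state) and f.is_reachable(state)]


def run(plan: List[Step], fallbacks: List[Fallback],
        observe: Callable[[], State],
        execute: Callable[[Step], None]) -> Tuple[str, Optional[str]]:
    """Execute the plan, diverting before any irreversible step that lacks a viable fallback."""
    for step in plan:
        # Re-audit on observed state at the horizon, not on design-time assumptions.
        if step.irreversible and not viable_fallbacks(fallbacks, observe()):
            return ("diverted", step.name)
        execute(step)
    return ("completed", None)
```

A human handoff, for instance, would be modelled as a `Fallback` whose `is_adequate` checks that someone is actually on call at the time of the check. If that returns false when the agent reaches a payment step, `run` returns `("diverted", ...)` with only the reversible steps executed.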

There is also direct evidence that an agent’s safe harbours degrade precisely over the long horizons where extended autonomy lives. Work on long-context agents shows that refusal and other safety mechanisms become unstable as the operation extends — the very fallback you were relying on weakens the farther you fly from where you last verified it. 21 This is the agent analogue of a fallback foreclosed past the point of safe return: the longer the autonomous run, the less you can assume the safe-default you specified at dispatch is still the safe-default in force at the rollback horizon. Production-engineering practice names the same concern from the other side, with retries, fallbacks, and circuit-breakers as first-class constructs for long-running LLM operations — the machinery for keeping a reachable adequate alternate in the envelope rather than assuming one. 19

The selective-prediction framing makes the divert decision precise rather than merely prudent. Classical work on selective classification poses exactly this problem: a predictor may, for each input, either commit to an output or abstain, and the goal is to minimize the rate of wrong commitments among the inputs it answers, subject to answering often enough to be useful. 17 20 Translate the predictor into an agent at its rollback horizon and the structure is identical: at each irreversible step the agent either commits the side effect or diverts — rolls back, escalates, abstains — and the object of control is the rate of committed-and-wrong actions among those it commits. The diversion airport is the reject option made operational; the point of safe return is the threshold at which the reject option must still be available. An agent with no abstention mechanism at its rollback horizon is a classifier forced to answer every query — which is exactly the failure mode the selective-prediction literature exists to prevent.

The prescription is therefore three commitments, parallel to the dispatcher’s. First, pre-compute the point of safe return per operation: identify the rollback horizon before the operation begins, not as it is crossed. Second, keep a reachable adequate fallback in the worst-case envelope at every step — verify that the fallback catches the degraded state and remains executable from where the agent will be when it fails, and check the fallback set for correlated foreclosure. Third, treat forecast-versus-observation divergence as a first-class divert trigger: when the world the agent observes diverges from the world its plan predicted, divert early — roll back or escalate — rather than press on toward the irreversible step on the strength of the prediction. The bounded-risk version of this trigger is exactly the conformal-abstention construction: calibrate a threshold such that the rate of committed-and-wrong actions stays below a chosen tolerance, and abstain — divert — whenever the operation’s confidence falls under it. 22
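The threshold idea in the third commitment can be illustrated with a plain empirical version. To be clear about what this is: the sketch picks the lowest confidence threshold whose error rate among committed calibration cases stays within the tolerance. It is not the conformal construction itself, which adds a finite-sample correction to turn the empirical rate into a guarantee. The calibration data, function names, and tolerance are hypothetical.

```python
def calibrate_threshold(calibration, tolerance):
    """Lowest confidence threshold whose committed-and-wrong rate is within tolerance.

    calibration: list of (confidence, was_correct) pairs from past operations.
    Returns None if no threshold meets the tolerance, meaning: never commit.
    """
    for threshold in sorted({conf for conf, _ in calibration}):
        committed = [ok for conf, ok in calibration if conf >= threshold]
        wrong = sum(1 for ok in committed if not ok)
        if wrong / len(committed) <= tolerance:
            return threshold
    return None


def decide(confidence, threshold):
    """Commit the irreversible step only at or above the calibrated threshold."""
    if threshold is not None and confidence >= threshold:
        return "commit"
    return "divert"


# Hypothetical calibration record: (agent confidence, whether the action was right).
calibration = [(0.95, True), (0.9, True), (0.8, True),
               (0.7, False), (0.6, True), (0.5, False)]
threshold = calibrate_threshold(calibration, tolerance=0.25)
```

On this data, committing everything gives two errors in six, which exceeds the tolerance. Raising the threshold to 0.6 leaves one error in five, which is within it, so 0.6 is returned. Empirical selective risk is not guaranteed to fall monotonically as the threshold rises, so a more cautious variant would require every higher threshold to pass as well.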

Section 07

The Cross-Discipline Cross-Reference: Banking’s Reachable-Reserve Analogue

The diversion-airport rule has a precise counterpart in financial regulation, which lets us triangulate the doctrine from a second discipline without re-deriving the parent series’ banking leg. The banking analogue of an adequate, reachable alternate is an adequate, reachable reserve. A bank may not book a risk it cannot price, and the capital and liquidity it holds against the risks it does book must be not merely nominal but callable in the stress state — reachable when the loss actually arrives, not just present on the balance sheet on a calm day. A reserve that exists on paper but cannot be mobilized in the stress scenario is the financial version of an airport that cannot receive the aircraft: adequate in name, foreclosed in fact.

The structural parallel runs deeper than the reserve metaphor. Banking, like aviation, refuses to let a system commit to a position whose downside it cannot bound — the loan it will not write because the counterparty falls outside its underwriting criteria is the financial counterpart of the route a dispatcher will not file because no adequate alternate covers a leg of it. In both, the institution declines the operation rather than booking a risk it cannot catch, and in both that declination is a disciplined act, not a failure of nerve. The agent at its rollback horizon faces the same choice in miniature: take the irreversible action whose worst case it cannot recover from, or decline it. The cross-discipline lesson is that mature, heavily-regulated fields converged independently on the same answer — do not enter a state from which there is no adequate, reachable exit — which is strong evidence that the answer is forced by the structure of irreversible commitment under uncertainty rather than by any one field’s conventions.

The governing instrument is model-risk management. SR 11-7, the Federal Reserve and OCC supervisory guidance issued in 2011, frames any consequential quantitative model under independent “effective challenge” and explicitly covers vendor and third-party models — governance is not dischargeable by procurement. 23 Its 2026 successor, SR 26-2, revises that guidance on a risk-based, materiality footing — and, tellingly, explicitly scopes generative and agentic AI out, directing institutions to apply existing model-risk practice to those systems rather than inventing a parallel regime. 24 That scoping decision makes the cross-discipline transfer this paper performs more load-bearing, not less: the regulators are signaling that the discipline already exists and should be carried over, which is precisely the move from the diversion-airport rule to the agent rollback horizon.

The reliability-gate analogue is Basel’s backtesting regime. 25 Just as ETOPS earns a higher diversion-time tier by demonstrating a low, stable world-fleet IFSD rate, a bank earns the right to use an internal risk model by demonstrating, through counted exceptions against a traffic-light scale, that the model’s stated bound holds in practice. Both are earn-the-bound-by-measured-exceptions disciplines: you do not assert the bound, you operate under measurement and accept escalating consequences when the measured exception rate drifts. The diversion circle and the capital reserve are the same kind of object — a pre-committed safe harbour sized to a worst case — and both are licensed by evidence rather than by claim.
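The counted-exceptions mechanism is simple enough to show directly. The zone boundaries below are the ones commonly stated for the Basel backtesting framework's 250-day window at a 99% confidence level (green up to four exceptions, yellow five to nine, red ten or more). They are included as an illustration to be checked against the current supervisory text, and the loss figures are made up.

```python
def count_exceptions(losses, var_forecasts):
    """Number of days on which the realized loss exceeded that day's VaR forecast."""
    return sum(1 for loss, var in zip(losses, var_forecasts) if loss > var)


def traffic_light_zone(exceptions):
    """Zone for a 250-day, 99% VaR backtest (boundaries as commonly stated)."""
    if exceptions <= 4:
        return "green"
    if exceptions <= 9:
        return "yellow"
    return "red"


# Hypothetical 250-day window: a flat VaR forecast and six days that breached it.
var_forecasts = [100.0] * 250
losses = [50.0] * 244 + [120.0] * 6
exceptions = count_exceptions(losses, var_forecasts)
zone = traffic_light_zone(exceptions)
```

The structure matches the IFSD gate: the bound is stated in advance, the breaches are counted from realized outcomes, and the consequence escalates with the count rather than with anyone's opinion of the model.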

The forecast-versus-observation gap has a banking face as well, and it sharpens the NZ7571 lesson. A capital reserve sized to a modelled stress scenario that the realized stress then exceeds is the financial analogue of a diversion airport that the forecast said would be open and the observation found fogged in. Banking’s response is not to trust the model’s scenario but to backtest it against realized outcomes and to escalate — through the traffic-light multiplier — when realized exceptions outrun the model’s prediction. 25 The discipline is the same one Section 5 drew from aviation: do not let a forecast of adequacy substitute for a measurement of it, and treat the divergence between the two as the trigger to hold more reserve, divert earlier, or commit less. The point-of-safe-return logic is, in this light, not an aviation idiosyncrasy but a recurring shape that any discipline managing irreversible commitment under uncertainty independently rediscovers.

Section 08

The Cost-Savings Inversion: Doctrine Enables Range

The most counter-intuitive payoff of the diversion-airport doctrine is that it is an enabler of wider operation, not merely a restriction on it. The economic case for twins is large and structural: a twin burns substantially less fuel than a three- or four-engine aircraft on the same mission — the International Council on Clean Transportation found four-engine transpacific aircraft roughly twenty-four percent less fuel-efficient per passenger than twins — and because engine-related expense is a dominant share of maintenance cost, fewer engines compounds the saving. 26 28 But those economics are only realizable if the twin can fly the direct over-water route. Before ETOPS, the 60-minute rule forced twins into fuel-wasting doglegs that hugged the coast to stay within an hour of a runway, surrendering much of the airframe’s efficiency advantage to the geometry of the constraint. 2 3 ETOPS-180 unlocking roughly ninety-five percent of the Earth’s surface is exactly the statement that the diversion-adequacy framework let the twins fly straight. 5 28

The civilian exemplar makes the inversion concrete — and it must be kept attributed correctly. The cautionary aircraft of Section 4 was military; the cost-savings exemplar is civilian Air New Zealand. On 1 December 2015, Air New Zealand operated the aviation industry’s first-ever scheduled ETOPS-330 service — Auckland to Buenos Aires, a Boeing 777-200ER powered by Rolls-Royce Trent 800 engines, a sector of roughly twelve hours — having flown ETOPS-240 on the type from October 2014 and received 330-minute approval in November 2015. 6 27 The route was viable precisely because the diversion-adequacy framework let the twin fly direct across the Southern Ocean, where adequate alternates are sparse, instead of doglegging to stay within a shorter diversion envelope. The discipline of always-a-runway is what unlocked the range; it did not cap it.

Adequate-alternate doctrine did not make twins fly less. It made them flyable farther, accountably — the fallback rule was the runway that let the twin fly the direct route.

The payoff

The mechanism of the saving is worth making explicit, because the same mechanism drives the AI corollary. ETOPS does not save fuel by making engines more efficient; it saves fuel by removing a constraint on the trajectory. The dogleg is pure deadweight — miles flown not toward the destination but toward staying within reach of a runway — and the diversion-adequacy framework, by proving that an adequate alternate is always within the envelope along the direct path, deletes those miles. 3 5 The discipline pays for itself not by changing the airframe but by certifying the route. This is why the inversion is not a paradox: a stricter safe-harbour requirement, rigorously discharged, buys a less constrained trajectory, because the constraint that the doctrine removes — the conservative dogleg flown out of uncertainty about safe harbours — is far more expensive than the discipline of proving them.

The AI corollary follows without strain. Rigorous fallback discipline is the precondition for granting an agent wider autonomy, not a tax on it. You let the agent go farther — longer horizons, more consequential actions, less human gating — precisely because you can prove a safe harbour is always reachable in the worst-case envelope. The organization that refuses to specify rollback horizons and adequate fallback sets is the organization stuck flying doglegs: it either over-constrains its agents into uselessness or, worse, lets them fly direct without the safe harbours that would make direct flight responsible. The diversion-adequacy framework is what converts a capable airframe into a profitable route, and fallback rigor is what converts a capable agent into autonomy you can actually grant.

Section 09

The Posture: Pre-Compute the Point of Safe Return

The posture that falls out of the diversion-airport doctrine reduces to three commitments, each with an aviation precedent and an honest limit.

First, every extended autonomous operation pre-computes a point of safe return and refuses to pass it without a reachable adequate fallback in the worst-case envelope. 1 14 This is the dispatcher’s discipline made into an architectural primitive: the rollback horizon is identified before the operation runs, and crossing it requires an affirmative check that a safe harbour is in the OEI-equivalent envelope, not a default assumption that one is.
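As a minimal sketch of that affirmative check, assuming a hypothetical `Fallback` record and a time-based envelope (both illustrative, neither drawn from the sources cited here), the gate at the rollback horizon might look like this:

```python
from dataclasses import dataclass


class NoSafeHarbour(RuntimeError):
    """Raised when no adequate fallback lies inside the worst-case envelope."""


@dataclass(frozen=True)
class Fallback:
    name: str
    worst_case_minutes: float  # time to reach it in the degraded state
    adequate: bool             # adequacy as measured now, not as forecast


def require_safe_harbour(fallbacks, envelope_minutes):
    """Return the fallbacks that are adequate and reachable, or raise.

    The check is affirmative: an empty or unverified set blocks the
    crossing instead of being treated as "probably fine".
    """
    reachable = [
        f for f in fallbacks
        if f.adequate and f.worst_case_minutes <= envelope_minutes
    ]
    if not reachable:
        raise NoSafeHarbour("do not pass the point of safe return")
    return reachable
```

The operation calls `require_safe_harbour` immediately before the irreversible step; an exception there is the agent's equivalent of a diversion, not an error to be retried past.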

Second, the fallback set is specified against the worst-case arrival state, not the happy path, and checked for correlated foreclosure. 4 The alternate that matters is the one that can receive the degraded arrival, and a set whose members all fail in the same condition is one member deep regardless of its count. In agent terms: verify that each fallback catches the failure state, that no fallback depends on the subsystem most likely to have failed, and that a human in the loop is actually reachable when the loop needs one.
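One way to make correlated foreclosure checkable is to record what each fallback depends on and ask how many fallbacks survive each failure scenario. The sketch below is an illustration under assumed names (`effective_depth`, the dependency labels); it measures worst-case survivors, which is one possible reading of "depth" and not a metric taken from the sources.

```python
def effective_depth(fallback_deps, failure_scenarios):
    """Smallest number of fallbacks left standing in any one scenario.

    fallback_deps: dict mapping fallback name -> set of subsystems it needs.
    failure_scenarios: iterable of sets of subsystems assumed to have failed.
    With no scenarios supplied, every fallback counts as surviving.
    """
    return min(
        (
            sum(1 for deps in fallback_deps.values() if not (deps & failed))
            for failed in failure_scenarios
        ),
        default=len(fallback_deps),
    )


deps = {
    "snapshot_restore": {"storage"},
    "replica_failover": {"storage", "network"},
    "human_escalation": {"pager"},
}
scenarios = [{"storage"}, {"pager"}, {"network"}]
depth = effective_depth(deps, scenarios)  # storage failure leaves one fallback
```

A set of three fallbacks that reports a depth of one, or zero, is the agent-side analogue of alternates that all close in the same weather system.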

Third, forecast-versus-observation divergence is a first-class divert trigger, not noise to be smoothed over. 17 20 When the world the agent observes diverges from the world its plan predicted, the sound move is to divert early — roll back or escalate while reversibility still exists — rather than press on toward the irreversible step on the strength of the prediction. The bounded-risk implementation is conformal abstention: a calibrated threshold that holds the rate of committed-and-wrong actions below a chosen tolerance and diverts whenever confidence falls under it. 22
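A calibrated abstention threshold of this kind can be sketched as follows. This is a minimal illustration in the style of conformal risk control, assuming a held-out calibration set of (confidence, was-correct) pairs that is exchangeable with future cases; the function names and the choice of finite-sample correction are assumptions of the sketch, not the method of the cited source.

```python
def abstention_threshold(confidences, correct, alpha):
    """Lowest confidence threshold whose corrected committed-and-wrong
    rate on the calibration set is at most alpha.

    Returns None when no calibration threshold meets the bound, meaning
    the agent should divert on every case.
    """
    n = len(confidences)
    for lam in sorted(set(confidences)):
        wrong_commits = sum(
            1 for c, ok in zip(confidences, correct) if c >= lam and not ok
        )
        # Finite-sample correction for a loss bounded by 1:
        # (n * empirical_rate + 1) / (n + 1) <= alpha
        if (wrong_commits + 1) / (n + 1) <= alpha:
            return lam
    return None


def should_commit(confidence, threshold):
    """Commit only at or above the calibrated threshold; otherwise divert."""
    return threshold is not None and confidence >= threshold
```

Lowering `alpha` raises the threshold and with it the share of cases that divert, so the tolerance is the knob on which coverage is traded against committed-and-wrong actions.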

The honest limits deserve the same prominence as the commitments, and NZ7571 supplies most of them. The alternate set can be mis-specified — that was the actual failure at Pegasus, not the weather and not the crew. 12 No discipline that pre-computes a safe harbour is better than the adequacy judgement that populated the set, and that judgement is made under uncertainty about conditions at the expected time of use. “Reachable” depends on a worst-case-envelope estimate — the OEI diversion-time circle, or its agent equivalent — that can itself be wrong; an envelope computed too optimistically reintroduces the all-engines fallacy by the back door. And divert-on-uncertainty has a coverage cost: an agent that escalates or rolls back whenever its confidence dips will accomplish less than one that presses on, exactly as a dispatcher who demands a richer alternate set flies fewer marginal routes. The trade between coverage and safety is real, and it is a governance decision rather than a technical one. 20

There is a deeper limit that the NZ7571 inquiry illuminates and that the autonomous-systems field should sit with honestly: the doctrine controls the risks it has named, and the residual risk migrates to the naming. ETOPS makes the diversion-airport adequacy explicit, measurable, and auditable — and in doing so it pushes the failure surface upstream, to the quality of the risk assessment that populated the alternate set in the first place. 12 The crew at Pegasus performed; the dispatch process under-specified. An autonomy regime that pre-computes rollback horizons and audits fallback sets will, similarly, push its residual failures upstream to the analysts who decide what counts as the worst case and which fallbacks belong in the set. This is not a defect of the doctrine but its honest cost: it converts diffuse, in-the-moment failure into concentrated, nameable, fixable failure in the design layer — which is exactly the relocation that makes the residual auditable, but which also means the design layer must be held to the standard the doctrine implies. The discipline is only as good as the worst case it dares to plan against.

None of these limits is a reason to abandon the doctrine; they are the doctrine’s own statement of its scope, which is what makes it trustworthy. The closing line is the one ETOPS earned over four decades of practice and one the autonomous-systems field can adopt without waiting for its own four decades: the diversion rule did not make twins fly less — it made them flyable farther, accountably. Fallback discipline is not the cap on autonomy. It is the runway that lets you grant it.


Range Is Defined Relative to a Safe Harbour

Context

The Finding