Latest

Every KellerAI paper, newest first.

40 papers • 38 releases

20 Jun 2026

Engineering Discipline & Verification

Autonomy Is a Range You Earn

Aviation stopped asking whether a twin-engine jet could cross an ocean and started asking how far it had earned the right to fly. AI agents need the same envelope.

Agent autonomy • Operational envelopes • Risk-graded deployment

Read in-depth →

Engineering Discipline & Verification

Always a Runway

The ETOPS rule is not "fly farther." It is "never fly past a reachable safe harbour" — and the same rule should govern every autonomous agent.

Agent autonomy & rollback horizons • ETOPS diversion-airport doctrine • Fallback reachability & safe-harbour design

Read in-depth →

Engineering Discipline & Verification

Reliability You Can Bank

Why a wider autonomy budget is something you earn from failure-rate data — the ETOPS lesson for AI agents.

AI agent autonomy & reliability accounting • ETOPS in-flight-shutdown-rate and Basel backtesting • Tail-aware governance and the abstention discipline

Read in-depth →

19 Jun 2026

Regulation & Compliance

The LLM-Agent Assurance Standard

Assurance for an AI agent must attach to what it does, not to the model it runs.

AI Assurance • Agent Governance • Model Risk Management • Regulatory Synthesis

Read in-depth →

18 Jun 2026

Earned Autonomy & Agents

The Unsupervised Allocator

Agent-to-agent capital allocation has arrived. The supervision layer it needs has not.

AI Governance • Explainability • Model Risk • Agentic Systems

Read in-depth →

17 Jun 2026

Regulation & Compliance

The Supervisor's Mirror

The Federal Reserve published the model-risk inventory schema. A bank can just use it.

Model Risk Management • AI Governance • Regulatory Read-Across

Read in-depth →

16 Jun 2026

Engineering Discipline & Verification

Aviation and Banking Already Solved Hallucination

The AI field has been asking the wrong question. Aviation and banking each solved the underlying engineering problem decades ago — under regulatory compulsion, at enormous cost, and with a precision the AI industry has yet to borrow.

AI Governance • Verification • Hallucination

Read in-depth →

11 Jun 2026

Regulation & Compliance

The Audit Trail Nobody Acts On

Logging what your agentic system did is necessary. Wiring that log to a gate that can block a bad output is governance.

AI Governance • Observability • Policy-as-Code • Agentic AI

Read in-depth →

10 Jun 2026

AI Economics

The Fable 5 Token Economy

Token spend and usage limits are not two problems. They are one budget, governed by one rule, and a frontier-tier model makes every term in it bigger.

Token economy & usage limits • Agentic cost governance • Frontier model fan-out economics

Read in-depth →

9 Jun 2026

Security & Supply Chain

The Application You Can Audit

What we send is what every party evaluates: the same artifact for the human reviewer and the machine that reads it first.

Prompt Injection • Hiring • Integrity • Provenance

Read in-depth →

8 Jun 2026

Model Governance & Upgrades

When the Model Changes Mid-Request

Fable 5 can hand your request to Opus 4.8 — and tells you so. The question is what your team does with a model change that happens inside a single request.

Safeguard fallback governance • AI model risk management • Frontier model disclosure

Read in-depth →

7 Jun 2026

Model Governance & Upgrades

When Access Is the Safeguard

Who gets the unrestricted model?

Tiered model access • AI access governance • Frontier model safeguards

Read in-depth →

6 Jun 2026

Regulation & Compliance

When the Vendor Grades Itself: The Safety Number You Cannot Check

Anthropic attached a safety number to its biggest launch. You cannot compute it, check it, or appeal it — and your workload's number is the one that matters.

Safeguard transparency • AI vendor governance • Post-deployment monitoring

Read in-depth →

5 Jun 2026

AI Economics

When the Rate Card Has Tiers

Anthropic priced the frontier at exactly double the model you already run. Your bill is now a function of which tier you bought, what your workload triggers, and how many turns your agents burn.

Tier-stratified pricing • AI cost governance • Frontier model routing economics

Read in-depth →

1 Jun 2026

Security & Supply Chain

The MCP Supply Chain You Forgot to Govern

Connecting an agent to a public MCP server means inheriting a by-design execution surface the protocol's maintainer declined to patch.

MCP • Supply Chain Security • Governance

Read in-depth →

30 May 2026

Engineering Discipline & Verification

The Eval That Doesn't Follow the Model to Production

Your eval is a measurement. Frontier models can recognize when they are being measured — and adjust their behavior accordingly.

Model Evaluation • AI Safety & Verification • Governance

Read in-depth →

29 May 2026

Regulation & Compliance

The Audit Field Was Always Empty

On 2 August 2026 the EU AI Act stops being a paper and starts assigning penalties — to the same audit-trail gaps your engineering already has.

EU AI Act • Compliance & Governance • Audit Trail Design

Read in-depth →

28 May 2026

Regulation & Compliance

The Human Override That Isn't

A wave of state laws requires a human to decide alongside clinical AI — but not to record that the human did.

Clinical AI • Human Oversight • Audit Trail

Read in-depth →

27 May 2026

AI Economics

The Token You Didn't Count

A model upgrade can keep the same rate card and still raise your bill by up to 35%, with no pre-migration signal.

AI Pricing & Economics • Cost Observability • Tokenizer Impact

Read in-depth →

26 May 2026

Observability & Drift

The Drift You Cannot See Until It Costs You

The model version did not change. The behavior did.

Model Observability • Behavioral Drift • Production Governance

Read in-depth →

25 May 2026

Security & Supply Chain

The Protocol Stack Nobody Audited

Four agent protocols, four threat models, and the audit gap in the seams between them.

Agent Protocols • Cross-Protocol Audit • Governance

Read in-depth →

24 May 2026

Model Governance & Upgrades

Capability Convergence and the Vendor Dependency Trap

When quality stops separating vendors, switching cost moves to where you stopped looking.

Vendor Evaluation • Switching Cost • Procurement Governance

Read in-depth →

23 May 2026

Model Governance & Upgrades

What Changes When the Model Changes

A model upgrade is a controlled change, not a drop-in — and the vendor's benchmark is not your validation.

Model Upgrades • Validation & Governance • Benchmark Verification

Read in-depth →

22 May 2026

Model Governance & Upgrades

When a "Modest" Model Release Isn't

A vendor can call a release modest and be right about the model — and you can still be wrong to treat it as a drop-in.

Model Upgrades • Blast Radius Governance • Validation & Benchmarking

Read in-depth →

21 May 2026

Code Quality & Architecture

The Bill Always Comes: Why "Enterprise-Grade" AI Code Often Isn't

When engineers apply standard software practices to AI problems without a firm grasp of both, the architectural mistakes pile up in the blind spot of their productivity.

Code Quality & AI Architecture • Technical Debt in AI Systems • Production Engineering Discipline

Read in-depth →

20 May 2026

Engineering Discipline & Verification

Trust but Verify: The Aviation Standard for Engineering with AI

Aviation didn't get safer by trusting pilots more — it built verification structures around them. AI-assisted engineering needs the same discipline.

AI-Assisted Engineering • Verification Discipline • Production Safety

Read in-depth →

19 May 2026

Engineering Discipline & Verification

Citations or Guesses: The Five-Pass Rule and the Standard Behind It

One charitable LLM read of a document is a stylistic match against a corpus that does not contain the system in front of it — and KellerAI rejects that posture at the boundary.

Adversarial Review • Specification Governance • Citations & Decision Traces

Read in-depth →

18 May 2026

Code Quality & Architecture

The Robustness Illusion

"does not crash" is not "works correctly."

Error Handling • Production Reliability • Code Quality

Read in-depth →

17 May 2026

Observability & Drift

Observability Theater

Empty telemetry fields train operators to trust signals that carry no information.

Observability & Telemetry • Compliance Audit Trails • Data Quality

Read in-depth →