kellerai.blog

Latest

Every KellerAI paper, newest first.

40 papers • 38 releases

20 Jun 2026
Engineering Discipline & Verification
Autonomy Is a Range You Earn

Aviation stopped asking whether a twin-engine jet could cross an ocean and started asking how far it had earned the right to fly. AI agents need the same envelope.

Agent autonomy • Operational envelopes • Risk-graded deployment
Read in-depth →
Engineering Discipline & Verification
Always a Runway

The ETOPS rule is not "fly farther." It is "never fly past a reachable safe harbour" — and the same rule should govern every autonomous agent.

Agent autonomy & rollback horizons • ETOPS diversion-airport doctrine • Fallback reachability & safe-harbour design
Read in-depth →
Engineering Discipline & Verification
Reliability You Can Bank

Why a wider autonomy budget is something you earn from failure-rate data — the ETOPS lesson for AI agents.

AI agent autonomy & reliability accounting • ETOPS in-flight-shutdown-rate and Basel backtesting • Tail-aware governance and the abstention discipline
Read in-depth →
19 Jun 2026
18 Jun 2026
17 Jun 2026
Regulation & Compliance
The Supervisor's Mirror

The Federal Reserve published the model-risk inventory schema. A bank can just use it.

Model Risk Management • AI Governance • Regulatory Read-Across
Read in-depth →
16 Jun 2026
Engineering Discipline & Verification
Aviation and Banking Already Solved Hallucination

The AI field has been asking the wrong question. Aviation and banking each solved the underlying engineering problem decades ago — under regulatory compulsion, at enormous cost, and with a precision the AI industry has yet to borrow.

AI Governance • Verification • Hallucination
Read in-depth →
11 Jun 2026
Regulation & Compliance
The Audit Trail Nobody Acts On

Logging what your agentic system did is necessary. Wiring that log to a gate that can block a bad output is governance.

AI Governance • Observability • Policy-as-Code • Agentic AI
Read in-depth →
10 Jun 2026
AI Economics
The Fable 5 Token Economy

Token spend and usage limits are not two problems. They are one budget, governed by one rule, and a frontier-tier model makes every term in it bigger.

Token economy & usage limits • Agentic cost governance • Frontier model fan-out economics
Read in-depth →
9 Jun 2026
Security & Supply Chain
The Application You Can Audit

What we send is what every party evaluates: the same artifact for the human reviewer and the machine that reads it first.

Prompt Injection • Hiring • Integrity • Provenance
Read in-depth →
8 Jun 2026
Model Governance & Upgrades
When the Model Changes Mid-Request

Fable 5 can hand your request to Opus 4.8 — and tells you so. The question is what your team does with a model change that happens inside a single request.

Safeguard fallback governance • AI model risk management • Frontier model disclosure
Read in-depth →
7 Jun 2026
6 Jun 2026
5 Jun 2026
AI Economics
When the Rate Card Has Tiers

Anthropic priced the frontier at exactly double the model you already run. Your bill is now a function of which tier you bought, what your workload triggers, and how many turns your agents burn.

Tier-stratified pricing • AI cost governance • Frontier model routing economics
Read in-depth →
1 Jun 2026
30 May 2026
29 May 2026
Regulation & Compliance
The Audit Field Was Always Empty

On 2 August 2026 the EU AI Act stops being a paper and starts assigning penalties — to the same audit-trail gaps your engineering already has.

EU AI Act • Compliance & Governance • Audit Trail Design
Read in-depth →
28 May 2026
27 May 2026
AI Economics
The Token You Didn't Count

A model upgrade can keep the same rate card while delivering up to 35% more billing, with no pre-migration signal.

AI Pricing & Economics • Cost Observability • Tokenizer Impact
Read in-depth →
26 May 2026
25 May 2026
24 May 2026
23 May 2026
22 May 2026
Model Governance & Upgrades
When a "Modest" Model Release Isn't

A vendor can call a release modest and be right about the model — and you can still be wrong to treat it as a drop-in.

Model Upgrades • Blast Radius Governance • Validation & Benchmarking
Read in-depth →
21 May 2026
20 May 2026
19 May 2026
18 May 2026
17 May 2026
Observability & Drift
Observability Theater

Empty telemetry fields train operators to trust signals that carry no information.

Observability & Telemetry • Compliance Audit Trails • Data Quality
Read in-depth →
16 May 2026
15 May 2026
Code Quality & Architecture
Fixpoints, Not Fixes

Resource-lifecycle bugs don't have a fix — they have a fixpoint.

Contract Specification • Resource Lifecycle • Async Invariants
Read in-depth →
14 May 2026
13 May 2026
Earned Autonomy & Agents
Access Engineering

The thesis that the highest-leverage AI work is upstream of the model.

Authorization • Agent Governance • Access Control • Workspace Scoping
Read in-depth →
12 May 2026
Prompt & Artifact Engineering
Super RAG v2.1

Retrieval-augmented generation beyond the naive pipeline.

Query Decomposition • Structural Ranking • Citation Grounding • Context Budget Management
Read in-depth →
11 May 2026
Earned Autonomy & Agents
Multi-Agent Patterns

Coordination, trust, and failure isolation in agent networks.

Agent Orchestration • Concurrency Control • Failure Isolation • Supervisor Patterns
Read in-depth →
10 May 2026
9 May 2026
Code Quality & Architecture
The Thinking Moat

Why machine-enforced reasoning chains are a durable competitive advantage.

Competitive Moat • Runtime Reasoning Enforcement • Organizational Capital
Read in-depth →
8 May 2026
Engineering Discipline & Verification
The Audit You Can Audit

Why most codebase health checks change nothing — and what it takes to trust one.

Audit Design & Trust • Codebase Health • Observable Process
Read in-depth →
7 May 2026
6 May 2026
Engineering Discipline & Verification
Why Your AI Skills Need Evidence

An AI skill that demos well can still be quietly useless — because a stochastic model run once proves nothing.

AI Skill Authoring • Measurement & Evidence • Quality Gates
Read in-depth →