27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.

Start here

If you're thinking about writing while building

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

If you're considering the founding cohort

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

If you want the architectural spine

The Decomposition Discipline I Am Trying to Codify Inside DOS

Subscribe · Tuesday essay

Around 3,800 builders read this weekly.

Nov 14, 2025

The Week the Harness Ate the Model

On Tuesday Nov 11, OpenAI shipped GPT-5.1 to ChatGPT. Two days later, the API counterpart landed with apply_patch and shell tools, co-developed with Cursor, Cognition, Augment, Factory, and Warp. The same week, the MCP 2025-11-25 spec dropped its Tasks primitive. Anthropic stayed silent. The model is no longer the product the labs are competing on — the harness is, and they're admitting it.

If you only watched the consumer surface this week, you saw OpenAI ship a friendlier ChatGPT. If you watched the developer surface, you saw something more interesting: the labs have started competing on the harness, not the model.

On Tuesday November 11, OpenAI rolled out GPT-5.1 Instant and GPT-5.1 Thinking to ChatGPT — repositioning GPT-5 as "warmer, more conversational" with adaptive reasoning that adjusts compute based on question complexity (OpenAI announcement). On Thursday November 13, the API counterpart landed: GPT-5.1 for developers with an apply_patch tool, a shell tool, extended prompt caching, and explicit co-development credits to Cursor, Cognition, Augment, Factory, and Warp (OpenAI announcement, Simon Willison's blogmark).

Read those two announcements next to each other and the structural admission is impossible to miss: ChatGPT and the API are no longer one launch event. The API ships two days later with first-class agent primitives — apply_patch, shell — that were co-designed with the coding-agent ecosystem. OpenAI is publicly naming Cursor and Warp and friends as collaborators in the developer announcement.

That is a structural admission that the agent harness layer, not the chat product, is the real surface area.

The week's signal in one sentence

The model is no longer the product. The harness is the product. The labs that ship the best primitives for the harness layer will own developer mindshare. The labs that don't will lose it on the timescale of weeks, not quarters.

The other three things that mattered

MCP 2025-11-25 spec — Release Candidate published Tue Nov 11. Adds an experimental Tasks primitive (call-now, fetch-later async pattern), SDK tiering, and lets MCP servers run their own agentic loops on the client's tokens (MCP blog). The protocol every personal AI OS depends on just got async-first and started absorbing agent-loop responsibility from the host.
Simon Willison: "Six coding agents at once" — Tue Nov 11 (source). Documents upgrading 35 Datasette plugins in parallel using six concurrent coding agents. Important not because the technique is new but because a senior engineering voice is publicly normalizing the parallel-agent workflow that personal-AI-OS architectures are built to orchestrate.
Steve Krouse / Val Town on MCP shipping velocity — Wed Nov 12. "MCP servers ship faster than APIs because the LLM reads the spec at runtime" (clipped by Willison). Ratifies the "spec-as-runtime" mental model that an OS-shaped agent needs to stay swappable as capabilities evolve.
Anthropic — silent. No new shipped product Mon-Thu. Opus 4.5 didn't land for another 11 days. During a week when OpenAI shipped twice and MCP cut a major RC, the OS-incumbent of agentic tooling chose to stay quiet. That is its own data point about release-cadence asymmetry.

The pattern that emerged

This was an OpenAI offensive into territory Anthropic has owned since Skills shipped on October 16 — agentic-tool-use as a first-class primitive, with named ecosystem partners doing the building. The new GPT-5.1 API is, transparently, what Cursor and Warp told OpenAI they needed.

That signals two things to anyone building on top of frontier models:

What the labs are competing on

The 2024 competitive frame

Best raw model wins
Benchmarks are the scoreboard
Developer experience is downstream of model quality
API churn is a tax developers pay for access to the frontier

The 2025 competitive frame (visible this week)

Best agentic primitives win
Co-development credits to IDE vendors are the scoreboard
The harness layer (tool-use, async, structured outputs) is the product
API churn is the proof that the labs are listening to the agent-tooling ecosystem

If you read between the press releases, the developer-facing announcement was the more important of OpenAI's two launches this week. ChatGPT being friendlier matters for retention. apply_patch and shell shipping with co-development credits matters for which agent harness Cursor decides to optimize for next quarter. That is a much bigger lever.

What this changes for a personal AI OS

For DOS — the personal AI operating system I started writing nine weeks ago and just left consulting income to focus on full-time — the week confirmed a planning assumption that was previously a hope.

The assumption: the right architecture for a personal AI OS is substrate-portable. The agent's behavior (its identity, memory, capabilities) lives in a layer the operator owns. The model providing inference is a swappable piece below that layer. Today's substrate might be Claude Sonnet 4.5; next quarter's might be GPT-5.1; the quarter after that might be Kimi K2 Thinking running locally.

If the substrate is replaceable, then the harness is the moat. And this week the labs themselves admitted that.

The three things the harness layer must hold for substrate-portability to actually work

Identity — the operator's voice, taste, refusal patterns, constitutional rules. Lives in operator-owned files (CLAUDE.md, MEMORY/, hooks) that any compatible substrate can read at session start. Claude Code already does this; future substrates will need to follow the convention or expose equivalent.
Memory — the operator's accumulated facts, decisions, and context. I argued in Memory Replaces Lock-In that this is the load-bearing primitive. I am about to start building it. The design has to assume the substrate can read but not write the memory layer — substrate-side memory features (like OpenAI's memory feature) are useful but cannot be the source of truth.
Capability composition — the named tools, skills, and MCP servers the agent can invoke. MCP is the right abstraction here. Tuesday's RC moved MCP closer to being the canonical agent-capability protocol. That is good news for substrate-portability.

The week made all three of these planning assumptions stronger, not weaker. That matters more to me right now than the specific model release would have.

Two things I am watching for the next week

The honest position from where I sit

I am writing this from inside the second week of full-time work on DOS. The consulting income ended last week. The runway clock is running. The architecture I am committing to assumes the labs will keep competing on the harness layer instead of trying to absorb it.

This week made that assumption look more correct than it did seven days ago. That is the kind of small confirmation that, accumulated over months, eventually justifies a decision like the one I just made.

I will let you know if the pattern holds.

— Lucas

Sources verified the week of Nov 10-13, 2025: GPT-5.1 announcement (Nov 11) · GPT-5.1 for developers (Nov 13) · Simon Willison Nov 13 blogmarks · MCP 2025-11-25 spec RC anniversary post · Six coding agents at once (Nov 11) · Steve Krouse on MCP velocity (clip Nov 12) · Anthropic API release notes

Was this page helpful?

The 27-week arc · A single body of work

Twenty-seven weeks. Two posts a week. Six months of writing while building.

Week

Tuesday evergreen

Friday dispatch

W01

2025-10-28

The Agent Is the Product: Why Intelligence Replaces Interface

2025-10-31

Wrapper Wars: The Week Cursor, Windsurf, and MiniMax All Went Vertical

W02

2025-11-04

Memory Replaces Lock-In: Designing the Substrate of Personal AI

2025-11-07

The Week the Substrate Caught Up: Kimi K2 Thinking Lands

W03

2025-11-11

Why I Just Left a Steady Practice to Build a Personal AI Operating System

2025-11-14

The Week the Harness Ate the Model

W04

2025-11-18

The Decomposition Discipline I Am Trying to Codify Inside DOS

2025-11-21

Antigravity, Gemini 3, Grok 4.1: The Week the Frontier Re-Rendered

W05

2025-11-25

Four Copies, One Source of Truth: The Sync Pattern I Want to Commit to Before It Hurts

2025-11-28

Opus 4.5 Lands on Thanksgiving Week: Anthropic's Quiet Counter-Punch

W06

2025-12-02

Plugin Architecture for Hooks: The Pattern I Want Before the Hook Becomes a God-Function

2025-12-05

The Runtime Is the Moat Now: Anthropic Buys Bun, Claude Code Hits a Billion

W07

2025-12-09

Credit-Metered API Gateways: The Pattern I Want Before I Have Anything to Meter

2025-12-12

Three Frontier Models in 23 Days: Stop Picking a Winner, Start Picking a Router

W08

2025-12-16

Sentinel, Sketched: Convention-Driven Onboarding Before I Build It

2025-12-19

The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint

W09

2025-12-23

Memory, Sketched: The Knowledge Graph I Was Designing Before MemPalace Shipped

2025-12-26

Christmas Wasn't Quiet: Open Weights Caught Up, Nvidia Bought Inference, Frontier Labs Bought Loyalty

W10

2025-12-30

Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It

2026-01-02

The Practitioners' Year-End: While the West Took PTO, DeepSeek Shipped Architecture

W11

2026-01-06

Altyaa's Wedge: The Brazilian-Portuguese SMB Bet Studio Is Pointed At

2026-01-09

Default Claude: Microsoft Flips the Switch as the Substrate Distributes by Default

W12

2026-01-13

Builder's Compass: Two Years In, ~3,800 Subscribers, and What I've Learned Teaching Architecture

2026-01-16

The Agent Surface Splits in Two: Anthropic Goes Vertical While China Goes Substrate

W13

2026-01-20

Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes

2026-01-23

The Rules Layer Solidifies: Constitution, Hardware, and Export Controls Land in One Week

W14

2026-01-27

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

2026-01-30

Vertical Integration Eats Horizontal AI: Maia, Meta's Capex, and Anthropic-ServiceNow

W15

2026-02-03

Hexagonal in Practice: The Ports and Adapters I'm Pulling Studio Toward

2026-02-06

Coding Becomes the Flagship: Opus 4.6 and GPT-5.3-Codex Land the Same Day

W16

2026-02-10

TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists

2026-02-13

The Bifurcation Hardens: Anthropic's $380B Meets China's Open-Weights Frontier

W17

2026-02-17

Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into

2026-02-20

The Sonnet Window: Free-Tier Skills, Code Security, and the Race for Agentic Primitives

W18

2026-02-24

Skills, Packs, and Hooks: The Three-Layer Model I'm Pulling DOS Toward

2026-02-27

The Moat Above the Model: Distillation, Mobile Coding, and MCP Battlegrounds

W19

2026-03-03

The Knowledge Portfolio for an Indie Founder Building in Public, Audited at Week 25

2026-03-06

Procurement Is the New Benchmark: OpenAI Wins the DoW While Anthropic Gets Banned

W20

2026-03-10

Six Months of DOS: What Changed, What Endured, What Surprised

2026-03-13

The Counter-Sovereignty Move: Anthropic Sues, Then Diversifies

W21

2026-03-17

MCP in Production: What Works When the Protocol Becomes the Boundary

2026-03-20

Cursor Becomes a Model Lab: Composer 2 and the Unbundling at the Agent Layer

W22

2026-03-24

Routing by Sovereignty Class: The Architecture That Survives the Procurement Decade

2026-03-27

The Injunction the Pentagon Won't Honor: Anthropic Wins, the State Defies

W23

2026-03-31

Sentinel's First Scan: What the Convention Catalog Found Across Eleven Projects

2026-04-03

The Agent Stack Swallows the IDE, the IPO, and the Courtroom

W24

2026-04-07

Forking MemPalace: The 48-Hour Integration Retro

2026-04-10

Project Glasswing: When the Frontier Bifurcates on Safety

W25

2026-04-14

The 90-Day Open-Weights Bakeoff: What Actually Routes Where in DOS Today

2026-04-17

Routines Eat the Workflow: The Harness Becomes the Product

W26

2026-04-21

The Substrate Portability Pact: What DOS Refuses to Couple to, Even as the Harness Consolidates

2026-04-24

The Substrate Signs a Ten-Year Lease: Gigawatts, Governance, and the Closing Window for Indie Posture

W27

2026-04-28

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

2026-05-01

The Week Exclusivity Died: Substrate Goes Plural, Rules Layer Becomes the Moat