27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.

Start here

If you're thinking about writing while building

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

If you're considering the founding cohort

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

If you want the architectural spine

The Decomposition Discipline I Am Trying to Codify Inside DOS

Subscribe · Tuesday essay

Around 3,800 builders read this weekly.

Weekly DispatchFeb 20, 2026W17 of 27

The Sonnet Window: Free-Tier Skills, Code Security, and the Race for Agentic Primitives

On Tuesday Anthropic shipped Claude Sonnet 4.6 with roughly 70% coding preference vs 4.5 — and put MCP connectors, Skills, and compaction on the free tier alongside it. Thursday brought Google's Gemini 3.1 Pro at ARC-AGI-2 77.1%. Friday brought Anthropic's Claude Code Security in limited preview, with Opus 4.6 finding 500+ previously undetected bugs in production OSS. xAI's Grok 4.20 Beta landed mid-week with weekly weight refreshes and four-agent collaboration. The week was not about raw IQ — it was about agentic primitives moving from paywalled to free, vertical reasoners eating their first SAST category, and the indie tooling moat narrowing in real time.

Feb 17, 2026W17 of 27

Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into

I am twenty-three weeks into building DOS, six weeks into a refactoring sequence on the hook pipeline that I expect will run another six. This is the Fowler walkthrough I am running in real time — what hurt at each step, what name from the catalog I gave the pain, what refactoring removed it, and what I am committing to preserve through every move. Three refactorings shipped. Three on the test list, with the smells named in advance.

Weekly DispatchFeb 13, 2026W16 of 27

The Bifurcation Hardens: Anthropic's $380B Meets China's Open-Weights Frontier

On Thursday Anthropic closed a $30B Series G at a $380B valuation — the second-largest private tech round in history. The same week, Zhipu shipped GLM-5 (744B-parameter MoE) trained entirely on Huawei Ascend silicon with zero Nvidia, and MiniMax dropped M2.5 + Lightning open weights within one SWE-bench point of Opus 4.6 at roughly one-twentieth the cost. Runway raised $315M Series E. OpenAI retired the GPT-4 family from ChatGPT. The bifurcation hardened: Western labs consolidating into a two-horse capital race; the open-weights frontier moving to China on domestic silicon. Coding benchmarks now within 1-3 points across all four poles. The moat is no longer the model.

Feb 10, 2026W16 of 27

TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists

I do not have an eval suite for DOS yet. What I have, twenty-two weeks in, is the smell — every time I change a prompt, I find out whether I broke something the way TDD-less developers found out in 1998: by waiting for the next bug report. I have lived through this pattern three times in the past year. I am writing the design essay for the eval suite before I build it. Kent Beck on TDD's translated primitives for non-deterministic behavior. Michael Feathers on characterization tests for legacy prompts. Both applied to the moment an agent's behavior is supposed to be improving and currently has no measurement at all.

Weekly DispatchFeb 6, 2026W15 of 27

Coding Becomes the Flagship: Opus 4.6 and GPT-5.3-Codex Land the Same Day

On Thursday Anthropic and OpenAI shipped competing flagship coding models within hours of each other. Opus 4.6 introduced native Agent Teams — parallel sub-agents with mailbox IPC — inside Claude Code. GPT-5.3-Codex hit 'high' on cybersecurity in OpenAI's preparedness framework, triggering the first capability-gated rollout of an agentic-coding model. Snowflake bound itself to OpenAI for $200M on Monday. xAI shipped Grok Imagine 1.0 the same day. And Amazon's Q4 print landed Thursday with AWS up 24% and a $200B 2026 capex commitment. Frontier AI is no longer chat — it is parallel agents writing code against governed enterprise data on rented GPU capacity, and the flagship product line just collapsed into the same release cadence.

Feb 3, 2026W15 of 27

Hexagonal in Practice: The Ports and Adapters I'm Pulling Studio Toward

Studio has been live for twenty-five days. The first three weeks of the codebase had route handlers calling Prisma directly with provider calls inline. The migration to a Hexagonal architecture started this past Monday and is the load-bearing decision for everything Studio is going to do across the next twelve providers, three storage backends, and unknown number of auth surfaces. This is the design essay I am writing as the migration is underway, with Cockburn on the principle and Fowler on the strangler-fig path that gets there without a rewrite.

Weekly DispatchJan 30, 2026W14 of 27

Vertical Integration Eats Horizontal AI: Maia, Meta's Capex, and Anthropic-ServiceNow

Microsoft launched the Maia 200 inference chip on Monday and posted FY26 Q2 earnings on Wednesday — capex up 66% to $37.5B in a single quarter, Azure up 39%. The same Wednesday, Anthropic announced ServiceNow as the default Build Agent distribution channel for Claude. Meta guided 2026 capex to $115-135B. Ai2 released the SERA-32B open coding agents as the open-source counter-current, reproducible at $400. The five-day window is one signal: hyperscalers are owning the vertical stack — chip, model, distribution, capex — rather than buying its parts. Indie founders just got a much sharper picture of who controls what.

Jan 27, 2026W14 of 27

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

Studio shipped eighteen days ago. The reference customer has not signed. Most go-to-market plans optimize for ten paying customers in the first quarter — mine optimizes for one, by July, with a year-long deep collaboration that produces the patterns, the case study, and the next ten customers as a side effect. This is the strategy I am committing to in public before I have the customer, because writing it down sharpens the hunt.

Weekly DispatchJan 23, 2026W13 of 27

The Rules Layer Solidifies: Constitution, Hardware, and Export Controls Land in One Week

On Thursday Anthropic published Claude's new constitution under CC0 — a values document, not a model release, explicitly forkable. On the same Davos stage, Amodei doubled down on 2026-2027 AGI while Hassabis countered with five-to-ten years and LeCun called LLMs a dead end. OpenAI confirmed its first hardware ships H2 2026. Davos itself pivoted from AI hype to AI ROI with centers-of-excellence as the canonical deployment pattern. The House Foreign Affairs Committee advanced the AI OVERWATCH Act 42-2 to give Congress chip-export blocking authority. Three weeks ago everything was experimentation. This week the industry started writing the rules down — on its own terms, before the regulators do it for them.

Jan 20, 2026W13 of 27

Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes

I have not built the failure pipeline yet. What I have, nineteen weeks in, is the smell — every Monday I correct the agent on the same kind of mistake I corrected the prior Monday, and the correction has no half-life past the current context window. This is the discipline I want to commit to before the next time I am tired enough to give up on it. Kent Beck on empirical software design and failure as data. Andy Hunt and Dave Thomas on Broken Windows and Boiled Frog. Both applied to an agent that is supposed to be getting better and currently is not.

Weekly DispatchJan 16, 2026W12 of 27

The Agent Surface Splits in Two: Anthropic Goes Vertical While China Goes Substrate

Anthropic launched Claude Cowork on Tuesday — Claude Code for non-technical knowledge work — and reorganized Mike Krieger from CPO into a new Labs incubator alongside Ben Mann. The same week shipped Claude for Healthcare, the DeepSeek Engram paper foreshadowing V4, Zhipu open-sourcing GLM-Image trained end-to-end on Huawei Ascend silicon, and Cursor's CLI shipping Plan/Ask agent modes plus one-click MCP auth. The agent surface is splitting along two axes — Western labs climb vertically into bounded application contexts, Chinese labs go down to architecture and silicon — while the harness layer in the middle hardens around MCP. Indie founders just got a much clearer map.

Jan 13, 2026W12 of 27

Builder's Compass: Two Years In, ~3,800 Subscribers, and What I've Learned Teaching Architecture

I started Builder's Compass in late 2023 because I was tired of giving the same architectural advice in five different DMs every week. The list is now ~3,800 deeply-engaged software architects, founders, and senior engineers, the cadence has held for the better part of two years and a quarter, and the slow growth has taught me more than the spikes did. Here is what worked, what did not, and the one piece of advice I keep giving away to founders asking whether to write publicly.

Weekly DispatchJan 9, 2026W11 of 27

Default Claude: Microsoft Flips the Switch as the Substrate Distributes by Default

On Wednesday Microsoft turned Claude on by default inside 365 Copilot for most commercial tenants. The same day, Anthropic's term sheet at a $350B valuation leaked. NVIDIA spent Monday at CES reframing 2026 as 'the ChatGPT moment for physical AI' with the Rubin platform, open VLA models, and humanoid-robot weights. On Thursday Cursor's CLI shipped MCP, rules, and models as first-class slash commands — chasing Claude Code's terminal turf. The four-day window is one signal: distribution defaults beat product depth, and the substrate is being pre-wired into the surfaces operators already use.

Jan 6, 2026W11 of 27

Altyaa's Wedge: The Brazilian-Portuguese SMB Bet Studio Is Pointed At

Studio launches in three days. Altyaa is the wedge it is pointed at — AI social media for Brazilian neighborhood businesses, in Portuguese, posting from a phone in under two minutes. Two products, one founder, one shared substrate. This is the positioning essay I am writing in the week before the gateway ships, because the moment it ships I will stop being able to think about Altyaa as cleanly as I can right now.

Weekly DispatchJan 2, 2026W10 of 27

The Practitioners' Year-End: While the West Took PTO, DeepSeek Shipped Architecture

The four-day window between Christmas and New Year is the year's quietest news cycle. The labs took PTO. The practitioners wrote. Simon Willison's annual '2025 in LLMs' landed New Year's Eve at 11:50pm. Timothy B. Lee published seventeen calibrated predictions for 2026. Willison shipped two indie tools and a careful piece on SQLite's contribution policy. The only AI lab that shipped novel research in the window was DeepSeek — a Dec 31 architecture paper that nobody outside the holiday-skeleton-crew read. The week is one signal: Chinese labs do not observe Western holidays, the consensus on Chinese open-weights catching up is now table stakes, and the indie-builder loop never actually pauses.

Dec 30, 2025W10 of 27

Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It

I do not have a Council orchestrator. I have something cruder — a habit of channelling two or three named specialists by hand whenever an architectural decision smells like it has more than one defensible answer. Sixteen weeks in, the habit has paid back enough times that I want to commit to its shape before the next time I'm tired and ship the wrong primitive. This is the design essay. Fowler, Kent Beck, and Uncle Bob disagree productively at the bottom — and that disagreement is the part the pattern is supposed to surface.

Weekly DispatchDec 26, 2025W09 of 27

Christmas Wasn't Quiet: Open Weights Caught Up, Nvidia Bought Inference, Frontier Labs Bought Loyalty

I expected Christmas week to be a four-day lull. It wasn't. Z.ai shipped GLM-4.7, the first open-weights model claiming parity with Sonnet 4.5 on SWE-bench. MiniMax shipped M2.1. Qwen shipped Image-Edit-2511. Nvidia paid $20B for Groq's inference licensing — the largest deal in Nvidia's history. Anthropic doubled Claude Pro/Max usage limits for the holiday. OpenAI extended Codex limits and shipped a personality-tuned GPT-5.2-Codex-XMas. The week is one signal: inference is the strategic prize and open weights are the cheap escape hatch. Both halves of the message matter for indie founders.

Dec 23, 2025W09 of 27

Memory, Sketched: The Knowledge Graph I Was Designing Before MemPalace Shipped

I have not built the memory layer yet. What I have, fifteen weeks into DOS, is the smell — every session opens with the agent reading the same files it read yesterday because nothing accumulates between turns. This essay is the architecture I was designing in December 2025, before the upstream project I would eventually fork even existed. Eric Evans on partitioning the graph into Aggregates. Greg Young on treating every fact as an event. Both applied to the moment an operator's accumulated context has nowhere to go.

Weekly DispatchDec 19, 2025W08 of 27

The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint

On Thursday Anthropic opened Agent Skills as an open standard with a partner directory — Atlassian, Canva, Figma, Cloudflare, Sentry, Vercel, and Zapier joined day one — completing the protocol-commoditization arc that started last week with MCP and AGENTS.md moving under Linux Foundation governance. Underneath, four days of frontier sprinting: GPT-5.2-Codex on Thursday, Gemini 3 Flash on Wednesday, NVIDIA's open-weights Nemotron 3 family on Monday, and OpenAI fast-tracking GPT Image 1.5 mid-week. The pattern is clear: the spec is becoming a public good while capability is becoming the only moat.

Dec 16, 2025W08 of 27

Sentinel, Sketched: Convention-Driven Onboarding Before I Build It

I have not built Sentinel yet. I am writing this before I do, because if I do not commit the design to the page now, I will discover the wrong shape of it the hard way once the agent has started inducting bad patterns across eleven projects. This is the architecture I want to commit to — Eric Evans's Ubiquitous Language and Alistair Cockburn's information radiator, applied to the moment a new AI agent meets an unfamiliar codebase.

Previous Next

The 27-week arc · A single body of work

Twenty-seven weeks. Two posts a week. Six months of writing while building.

Week

Tuesday evergreen

Friday dispatch

W01

2025-10-28

The Agent Is the Product: Why Intelligence Replaces Interface

2025-10-31

Wrapper Wars: The Week Cursor, Windsurf, and MiniMax All Went Vertical

W02

2025-11-04

Memory Replaces Lock-In: Designing the Substrate of Personal AI

2025-11-07

The Week the Substrate Caught Up: Kimi K2 Thinking Lands

W03

2025-11-11

Why I Just Left a Steady Practice to Build a Personal AI Operating System

2025-11-14

The Week the Harness Ate the Model

W04

2025-11-18

The Decomposition Discipline I Am Trying to Codify Inside DOS

2025-11-21

Antigravity, Gemini 3, Grok 4.1: The Week the Frontier Re-Rendered

W05

2025-11-25

Four Copies, One Source of Truth: The Sync Pattern I Want to Commit to Before It Hurts

2025-11-28

Opus 4.5 Lands on Thanksgiving Week: Anthropic's Quiet Counter-Punch

W06

2025-12-02

Plugin Architecture for Hooks: The Pattern I Want Before the Hook Becomes a God-Function

2025-12-05

The Runtime Is the Moat Now: Anthropic Buys Bun, Claude Code Hits a Billion

W07

2025-12-09

Credit-Metered API Gateways: The Pattern I Want Before I Have Anything to Meter

2025-12-12

Three Frontier Models in 23 Days: Stop Picking a Winner, Start Picking a Router

W08

2025-12-16

Sentinel, Sketched: Convention-Driven Onboarding Before I Build It

2025-12-19

The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint

W09

2025-12-23

Memory, Sketched: The Knowledge Graph I Was Designing Before MemPalace Shipped

2025-12-26

Christmas Wasn't Quiet: Open Weights Caught Up, Nvidia Bought Inference, Frontier Labs Bought Loyalty

W10

2025-12-30

Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It

2026-01-02

The Practitioners' Year-End: While the West Took PTO, DeepSeek Shipped Architecture

W11

2026-01-06

Altyaa's Wedge: The Brazilian-Portuguese SMB Bet Studio Is Pointed At

2026-01-09

Default Claude: Microsoft Flips the Switch as the Substrate Distributes by Default

W12

2026-01-13

Builder's Compass: Two Years In, ~3,800 Subscribers, and What I've Learned Teaching Architecture

2026-01-16

The Agent Surface Splits in Two: Anthropic Goes Vertical While China Goes Substrate

W13

2026-01-20

Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes

2026-01-23

The Rules Layer Solidifies: Constitution, Hardware, and Export Controls Land in One Week

W14

2026-01-27

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

2026-01-30

Vertical Integration Eats Horizontal AI: Maia, Meta's Capex, and Anthropic-ServiceNow

W15

2026-02-03

Hexagonal in Practice: The Ports and Adapters I'm Pulling Studio Toward

2026-02-06

Coding Becomes the Flagship: Opus 4.6 and GPT-5.3-Codex Land the Same Day

W16

2026-02-10

TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists

2026-02-13

The Bifurcation Hardens: Anthropic's $380B Meets China's Open-Weights Frontier

W17

2026-02-17

Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into

2026-02-20

The Sonnet Window: Free-Tier Skills, Code Security, and the Race for Agentic Primitives

W18

2026-02-24

Skills, Packs, and Hooks: The Three-Layer Model I'm Pulling DOS Toward

2026-02-27

The Moat Above the Model: Distillation, Mobile Coding, and MCP Battlegrounds

W19

2026-03-03

The Knowledge Portfolio for an Indie Founder Building in Public, Audited at Week 25

2026-03-06

Procurement Is the New Benchmark: OpenAI Wins the DoW While Anthropic Gets Banned

W20

2026-03-10

Six Months of DOS: What Changed, What Endured, What Surprised

2026-03-13

The Counter-Sovereignty Move: Anthropic Sues, Then Diversifies

W21

2026-03-17

MCP in Production: What Works When the Protocol Becomes the Boundary

2026-03-20

Cursor Becomes a Model Lab: Composer 2 and the Unbundling at the Agent Layer

W22

2026-03-24

Routing by Sovereignty Class: The Architecture That Survives the Procurement Decade

2026-03-27

The Injunction the Pentagon Won't Honor: Anthropic Wins, the State Defies

W23

2026-03-31

Sentinel's First Scan: What the Convention Catalog Found Across Eleven Projects

2026-04-03

The Agent Stack Swallows the IDE, the IPO, and the Courtroom

W24

2026-04-07

Forking MemPalace: The 48-Hour Integration Retro

2026-04-10

Project Glasswing: When the Frontier Bifurcates on Safety

W25

2026-04-14

The 90-Day Open-Weights Bakeoff: What Actually Routes Where in DOS Today

2026-04-17

Routines Eat the Workflow: The Harness Becomes the Product

W26

2026-04-21

The Substrate Portability Pact: What DOS Refuses to Couple to, Even as the Harness Consolidates

2026-04-24

The Substrate Signs a Ten-Year Lease: Gigawatts, Governance, and the Closing Window for Indie Posture

W27

2026-04-28

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

2026-05-01

The Week Exclusivity Died: Substrate Goes Plural, Rules Layer Becomes the Moat