Skip to content
DuranteDurante
ALL SYSTEMSGet Access

27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.

Subscribe · Tuesday essay

Around 3,800 builders read this weekly.

Weekly DispatchW17 of 27

The Sonnet Window: Free-Tier Skills, Code Security, and the Race for Agentic Primitives

On Tuesday Anthropic shipped Claude Sonnet 4.6 with roughly 70% coding preference vs 4.5 — and put MCP connectors, Skills, and compaction on the free tier alongside it. Thursday brought Google's Gemini 3.1 Pro at ARC-AGI-2 77.1%. Friday brought Anthropic's Claude Code Security in limited preview, with Opus 4.6 finding 500+ previously undetected bugs in production OSS. xAI's Grok 4.20 Beta landed mid-week with weekly weight refreshes and four-agent collaboration. The week was not about raw IQ — it was about agentic primitives moving from paywalled to free, vertical reasoners eating their first SAST category, and the indie tooling moat narrowing in real time.

W17 of 27

Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into

I am twenty-three weeks into building DOS, six weeks into a refactoring sequence on the hook pipeline that I expect will run another six. This is the Fowler walkthrough I am running in real time — what hurt at each step, what name from the catalog I gave the pain, what refactoring removed it, and what I am committing to preserve through every move. Three refactorings shipped. Three on the test list, with the smells named in advance.

Weekly DispatchW16 of 27

The Bifurcation Hardens: Anthropic's $380B Meets China's Open-Weights Frontier

On Thursday Anthropic closed a $30B Series G at a $380B valuation — the second-largest private tech round in history. The same week, Zhipu shipped GLM-5 (744B-parameter MoE) trained entirely on Huawei Ascend silicon with zero Nvidia, and MiniMax dropped M2.5 + Lightning open weights within one SWE-bench point of Opus 4.6 at roughly one-twentieth the cost. Runway raised $315M Series E. OpenAI retired the GPT-4 family from ChatGPT. The bifurcation hardened: Western labs consolidating into a two-horse capital race; the open-weights frontier moving to China on domestic silicon. Coding benchmarks now within 1-3 points across all four poles. The moat is no longer the model.

W16 of 27

TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists

I do not have an eval suite for DOS yet. What I have, twenty-two weeks in, is the smell — every time I change a prompt, I find out whether I broke something the way TDD-less developers found out in 1998: by waiting for the next bug report. I have lived through this pattern three times in the past year. I am writing the design essay for the eval suite before I build it. Kent Beck on TDD's translated primitives for non-deterministic behavior. Michael Feathers on characterization tests for legacy prompts. Both applied to the moment an agent's behavior is supposed to be improving and currently has no measurement at all.

Weekly DispatchW15 of 27

Coding Becomes the Flagship: Opus 4.6 and GPT-5.3-Codex Land the Same Day

On Thursday Anthropic and OpenAI shipped competing flagship coding models within hours of each other. Opus 4.6 introduced native Agent Teams — parallel sub-agents with mailbox IPC — inside Claude Code. GPT-5.3-Codex hit 'high' on cybersecurity in OpenAI's preparedness framework, triggering the first capability-gated rollout of an agentic-coding model. Snowflake bound itself to OpenAI for $200M on Monday. xAI shipped Grok Imagine 1.0 the same day. And Amazon's Q4 print landed Thursday with AWS up 24% and a $200B 2026 capex commitment. Frontier AI is no longer chat — it is parallel agents writing code against governed enterprise data on rented GPU capacity, and the flagship product line just collapsed into the same release cadence.

W15 of 27

Hexagonal in Practice: The Ports and Adapters I'm Pulling Studio Toward

Studio has been live for twenty-five days. The first three weeks of the codebase had route handlers calling Prisma directly with provider calls inline. The migration to a Hexagonal architecture started this past Monday and is the load-bearing decision for everything Studio is going to do across the next twelve providers, three storage backends, and unknown number of auth surfaces. This is the design essay I am writing as the migration is underway, with Cockburn on the principle and Fowler on the strangler-fig path that gets there without a rewrite.

Weekly DispatchW14 of 27

Vertical Integration Eats Horizontal AI: Maia, Meta's Capex, and Anthropic-ServiceNow

Microsoft launched the Maia 200 inference chip on Monday and posted FY26 Q2 earnings on Wednesday — capex up 66% to $37.5B in a single quarter, Azure up 39%. The same Wednesday, Anthropic announced ServiceNow as the default Build Agent distribution channel for Claude. Meta guided 2026 capex to $115-135B. Ai2 released the SERA-32B open coding agents as the open-source counter-current, reproducible at $400. The five-day window is one signal: hyperscalers are owning the vertical stack — chip, model, distribution, capex — rather than buying its parts. Indie founders just got a much sharper picture of who controls what.

W14 of 27

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

Studio shipped eighteen days ago. The reference customer has not signed. Most go-to-market plans optimize for ten paying customers in the first quarter — mine optimizes for one, by July, with a year-long deep collaboration that produces the patterns, the case study, and the next ten customers as a side effect. This is the strategy I am committing to in public before I have the customer, because writing it down sharpens the hunt.

Weekly DispatchW13 of 27

The Rules Layer Solidifies: Constitution, Hardware, and Export Controls Land in One Week

On Thursday Anthropic published Claude's new constitution under CC0 — a values document, not a model release, explicitly forkable. On the same Davos stage, Amodei doubled down on 2026-2027 AGI while Hassabis countered with five-to-ten years and LeCun called LLMs a dead end. OpenAI confirmed its first hardware ships H2 2026. Davos itself pivoted from AI hype to AI ROI with centers-of-excellence as the canonical deployment pattern. The House Foreign Affairs Committee advanced the AI OVERWATCH Act 42-2 to give Congress chip-export blocking authority. Three weeks ago everything was experimentation. This week the industry started writing the rules down — on its own terms, before the regulators do it for them.

W13 of 27

Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes

I have not built the failure pipeline yet. What I have, nineteen weeks in, is the smell — every Monday I correct the agent on the same kind of mistake I corrected the prior Monday, and the correction has no half-life past the current context window. This is the discipline I want to commit to before the next time I am tired enough to give up on it. Kent Beck on empirical software design and failure as data. Andy Hunt and Dave Thomas on Broken Windows and Boiled Frog. Both applied to an agent that is supposed to be getting better and currently is not.

Weekly DispatchW12 of 27

The Agent Surface Splits in Two: Anthropic Goes Vertical While China Goes Substrate

Anthropic launched Claude Cowork on Tuesday — Claude Code for non-technical knowledge work — and reorganized Mike Krieger from CPO into a new Labs incubator alongside Ben Mann. The same week shipped Claude for Healthcare, the DeepSeek Engram paper foreshadowing V4, Zhipu open-sourcing GLM-Image trained end-to-end on Huawei Ascend silicon, and Cursor's CLI shipping Plan/Ask agent modes plus one-click MCP auth. The agent surface is splitting along two axes — Western labs climb vertically into bounded application contexts, Chinese labs go down to architecture and silicon — while the harness layer in the middle hardens around MCP. Indie founders just got a much clearer map.

W12 of 27

Builder's Compass: Two Years In, ~3,800 Subscribers, and What I've Learned Teaching Architecture

I started Builder's Compass in late 2023 because I was tired of giving the same architectural advice in five different DMs every week. The list is now ~3,800 deeply-engaged software architects, founders, and senior engineers, the cadence has held for the better part of two years and a quarter, and the slow growth has taught me more than the spikes did. Here is what worked, what did not, and the one piece of advice I keep giving away to founders asking whether to write publicly.

Weekly DispatchW11 of 27

Default Claude: Microsoft Flips the Switch as the Substrate Distributes by Default

On Wednesday Microsoft turned Claude on by default inside 365 Copilot for most commercial tenants. The same day, Anthropic's term sheet at a $350B valuation leaked. NVIDIA spent Monday at CES reframing 2026 as 'the ChatGPT moment for physical AI' with the Rubin platform, open VLA models, and humanoid-robot weights. On Thursday Cursor's CLI shipped MCP, rules, and models as first-class slash commands — chasing Claude Code's terminal turf. The four-day window is one signal: distribution defaults beat product depth, and the substrate is being pre-wired into the surfaces operators already use.

W11 of 27

Altyaa's Wedge: The Brazilian-Portuguese SMB Bet Studio Is Pointed At

Studio launches in three days. Altyaa is the wedge it is pointed at — AI social media for Brazilian neighborhood businesses, in Portuguese, posting from a phone in under two minutes. Two products, one founder, one shared substrate. This is the positioning essay I am writing in the week before the gateway ships, because the moment it ships I will stop being able to think about Altyaa as cleanly as I can right now.

Weekly DispatchW10 of 27

The Practitioners' Year-End: While the West Took PTO, DeepSeek Shipped Architecture

The four-day window between Christmas and New Year is the year's quietest news cycle. The labs took PTO. The practitioners wrote. Simon Willison's annual '2025 in LLMs' landed New Year's Eve at 11:50pm. Timothy B. Lee published seventeen calibrated predictions for 2026. Willison shipped two indie tools and a careful piece on SQLite's contribution policy. The only AI lab that shipped novel research in the window was DeepSeek — a Dec 31 architecture paper that nobody outside the holiday-skeleton-crew read. The week is one signal: Chinese labs do not observe Western holidays, the consensus on Chinese open-weights catching up is now table stakes, and the indie-builder loop never actually pauses.

W10 of 27

Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It

I do not have a Council orchestrator. I have something cruder — a habit of channelling two or three named specialists by hand whenever an architectural decision smells like it has more than one defensible answer. Sixteen weeks in, the habit has paid back enough times that I want to commit to its shape before the next time I'm tired and ship the wrong primitive. This is the design essay. Fowler, Kent Beck, and Uncle Bob disagree productively at the bottom — and that disagreement is the part the pattern is supposed to surface.

Weekly DispatchW09 of 27

Christmas Wasn't Quiet: Open Weights Caught Up, Nvidia Bought Inference, Frontier Labs Bought Loyalty

I expected Christmas week to be a four-day lull. It wasn't. Z.ai shipped GLM-4.7, the first open-weights model claiming parity with Sonnet 4.5 on SWE-bench. MiniMax shipped M2.1. Qwen shipped Image-Edit-2511. Nvidia paid $20B for Groq's inference licensing — the largest deal in Nvidia's history. Anthropic doubled Claude Pro/Max usage limits for the holiday. OpenAI extended Codex limits and shipped a personality-tuned GPT-5.2-Codex-XMas. The week is one signal: inference is the strategic prize and open weights are the cheap escape hatch. Both halves of the message matter for indie founders.

W09 of 27

Memory, Sketched: The Knowledge Graph I Was Designing Before MemPalace Shipped

I have not built the memory layer yet. What I have, fifteen weeks into DOS, is the smell — every session opens with the agent reading the same files it read yesterday because nothing accumulates between turns. This essay is the architecture I was designing in December 2025, before the upstream project I would eventually fork even existed. Eric Evans on partitioning the graph into Aggregates. Greg Young on treating every fact as an event. Both applied to the moment an operator's accumulated context has nowhere to go.

Weekly DispatchW08 of 27

The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint

On Thursday Anthropic opened Agent Skills as an open standard with a partner directory — Atlassian, Canva, Figma, Cloudflare, Sentry, Vercel, and Zapier joined day one — completing the protocol-commoditization arc that started last week with MCP and AGENTS.md moving under Linux Foundation governance. Underneath, four days of frontier sprinting: GPT-5.2-Codex on Thursday, Gemini 3 Flash on Wednesday, NVIDIA's open-weights Nemotron 3 family on Monday, and OpenAI fast-tracking GPT Image 1.5 mid-week. The pattern is clear: the spec is becoming a public good while capability is becoming the only moat.

W08 of 27

Sentinel, Sketched: Convention-Driven Onboarding Before I Build It

I have not built Sentinel yet. I am writing this before I do, because if I do not commit the design to the page now, I will discover the wrong shape of it the hard way once the agent has started inducting bad patterns across eleven projects. This is the architecture I want to commit to — Eric Evans's Ubiquitous Language and Alistair Cockburn's information radiator, applied to the moment a new AI agent meets an unfamiliar codebase.

The 27-week arc · A single body of work

Twenty-seven weeks. Two posts a week. Six months of writing while building.

Week

Tuesday evergreen

Friday dispatch