Skip to content
DuranteDurante
ALL SYSTEMSGet Access

27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.

Subscribe · Tuesday essay

Around 3,800 builders read this weekly.

W16 of 27

TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists

I do not have an eval suite for DOS yet. What I have, twenty-two weeks in, is the smell — every time I change a prompt, I find out whether I broke something the way TDD-less developers found out in 1998: by waiting for the next bug report. I have lived through this pattern three times in the past year. I am writing the design essay for the eval suite before I build it. Kent Beck on TDD's translated primitives for non-deterministic behavior. Michael Feathers on characterization tests for legacy prompts. Both applied to the moment an agent's behavior is supposed to be improving and currently has no measurement at all.

W13 of 27

Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes

I have not built the failure pipeline yet. What I have, nineteen weeks in, is the smell — every Monday I correct the agent on the same kind of mistake I corrected the prior Monday, and the correction has no half-life past the current context window. This is the discipline I want to commit to before the next time I am tired enough to give up on it. Kent Beck on empirical software design and failure as data. Andy Hunt and Dave Thomas on Broken Windows and Boiled Frog. Both applied to an agent that is supposed to be getting better and currently is not.

W10 of 27

Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It

I do not have a Council orchestrator. I have something cruder — a habit of channelling two or three named specialists by hand whenever an architectural decision smells like it has more than one defensible answer. Sixteen weeks in, the habit has paid back enough times that I want to commit to its shape before the next time I'm tired and ship the wrong primitive. This is the design essay. Fowler, Kent Beck, and Uncle Bob disagree productively at the bottom — and that disagreement is the part the pattern is supposed to surface.

W04 of 27

The Decomposition Discipline I Am Trying to Codify Inside DOS

Vague requests produce vague work. Two months into building DOS I have started enforcing a discipline I am calling, for now, 'the Algorithm' — every request decomposed into a numbered list of falsifiable criteria before any tool call. This is the design essay for that discipline, written ten weeks in, before the practice is formal enough to call a framework.

The 27-week arc · A single body of work

Twenty-seven weeks. Two posts a week. Six months of writing while building.

Week

Tuesday evergreen

Friday dispatch