27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.

Start here

If you're thinking about writing while building

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

If you're considering the founding cohort

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

If you want the architectural spine

The Decomposition Discipline I Am Trying to Codify Inside DOS

Subscribe · Tuesday essay

Around 3,800 builders read this weekly.

Feb 17, 2026

Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into

I am twenty-three weeks into building DOS, six weeks into a refactoring sequence on the hook pipeline that I expect will run another six. This is the Fowler walkthrough I am running in real time — what hurt at each step, what name from the catalog I gave the pain, what refactoring removed it, and what I am committing to preserve through every move. Three refactorings shipped. Three on the test list, with the smells named in advance.

A confession: most refactoring blog posts I have read are dishonest by construction.

They show a single before-state and a single after-state, with the named refactoring as a clean transition between them. The reality is messier. A real codebase goes through many refactorings on the same code, in sequence, over months. Each one removes a specific smell. Each one introduces (or reveals) the next smell. The literature treats refactorings as discrete moves; the actual practice treats them as a continuous walk through Fowler's catalog as the codebase teaches you what hurts.

This post is that walk for one specific surface in DOS — the hook pipeline that runs at session start. I am six weeks into a refactoring sequence I expect will run another six. Three refactorings have shipped. Three more are on the test list, with the smells named in advance and the named moves drafted but not yet applied.

I am writing the walkthrough now, in the middle of the sequence, because the parts of refactoring practice that almost never appear in blog posts are the ones I want on the record: the moves I am not yet doing, the smells I have decided to leave alone, and the ordering decision (refactoring 4 should come before 5, not after) that is currently the load-bearing call I am stewarding.

Each refactoring has a Fowler name. Each is triggered by a specific smell. Each preserves a specific invariant. The point of writing it as a sequence — rather than as a single before/after — is that the order of the refactorings matters. Doing them in the wrong order, or skipping the smaller ones, would make the later ones impossible.

What this post is not

This is not a "how to refactor" tutorial. The mechanics of each refactoring are well covered in Fowler's Refactoring (1999/2018). This post is a case study in motion — what the catalog looks like applied in sequence to one real codebase, with the smells that triggered each move and the trade-offs I am stewarding through the rest of the sequence.

The Fowler frame: smell → named refactoring → preserved invariant

Martin Fowler's Refactoring is structured as a catalog. Each entry has a consistent shape: a smell (what hurts), a named refactoring (what to do about it), the mechanics (the step-by-step move), and the trade-offs (what gets harder).

The discipline I wish someone had taught me earlier: when refactoring, name the smell first. Not "this is messy." Not "this is hard to read." A specific smell from the catalog. "Long Method." "Duplicated Code." "Feature Envy." The naming forces precision about what is actually wrong, which constrains what fix is appropriate.

Without the smell name, I have spent embarrassing hours doing pseudo-refactorings — reorganizing code to make it feel better without measurably reducing any specific pain. With the smell name, the refactoring move is constrained, and the question of "did this work" has a concrete answer.

The Uncle Bob frame: SOLID as the success criteria

Robert Martin's SOLID principles do not tell you which refactoring to apply. They tell you whether the result is structurally sound. Each refactoring should leave the code closer to one or more SOLID principles, never further from any of them.

The hook pipeline's six-month evolution is moving it from violating most of SOLID to satisfying most of SOLID. Each refactoring takes it one step closer.

SOLID principle	Hook pipeline at month 0	Hook pipeline target
Single Responsibility	One file did everything	Each loader has one responsibility
Open/Closed	Adding a loader meant editing the hook	Adding a loader is a new file + one registry line
Liskov Substitution	N/A (no abstraction)	All loaders implement the same `Loader` interface
Interface Segregation	N/A (no interface)	Loader interface has 4 methods (`name`, `slot`, `phase`, `load`)
Dependency Inversion	Hook depended on every loader by name	Hook depends only on `Loader` and the registry

Three columns, except the right one is half aspirational. The "Single Responsibility" and "Open/Closed" rows are real today, after refactorings 1 through 3. The "Liskov", "Interface Segregation", and "Dependency Inversion" rows are what refactorings 4 through 6 are supposed to deliver. I am tracking the table publicly in the codebase's CLAUDE.md so the agent — and any future contributor — has the migration target visible.

The three refactorings I have shipped

What follows is the actual chronological sequence of what has landed. Each entry: the date, the smell that triggered it, the Fowler name, what it preserved, and what it set up for the next move.

Refactorings shipped, in order

Six weeks ago — Extract Function. Smell: Long Method. The session-start hook was 220 lines in one function. Pulled out loadStartupFiles(), loadRelationship(), loadLearning(), etc. as separate functions. Hook still calls them inline. Preserved: identical banner output. Set up: the next refactoring became visible because the inline structure was now legible.
Four weeks ago — Replace Inline Code with Function Call. Smell: Duplicated Code. Each loader call had its own try/catch wrapper, all with the same structure. Extracted safeLoad(loader, ctx) — a single five-line function that wrapped the try/catch. Replaced seven inline try/catch blocks with seven safeLoad(...) calls. Preserved: identical error-handling behavior. Set up: the loaders no longer need to know about error handling, which makes them genuinely substitutable.
Two weeks ago — Introduce Parameter Object. Smell: Long Parameter List. Loaders were taking 5-7 parameters each (project ID, user ID, session ID, hook context, fragment registry, ...). Bundled them into a single HookContext object. Loaders now take one parameter. Preserved: every loader still gets every parameter it needs. Set up: refactoring 4 (interface extraction) needs a single argument shape to be coherent.

The total duration of the first three: six weeks calendar time. The total time invested in those refactorings (separate from feature work): roughly eleven hours. The total LOC change so far: from 380 lines to about 470 (more files, smaller files). The behavior change visible to the operator: zero.

The three refactorings on the test list

This is the part of the sequence I rarely see in blog posts — the ones I have not yet done, with the smells and the moves named in advance.

Refactorings on the test list, in expected order

Next sprint — Extract Interface. Smell: Implicit Contract. Every loader is currently a function with no declared shape. Plan: extract a Loader TypeScript interface with name, slot, phase, and load properties. All seven loaders implement the same interface. Will preserve: function-level behavior of every loader. Will set up: the registry pattern (next move) becomes natural — you can put Loader instances in an array because they have a common interface.
Sprint after that — Replace Conditional Dispatch with Plug-in. Smell: Switch Statement / Hidden Coupling. The hook still has a long sequence of "load X, then load Y, then load Z" with implicit ordering rules ("load relationship first because the others read from its cache"). Plan: replace with a LOADERS array (the registry) plus a phase: 'pre' | 'post' declaration on each loader. The hook iterates the registry instead of having hardcoded sequence. Will preserve: the actual load order (registry entries kept in the same order). Will set up: the next refactoring (slot composition) becomes possible because load order is now data, not code.
Two sprints out — Decouple Order with FragmentSlot. Smell: Temporal Coupling. Load order and compose order are the same — loaders run in the order their output appears in the banner. This is wrong: the operator's most-relevant context (active project) should appear first in the banner, but the project loader needs to run after the relationship loader (data dependency). Plan: introduce FragmentSlot enum and a separate composer that assembles banner sections in slot order. Will preserve: identical banner output. Will set up: nothing yet — this is the planned end of the sequence.

The estimated duration of the next three: another six weeks calendar time, roughly eleven more hours of focused work. The estimated final LOC: from 470 to around 600. The estimated operator-visible change: zero.

I am committing the sequence to the page now because if I lose conviction in the middle of the sequence — under a deadline, or after a tired week — the sketch is what brings me back. Every refactoring I have ever skipped past was one I did not write down in advance.

The three smells I have decided NOT to refactor

Naming the refactorings I am choosing not to do is more useful than naming the ones I do. Three smells in the current code that I have inspected and decided to leave alone:

The discipline of not refactoring is harder than the discipline of refactoring. Every smell looks like a chance to make the code better. The actual question is whether the refactoring is worth its cost — measured in time, in test churn, in cognitive load on future readers.

What every refactoring has preserved (and will preserve)

The most important property of the six-month sequence is what it does not change.

The three preserved invariants

Across all three refactorings shipped so far, three properties of the system have stayed stable:

The banner output — operators see an identical structured banner before and after every refactoring.
The startup latency — total session-start time stays within ±10ms across each refactoring.
The error-handling behavior — a single loader failing produces the same warning and the same continued banner before and after.

Refactoring is by definition a change to internal structure that preserves external behavior. The discipline of measuring the preserved invariants — actually running the banner and confirming it is identical, actually timing the startup and confirming the latency is unchanged — is what separates refactoring from "I made some changes and hopefully nothing broke." I am committing in this essay to keeping those three invariants stable across refactorings 4 through 6 as well, with the same measured-before-and-after discipline.

What I would do differently if starting over

Three things, ordered by how much they would have saved me. Two are corrections to refactorings I already shipped; one is to a refactoring I have not done yet.

One. I would have written a characterization test in week 0. The hook pipeline did not have one for the first three weeks. I was refactoring against my memory of what it should output, which was occasionally wrong. The cost of writing the characterization test in week 0 would have been 30 minutes; the cost of not having it caused at least one regression that took an afternoon to track down. The eval-suite essay I published last week is partly the response to this: the next refactoring sequence I run on a different surface gets the characterization test before the first move.

Two. I should have done refactoring 4 (Extract Interface) earlier in the sequence. I have been delaying it because TypeScript interfaces felt like ceremony for a few loaders. They will make refactorings 5 and 6 dramatically easier; I should have done it as soon as the third loader was added. The lesson I am applying to the current refactoring 4: do not delay it again. Ship it next sprint.

Three. Refactoring 3 (Introduce Parameter Object), which I shipped two weeks ago, was correct but possibly premature. It was applied at five loaders, all of which used a similar but not identical subset of the parameters. It might have been better to wait until six or seven loaders when the actual common shape was clearer. The early version of HookContext has four fields nobody currently uses. I will know whether this was the right call after refactorings 4 and 5 land — if the unused fields turn out to be exactly what the new abstraction needs, the early move was correct; if they remain unused, I jumped a sprint too early.

Name	Type	Required	Default	Description
refactorings_shipped	integer	yes	3	Distinct named refactorings applied to the hook pipeline so far.
refactorings_pending	integer	yes	3	Refactorings on the test list, with smells and named moves committed to in advance.
loc_change_so_far	integer	yes	+90	Net LOC change after three refactorings. Refactorings often grow LOC by adding files; the gain is in cognitive load, not byte count.
behavior_changes	integer	yes	0	Operator-visible behavior changes. By definition zero — refactoring preserves external behavior.
average_refactoring_duration	hours	yes	3.7	Average time per named refactoring, projected across the full sequence.
trigger	smell name	yes	varies	Each refactoring is triggered by a specific named smell from Fowler's catalog. Naming the smell constrains the appropriate move.
preserved_invariants	banner output, latency, error behavior	yes	all three	Properties that remain identical across every refactoring. Measured before and after each.

Refactoring 5 sketched

The plugin moment, walked end-to-end before it ships.

Same discipline

Mechanizing invariants across boundaries.

The decomposition discipline

ISCs as test list — same Beck/Fowler lineage.

What keeps refactoring safe

Characterization tests for behaviors.

The honest summary of refactoring as a craft, written from the middle of a sequence: it is not the dramatic rewrite. It is the patient, sequential, named improvement of an existing thing, where each move removes a specific pain and preserves everything else. Six months total — half done. Six refactorings — three shipped. Six smells named — three more queued. Three invariants held. Zero operator-visible regressions.

That is what the catalog is for. It gives you names for what hurts, so you can apply the right move, so you can preserve the right invariants, so the operator never has to know the inside of the hook pipeline got better — they just notice that adding the eighth loader takes 15 minutes instead of 50.

I will take the patient walk over the dramatic rewrite, every time. The retrospective on the back half of the sequence ships in roughly six weeks, on a Tuesday, when refactoring 6 has been live long enough to know whether the slot-composer pulled its weight.

Was this page helpful?

The 27-week arc · A single body of work

Twenty-seven weeks. Two posts a week. Six months of writing while building.

Week

Tuesday evergreen

Friday dispatch

W01

2025-10-28

The Agent Is the Product: Why Intelligence Replaces Interface

2025-10-31

Wrapper Wars: The Week Cursor, Windsurf, and MiniMax All Went Vertical

W02

2025-11-04

Memory Replaces Lock-In: Designing the Substrate of Personal AI

2025-11-07

The Week the Substrate Caught Up: Kimi K2 Thinking Lands

W03

2025-11-11

Why I Just Left a Steady Practice to Build a Personal AI Operating System

2025-11-14

The Week the Harness Ate the Model

W04

2025-11-18

The Decomposition Discipline I Am Trying to Codify Inside DOS

2025-11-21

Antigravity, Gemini 3, Grok 4.1: The Week the Frontier Re-Rendered

W05

2025-11-25

Four Copies, One Source of Truth: The Sync Pattern I Want to Commit to Before It Hurts

2025-11-28

Opus 4.5 Lands on Thanksgiving Week: Anthropic's Quiet Counter-Punch

W06

2025-12-02

Plugin Architecture for Hooks: The Pattern I Want Before the Hook Becomes a God-Function

2025-12-05

The Runtime Is the Moat Now: Anthropic Buys Bun, Claude Code Hits a Billion

W07

2025-12-09

Credit-Metered API Gateways: The Pattern I Want Before I Have Anything to Meter

2025-12-12

Three Frontier Models in 23 Days: Stop Picking a Winner, Start Picking a Router

W08

2025-12-16

Sentinel, Sketched: Convention-Driven Onboarding Before I Build It

2025-12-19

The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint

W09

2025-12-23

Memory, Sketched: The Knowledge Graph I Was Designing Before MemPalace Shipped

2025-12-26

Christmas Wasn't Quiet: Open Weights Caught Up, Nvidia Bought Inference, Frontier Labs Bought Loyalty

W10

2025-12-30

Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It

2026-01-02

The Practitioners' Year-End: While the West Took PTO, DeepSeek Shipped Architecture

W11

2026-01-06

Altyaa's Wedge: The Brazilian-Portuguese SMB Bet Studio Is Pointed At

2026-01-09

Default Claude: Microsoft Flips the Switch as the Substrate Distributes by Default

W12

2026-01-13

Builder's Compass: Two Years In, ~3,800 Subscribers, and What I've Learned Teaching Architecture

2026-01-16

The Agent Surface Splits in Two: Anthropic Goes Vertical While China Goes Substrate

W13

2026-01-20

Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes

2026-01-23

The Rules Layer Solidifies: Constitution, Hardware, and Export Controls Land in One Week

W14

2026-01-27

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

2026-01-30

Vertical Integration Eats Horizontal AI: Maia, Meta's Capex, and Anthropic-ServiceNow

W15

2026-02-03

Hexagonal in Practice: The Ports and Adapters I'm Pulling Studio Toward

2026-02-06

Coding Becomes the Flagship: Opus 4.6 and GPT-5.3-Codex Land the Same Day

W16

2026-02-10

TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists

2026-02-13

The Bifurcation Hardens: Anthropic's $380B Meets China's Open-Weights Frontier

W17

2026-02-17

Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into

2026-02-20

The Sonnet Window: Free-Tier Skills, Code Security, and the Race for Agentic Primitives

W18

2026-02-24

Skills, Packs, and Hooks: The Three-Layer Model I'm Pulling DOS Toward

2026-02-27

The Moat Above the Model: Distillation, Mobile Coding, and MCP Battlegrounds

W19

2026-03-03

The Knowledge Portfolio for an Indie Founder Building in Public, Audited at Week 25

2026-03-06

Procurement Is the New Benchmark: OpenAI Wins the DoW While Anthropic Gets Banned

W20

2026-03-10

Six Months of DOS: What Changed, What Endured, What Surprised

2026-03-13

The Counter-Sovereignty Move: Anthropic Sues, Then Diversifies

W21

2026-03-17

MCP in Production: What Works When the Protocol Becomes the Boundary

2026-03-20

Cursor Becomes a Model Lab: Composer 2 and the Unbundling at the Agent Layer

W22

2026-03-24

Routing by Sovereignty Class: The Architecture That Survives the Procurement Decade

2026-03-27

The Injunction the Pentagon Won't Honor: Anthropic Wins, the State Defies

W23

2026-03-31

Sentinel's First Scan: What the Convention Catalog Found Across Eleven Projects

2026-04-03

The Agent Stack Swallows the IDE, the IPO, and the Courtroom

W24

2026-04-07

Forking MemPalace: The 48-Hour Integration Retro

2026-04-10

Project Glasswing: When the Frontier Bifurcates on Safety

W25

2026-04-14

The 90-Day Open-Weights Bakeoff: What Actually Routes Where in DOS Today

2026-04-17

Routines Eat the Workflow: The Harness Becomes the Product

W26

2026-04-21

The Substrate Portability Pact: What DOS Refuses to Couple to, Even as the Harness Consolidates

2026-04-24

The Substrate Signs a Ten-Year Lease: Gigawatts, Governance, and the Closing Window for Indie Posture

W27

2026-04-28

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

2026-05-01

The Week Exclusivity Died: Substrate Goes Plural, Rules Layer Becomes the Moat