
27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.

Around 3,800 builders read this weekly.

Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into

I am twenty-three weeks into building DOS, six weeks into a refactoring sequence on the hook pipeline that I expect will run another six. This is the Fowler walkthrough I am running in real time — what hurt at each step, what name from the catalog I gave the pain, what refactoring removed it, and what I am committing to preserve through every move. Three refactorings shipped. Three on the test list, with the smells named in advance.

A confession: most refactoring blog posts I have read are dishonest by construction.

They show a single before-state and a single after-state, with the named refactoring as a clean transition between them. The reality is messier. A real codebase goes through many refactorings on the same code, in sequence, over months. Each one removes a specific smell. Each one introduces (or reveals) the next smell. The literature treats refactorings as discrete moves; the actual practice treats them as a continuous walk through Fowler's catalog as the codebase teaches you what hurts.

This post is that walk for one specific surface in DOS: the hook pipeline that runs at session start. Three refactorings have shipped. Three more are on the test list, with the smells named in advance and the named moves drafted but not yet applied.

I am writing the walkthrough now, in the middle of the sequence, because the parts of refactoring practice that almost never appear in blog posts are the ones I want on the record: the moves I am not yet doing, the smells I have decided to leave alone, and the ordering decision (refactoring 4 should come before 5, not after) that is currently the load-bearing call I am stewarding.

Each refactoring has a Fowler name. Each is triggered by a specific smell. Each preserves a specific invariant. The point of writing it as a sequence — rather than as a single before/after — is that the order of the refactorings matters. Doing them in the wrong order, or skipping the smaller ones, would make the later ones impossible.

What this post is not

This is not a "how to refactor" tutorial. The mechanics of each refactoring are well covered in Fowler's Refactoring (1999/2018). This post is a case study in motion — what the catalog looks like applied in sequence to one real codebase, with the smells that triggered each move and the trade-offs I am stewarding through the rest of the sequence.

The Fowler frame: smell → named refactoring → preserved invariant

Martin Fowler's Refactoring is structured as a catalog. Each entry has a consistent shape: a smell (what hurts), a named refactoring (what to do about it), the mechanics (the step-by-step move), and the trade-offs (what gets harder).

The discipline I wish someone had taught me earlier: when refactoring, name the smell first. Not "this is messy." Not "this is hard to read." A specific smell from the catalog. "Long Method." "Duplicated Code." "Feature Envy." The naming forces precision about what is actually wrong, which constrains what fix is appropriate.

Without the smell name, I have spent embarrassing hours doing pseudo-refactorings — reorganizing code to make it feel better without measurably reducing any specific pain. With the smell name, the refactoring move is constrained, and the question of "did this work" has a concrete answer.

The Uncle Bob frame: SOLID as the success criteria

Robert Martin's SOLID principles do not tell you which refactoring to apply. They tell you whether the result is structurally sound. Each refactoring should leave the code closer to one or more SOLID principles, never further from any of them.

The hook pipeline's twelve-week evolution is moving it from violating most of SOLID to satisfying most of SOLID. Each refactoring takes it one step closer.

| SOLID principle | Hook pipeline at month 0 | Hook pipeline target |
| --- | --- | --- |
| Single Responsibility | One file did everything | Each loader has one responsibility |
| Open/Closed | Adding a loader meant editing the hook | Adding a loader is a new file + one registry line |
| Liskov Substitution | N/A (no abstraction) | All loaders implement the same Loader interface |
| Interface Segregation | N/A (no interface) | Loader interface has 4 members (name, slot, phase, load) |
| Dependency Inversion | Hook depended on every loader by name | Hook depends only on Loader and the registry |

Three columns, except the right one is half aspirational. The "Single Responsibility" and "Open/Closed" rows are real today, after refactorings 1 through 3. The "Liskov", "Interface Segregation", and "Dependency Inversion" rows are what refactorings 4 through 6 are supposed to deliver. I am tracking the table publicly in the codebase's CLAUDE.md so the agent — and any future contributor — has the migration target visible.

The three refactorings I have shipped

What follows is the actual chronological sequence of what has landed. Each entry: the date, the smell that triggered it, the Fowler name, what it preserved, and what it set up for the next move.

Refactorings shipped, in order

  1. Six weeks ago — Extract Function. Smell: Long Method. The session-start hook was 220 lines in one function. Pulled out loadStartupFiles(), loadRelationship(), loadLearning(), etc. as separate functions. Hook still calls them inline. Preserved: identical banner output. Set up: the next refactoring became visible because the inline structure was now legible.
  2. Four weeks ago — Replace Inline Code with Function Call. Smell: Duplicated Code. Each loader call had its own try/catch wrapper, all with the same structure. Extracted safeLoad(loader, ctx) — a single five-line function that wrapped the try/catch. Replaced seven inline try/catch blocks with seven safeLoad(...) calls. Preserved: identical error-handling behavior. Set up: the loaders no longer need to know about error handling, which makes them genuinely substitutable.
  3. Two weeks ago — Introduce Parameter Object. Smell: Long Parameter List. Loaders were taking 5-7 parameters each (project ID, user ID, session ID, hook context, fragment registry, ...). Bundled them into a single HookContext object. Loaders now take one parameter. Preserved: every loader still gets every parameter it needs. Set up: refactoring 4 (interface extraction) needs a single argument shape to be coherent.
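A minimal sketch of the shape refactorings 2 and 3 leave behind, assuming synchronous loaders for brevity. Only `safeLoad`, `HookContext`, `loadRelationship`, and `loadLearning` are names from the text. The context fields, the `LoadResult` shape, the warning wording, and the loader bodies are invented for illustration, and the shipped `safeLoad` is described as five lines, so this is not the real code.

```typescript
// Hypothetical fields: the post lists project ID, user ID, and session ID
// among the bundled parameters, but not the exact property names.
interface HookContext {
  projectId: string;
  userId: string;
  sessionId: string;
}

// After Introduce Parameter Object, every loader takes one argument.
type LoaderFn = (ctx: HookContext) => string;

// Assumed result shape: exactly one of the two fields is non-null.
interface LoadResult {
  fragment: string | null; // banner text, or null if the loader threw
  warning: string | null;  // warning text, or null if the loader succeeded
}

// One wrapper replaces the repeated inline try/catch blocks.
function safeLoad(loader: LoaderFn, ctx: HookContext): LoadResult {
  try {
    return { fragment: loader(ctx), warning: null };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { fragment: null, warning: `loader ${loader.name} failed: ${reason}` };
  }
}

// Illustrative loader bodies, not the real ones.
function loadRelationship(ctx: HookContext): string {
  return `relationship for ${ctx.userId}`;
}

function loadLearning(_ctx: HookContext): string {
  throw new Error("learning store unavailable");
}

const ctx: HookContext = { projectId: "p1", userId: "u1", sessionId: "s1" };

// A failing loader yields a warning instead of aborting the whole hook.
const results: LoadResult[] = [loadRelationship, loadLearning].map((loader) =>
  safeLoad(loader, ctx),
);
```

Here `results[0]` carries a fragment and `results[1]` carries only a warning, which is the "single loader failing still produces a banner" behavior the wrapper exists to keep uniform.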

The total duration of the first three: six weeks calendar time. The total time invested in those refactorings (separate from feature work): roughly eleven hours. The total LOC change so far: from 380 lines to about 470 (more files, smaller files). The behavior change visible to the operator: zero.

The three refactorings on the test list

This is the part of the sequence I rarely see in blog posts — the ones I have not yet done, with the smells and the moves named in advance.

Refactorings on the test list, in expected order

  4. Next sprint: Extract Interface. Smell: Implicit Contract. Every loader is currently a function with no declared shape. Plan: extract a Loader TypeScript interface with name, slot, phase, and load properties. All seven loaders implement the same interface. Will preserve: function-level behavior of every loader. Will set up: the registry pattern (next move) becomes natural, because you can put Loader instances in an array once they have a common interface.
  5. Sprint after that: Replace Conditional Dispatch with Plug-in. Smell: Switch Statement / Hidden Coupling. The hook still has a long sequence of "load X, then load Y, then load Z" with implicit ordering rules ("load relationship first because the others read from its cache"). Plan: replace with a LOADERS array (the registry) plus a phase: 'pre' | 'post' declaration on each loader. The hook iterates the registry instead of having a hardcoded sequence. Will preserve: the actual load order (registry entries kept in the same order). Will set up: the next refactoring (slot composition) becomes possible because load order is now data, not code.
  6. Two sprints out: Decouple Order with FragmentSlot. Smell: Temporal Coupling. Load order and compose order are the same: loaders run in the order their output appears in the banner. This is wrong: the operator's most-relevant context (active project) should appear first in the banner, but the project loader needs to run after the relationship loader (data dependency). Plan: introduce a FragmentSlot enum and a separate composer that assembles banner sections in slot order. Will preserve: identical banner output. Will set up: nothing yet; this is the planned end of the sequence.
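The three planned moves can be sketched together, under stated assumptions. `Loader`, `LOADERS`, `FragmentSlot`, `slot`, and `phase` are names from the plan above. Everything else is hypothetical: the slot values, the trimmed `HookContext` with a shared cache, the loader bodies, and the reading of `phase` as "all 'pre' loaders run before any 'post' loader".

```typescript
// Hypothetical slot values: lower numbers appear earlier in the banner.
enum FragmentSlot {
  ActiveProject = 0,
  Relationship = 1,
}

// Trimmed, assumed context: a shared cache stands in for the data dependency.
interface HookContext {
  sessionId: string;
  cache: Map<string, string>;
}

// Extract Interface: every loader declares the same four members.
interface Loader {
  name: string;
  slot: FragmentSlot;
  phase: "pre" | "post";
  load(ctx: HookContext): string;
}

const relationshipLoader: Loader = {
  name: "relationship",
  slot: FragmentSlot.Relationship,
  phase: "pre",
  load(ctx) {
    ctx.cache.set("owner", "u1"); // later loaders read this
    return "Relationship: u1";
  },
};

const projectLoader: Loader = {
  name: "project",
  slot: FragmentSlot.ActiveProject,
  phase: "post",
  load(ctx) {
    // Must run after the relationship loader has filled the cache.
    return `Project: hook-pipeline (owner ${ctx.cache.get("owner")})`;
  },
};

// The registry: load order is data, and adding a loader is one more entry.
const LOADERS: Loader[] = [relationshipLoader, projectLoader];

function runHook(ctx: HookContext): string {
  const fragments: { slot: FragmentSlot; text: string }[] = [];
  for (const phase of ["pre", "post"] as const) {
    for (const loader of LOADERS.filter((l) => l.phase === phase)) {
      fragments.push({ slot: loader.slot, text: loader.load(ctx) });
    }
  }
  // The composer orders by slot, independent of the order loaders ran in.
  return fragments
    .sort((a, b) => a.slot - b.slot)
    .map((f) => f.text)
    .join("\n");
}

const banner = runHook({ sessionId: "s1", cache: new Map() });
```

In this sketch the relationship loader runs first but the project section is composed first, which is the load-order versus compose-order split the last move is meant to introduce.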

The estimated duration of the next three: another six weeks calendar time, roughly eleven more hours of focused work. The estimated final LOC: from 470 to around 600. The estimated operator-visible change: zero.

I am committing the sequence to the page now because if I lose conviction in the middle of the sequence — under a deadline, or after a tired week — the sketch is what brings me back. Every refactoring I have ever skipped past was one I did not write down in advance.

The three smells I have decided NOT to refactor

Naming the refactorings I am choosing not to do is more useful than naming the ones I do. Three smells in the current code that I have inspected and decided to leave alone:

The discipline of not refactoring is harder than the discipline of refactoring. Every smell looks like a chance to make the code better. The actual question is whether the refactoring is worth its cost — measured in time, in test churn, in cognitive load on future readers.

What every refactoring has preserved (and will preserve)

The most important property of the twelve-week sequence is what it does not change.

The three preserved invariants

Across all three refactorings shipped so far, three properties of the system have stayed stable:

  1. The banner output — operators see an identical structured banner before and after every refactoring.
  2. The startup latency — total session-start time stays within ±10ms across each refactoring.
  3. The error-handling behavior — a single loader failing produces the same warning and the same continued banner before and after.

Refactoring is by definition a change to internal structure that preserves external behavior. The discipline of measuring the preserved invariants — actually running the banner and confirming it is identical, actually timing the startup and confirming the latency is unchanged — is what separates refactoring from "I made some changes and hopefully nothing broke." I am committing in this essay to keeping those three invariants stable across refactorings 4 through 6 as well, with the same measured-before-and-after discipline.
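One way to make the measured-before-and-after discipline mechanical is a small comparison of two recorded runs. This is a sketch, not the harness used here: `RunObservation`, `checkInvariants`, and the sample numbers are hypothetical, and only the three invariants and the ±10ms tolerance come from the text.

```typescript
// Hypothetical record of one session-start run.
interface RunObservation {
  banner: string;     // full banner text as the operator sees it
  startupMs: number;  // total session-start time
  warnings: string[]; // warnings emitted when loaders fail
}

const LATENCY_TOLERANCE_MS = 10; // the ±10ms budget stated above

// Returns a list of violated invariants; an empty list means the move was
// behavior-preserving as far as these three checks can tell.
function checkInvariants(before: RunObservation, after: RunObservation): string[] {
  const violations: string[] = [];
  if (before.banner !== after.banner) {
    violations.push("banner output changed");
  }
  const delta = after.startupMs - before.startupMs;
  if (Math.abs(delta) > LATENCY_TOLERANCE_MS) {
    violations.push(`startup latency moved by ${delta.toFixed(1)}ms`);
  }
  if (before.warnings.join("\n") !== after.warnings.join("\n")) {
    violations.push("error-handling behavior changed");
  }
  return violations;
}

// Invented sample data for illustration.
const baseline: RunObservation = { banner: "A\nB", startupMs: 412, warnings: ["learning failed"] };
const sameBehavior: RunObservation = { banner: "A\nB", startupMs: 418, warnings: ["learning failed"] };
const slower: RunObservation = { banner: "A\nB", startupMs: 431, warnings: ["learning failed"] };

const ok = checkInvariants(baseline, sameBehavior); // []
const bad = checkInvariants(baseline, slower);      // ["startup latency moved by 19.0ms"]
```

A single timing sample is noisy, so a real check would probably compare medians over several runs; the sketch takes the numbers as given.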

What I would do differently if starting over

Three things, ordered by how much they would have saved me. One is a test I did not write, one is a refactoring I have not done yet, and one is a correction to a refactoring I already shipped.

One. I would have written a characterization test in week 0. The hook pipeline did not have one for the first three weeks, so I was refactoring against my memory of what it should output, which was occasionally wrong. Writing the test in week 0 would have cost 30 minutes; not having it cost at least one regression that took an afternoon to track down. The eval-suite essay I published last week is partly the response to this: the next refactoring sequence I run on a different surface gets the characterization test before the first move.
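A characterization test in this sense pins whatever the hook prints today and fails on the first line that drifts. The sketch below assumes the banner is plain text; `firstDifference` and the sample banners are hypothetical, not the test that eventually landed.

```typescript
// Returns null when the two outputs match, otherwise a description of the
// first line that differs (1-based).
function firstDifference(golden: string, actual: string): string | null {
  const want = golden.split("\n");
  const got = actual.split("\n");
  const lineCount = Math.max(want.length, got.length);
  for (let i = 0; i < lineCount; i++) {
    if (want[i] !== got[i]) {
      return `line ${i + 1}: expected ${JSON.stringify(want[i])}, got ${JSON.stringify(got[i])}`;
    }
  }
  return null;
}

// Step 1, before the first move: record the current output as the golden copy.
// (Invented sample text; a real test would capture the actual banner.)
const goldenBanner = ["Project: hook-pipeline", "Relationship: u1"].join("\n");

// Step 2, after each move: run the hook again and compare against the golden copy.
const afterRefactor = ["Project: hook-pipeline", "Relationship: u2"].join("\n");

const unchanged = firstDifference(goldenBanner, goldenBanner);
const regression = firstDifference(goldenBanner, afterRefactor);
```

The point of the design is that the golden copy records what the code does, not what it should do, so it can be written before anyone understands the pipeline well.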

Two. I should have done refactoring 4 (Extract Interface) earlier in the sequence. I have been delaying it because TypeScript interfaces felt like ceremony for a few loaders. They will make refactorings 5 and 6 dramatically easier; I should have done it as soon as the third loader was added. The lesson I am applying to the current refactoring 4: do not delay it again. Ship it next sprint.

Three. Refactoring 3 (Introduce Parameter Object), which I shipped two weeks ago, was correct but possibly premature. It was applied at five loaders, all of which used a similar but not identical subset of the parameters. It might have been better to wait until six or seven loaders when the actual common shape was clearer. The early version of HookContext has four fields nobody currently uses. I will know whether this was the right call after refactorings 4 and 5 land — if the unused fields turn out to be exactly what the new abstraction needs, the early move was correct; if they remain unused, I jumped a sprint too early.

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| refactorings_shipped | integer | yes | 3 | Distinct named refactorings applied to the hook pipeline so far. |
| refactorings_pending | integer | yes | 3 | Refactorings on the test list, with smells and named moves committed to in advance. |
| loc_change_so_far | integer | yes | +90 | Net LOC change after three refactorings. Refactorings often grow LOC by adding files; the gain is in cognitive load, not byte count. |
| behavior_changes | integer | yes | 0 | Operator-visible behavior changes. By definition zero: refactoring preserves external behavior. |
| average_refactoring_duration | hours | yes | 3.7 | Average time per named refactoring, projected across the full sequence. |
| trigger | smell name | yes | varies | Each refactoring is triggered by a specific named smell from Fowler's catalog. Naming the smell constrains the appropriate move. |
| preserved_invariants | banner output, latency, error behavior | yes | all three | Properties that remain identical across every refactoring. Measured before and after each. |

The honest summary of refactoring as a craft, written from the middle of a sequence: it is not the dramatic rewrite. It is the patient, sequential, named improvement of an existing thing, where each move removes a specific pain and preserves everything else. Twelve weeks total, half done. Six refactorings, three shipped. Six smells named, three more queued. Three invariants held. Zero operator-visible regressions.

That is what the catalog is for. It gives you names for what hurts, so you can apply the right move, so you can preserve the right invariants, so the operator never has to know the inside of the hook pipeline got better — they just notice that adding the eighth loader takes 15 minutes instead of 50.

I will take the patient walk over the dramatic rewrite, every time. The retrospective on the back half of the sequence ships in roughly six weeks, on a Tuesday, when refactoring 6 has been live long enough to know whether the slot-composer pulled its weight.
