
27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.


Four Copies, One Source of Truth: The Sync Pattern I Want to Commit to Before It Hurts

Multi-copy systems are an architectural smell — until they are an architectural necessity. Eleven weeks into building DOS I can already see the duplication problem coming as soon as I start distributing skills outside my own machine. This is the design essay for the sync pattern I want to commit to before the smell gets bad — what Fowler and Feathers would say, and what I am going to mechanize early to avoid pain later.

I am writing this from inside an architecture problem I have not yet had, but can already see coming.

DOS today, eleven weeks in, has one canonical location for skills, hooks, and shared tools — ~/.claude/. That works because there is exactly one operator (me), one machine, one install. The moment I start distributing DOS to anyone else — pack source for marketplace consumers, a versioned submodule for the cc-durante-studio repo (which doesn't exist yet either), an autonomous-agent runtime that is not invoked through the Skill tool — the same artifact will need to live in multiple places simultaneously.

That is the smell I want to commit to handling well before I have it, because the most embarrassing failure mode I can imagine in a young system is the one where I edit a hook in one place, ship it, and three days later discover that the change silently regressed because a second copy of the same hook in a different directory was never updated.

I do not want to live through that pattern eight times before I admit it is structural. This post is the design essay for the sync pattern I want to commit to before the smell gets bad.

The bet behind writing this now

Mechanizing an invariant before you need it is overkill. Mechanizing it after the fifth time it bites you is technical debt. The right window is the moment you can see it coming clearly enough to draft the manifest schema. I think I am at that window today.

What "four copies" will mean as DOS distributes

Today there is one copy. Within the next 6-12 weeks, as I open up DOS to early operators and start packaging packs for distribution, the same artifact will need to exist in up to four locations. The four are not historical accidents — each will serve a different consumer.

| # | Location (planned) | Purpose | Read by |
|---|--------------------|---------|---------|
| 1 | ~/.claude/skills/ | The live installation | Claude Code at runtime |
| 2 | Releases/v0.x.x/.claude/skills/ | The versioned submodule for distribution | Operators installing the marketplace tarball (future) |
| 3 | Packs/{Name}/src/ | The pack source distribution | Pack marketplace consumers (future) |
| 4 | Packs/Agents/ | The autonomous agent runtime | TypeScript executable agents (future) |

Each consumer will read from a different copy. A single artifact (say, a Sentinel skill I am sketching right now) will need to exist in copies 1, 2, and 3 simultaneously. They must be byte-identical for the system to behave consistently.

I am asking the obvious question before I commit: why not collapse the copies into one? I have already thought through three single-copy approaches in my notebook, and each fails for a different consumer-specific reason — submodule packaging needs a self-contained layout, pack marketplace needs its own license file and README, the live installation needs path-resolved imports that customers cannot reproduce. The duplication is structural, not accidental.

The Fowler angle: Duplicated Code is a smell, not a sin

Martin Fowler's Refactoring lists "Duplicated Code" as the first code smell in chapter three of the first edition (the second edition puts "Mysterious Name" ahead of it). The refactorings cataloged to remove it (Extract Method, Extract Superclass, Pull Up Field) all assume the duplication is accidental: the same intent expressed twice because nobody noticed.

Fowler is also explicit elsewhere that duplication is sometimes the right answer when the intent is the same but the consumers have different lifecycles. His bliki entry on Microservices makes this point: shared code across microservices is more dangerous than duplicated code, because shared code couples deploys.

The four-copy DOS architecture I am planning applies the same logic to artifact distribution. Each copy serves a consumer with a different deploy cycle:

Why the duplication will be structural, not accidental

Single-copy attempt (rejected)
  • One canonical location, all consumers read from it via symlink or include-by-reference
  • Submodule build breaks because customers cannot pull a path that lives outside the submodule
  • Pack marketplace breaks because pack source needs a self-contained directory with its own license, README, and package.json
  • Live installation works fine — but it is the only one that works
Multi-copy with mechanized invariant (the plan)
  • Each consumer reads from a copy that lives where it expects
  • A canonical sync tool with a hash-comparison invariant catches drift
  • Manifest declares which copies are paired, which differ in name (aliases), which intentionally diverge
  • --fix direction encoded in the manifest: live → submodule (live is the operator's working copy)
  • Drift becomes a CI signal, not a "remember to copy" discipline

The second option, multi-copy with a mechanized invariant, is the approach I am committing to. Duplication preserved; unmanaged duplication eliminated. The distinction matters because Fowler's smell is about cognitive cost (the developer has to think about it) rather than physical cost (bytes on disk). Mechanizing the sync should remove the cognitive cost while keeping the physical layout that consumers will require.

The Feathers angle: characterize before you refactor (or build)

Michael Feathers in Working Effectively with Legacy Code has a procedural rule that I want to internalize before I have legacy to work with. Before you refactor a system whose behavior you do not fully understand, write a characterization test — one that captures the current behavior, even if you suspect the current behavior is wrong.

The version of this rule I want to apply to the four-copy problem: before I ship the sync tool, write down in advance how I expect the copies to relate, including every legitimate divergence. Three cases I am already anticipating:

  1. The submodule and live copy will be intentionally identical when the operator is in symlink-mode (developer machine).
  2. The submodule and live copy will be intentionally different when the operator is in customer-mode (marketplace install) — customer installs the tarball; the live submodule pointer is unused.
  3. Pack source and live/submodule will have known naming differences (e.g., I am already planning bridge.py in pack source vs mempalace_bridge.py in live to avoid path collisions inside ~/.claude/) that are NOT drift.

None of these are obvious from outside the system. All of them will matter for designing the sync tool. Without writing them down before I build, the sync tool will flag the intentional differences as bugs, and I will discover the mistake in week three when the false positives drown the real signal.

The characterization output, before I have written a line of it

The plan is a .dos-sync-manifest.json file with three core blocks: pairs (what should match), aliases (known naming differences that are NOT drift), and live_only (files that intentionally exist on the live side only). A fix_direction field records which side is canonical. The manifest is the test list. The sync tool will be the runner. I will write the manifest first.
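A minimal sketch of what the three blocks could look like, written as a typed TypeScript literal so the shape is checked at compile time. The block names come from the plan above; the interface names and every path are hypothetical placeholders, apart from the bridge.py / mempalace_bridge.py naming difference already mentioned. The directories those two files sit in are also assumptions.

```typescript
// Sketch of the three manifest blocks. All paths are hypothetical
// placeholders; only the bridge.py / mempalace_bridge.py names come
// from the essay.
interface SyncPair {
  a: string; // first path of the pair
  b: string; // second path; should hash-match `a`
}

interface DraftManifest {
  pairs: SyncPair[];               // what should match
  aliases: Record<string, string>; // known naming differences, not drift
  live_only: string[];             // intentionally present on the live side only
}

const draftManifest: DraftManifest = {
  pairs: [
    {
      a: "~/.claude/skills/Sentinel/SKILL.md",
      b: "Releases/v0.x.x/.claude/skills/Sentinel/SKILL.md",
    },
  ],
  aliases: {
    // live name -> pack-source name (directories are assumptions)
    "~/.claude/tools/mempalace_bridge.py": "Packs/{Name}/src/bridge.py",
  },
  live_only: ["~/.claude/settings.local.json"],
};

// Serialize to the JSON that would live in .dos-sync-manifest.json.
console.log(JSON.stringify(draftManifest, null, 2));
```

Writing the literal first, before any tool exists, is one way to treat the manifest as the test list: each entry is an expectation the runner will later confirm or refute.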

The sync-check tool I want to build

The mechanized invariant should ship as a single TypeScript script. The interface I am drafting:

bun ~/Durante/Tools/sync-check.ts            # default: summary counts
bun ~/Durante/Tools/sync-check.ts --full     # per-file status table
bun ~/Durante/Tools/sync-check.ts --json     # machine-readable
bun ~/Durante/Tools/sync-check.ts --fix      # resolve drift: live → submodule
bun ~/Durante/Tools/sync-check.ts --fix --dry-run

# Exit codes: 0 = synced, 1 = drift detected, 2 = manifest error
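One way the flag handling behind that interface might look, as a small sketch rather than the eventual implementation. The flag names and exit codes are taken from the draft above; the names parseFlags, CliOptions, and EXIT_CODES are hypothetical, and two behaviors are assumptions: --json wins if both --json and --full are passed, and --dry-run without --fix is rejected.

```typescript
// Exit codes from the drafted interface.
const EXIT_CODES = { SYNCED: 0, DRIFT: 1, MANIFEST_ERROR: 2 } as const;

interface CliOptions {
  mode: "summary" | "full" | "json";
  fix: boolean;
  dryRun: boolean;
}

const KNOWN_FLAGS = new Set(["--full", "--json", "--fix", "--dry-run"]);

// Parse the arguments that follow the script path.
function parseFlags(argv: string[]): CliOptions {
  for (const arg of argv) {
    if (!KNOWN_FLAGS.has(arg)) throw new Error(`unknown flag: ${arg}`);
  }
  const fix = argv.includes("--fix");
  const dryRun = argv.includes("--dry-run");
  // Assumption: a dry run is only meaningful as a modifier of --fix.
  if (dryRun && !fix) throw new Error("--dry-run requires --fix");
  // Assumption: --json takes precedence over --full; no flag means summary.
  const mode = argv.includes("--json")
    ? "json"
    : argv.includes("--full")
      ? "full"
      : "summary";
  return { mode, fix, dryRun };
}

console.log(parseFlags(["--fix", "--dry-run"]));
console.log(`drift exits with ${EXIT_CODES.DRIFT}`);
```

Rejecting unknown flags up front keeps a typo such as --fixx from silently running a read-only check when a fix was intended.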

The implementation should be mechanical. Walk every pair declared in the manifest. Compute sha256 of both sides. Compare. Honor aliases. Report drift. Optionally apply --fix in the canonical direction.
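The hash-and-compare core is small enough to sketch directly. This version uses Node's built-in crypto and fs modules, which Bun also provides; the function names are hypothetical, and the demo writes two throwaway files to a temp directory purely for illustration.

```typescript
import { createHash } from "node:crypto";
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hex-encoded sha256 of a string or byte buffer.
function sha256OfBytes(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hash a file's raw bytes, so encoding and line endings count as differences.
function sha256OfFile(path: string): string {
  return sha256OfBytes(readFileSync(path));
}

// Two copies are in sync only when their content hashes are equal.
function filesMatch(pathA: string, pathB: string): boolean {
  return sha256OfFile(pathA) === sha256OfFile(pathB);
}

// Demo: two identical copies and one that has drifted by a single line.
const demoDir = mkdtempSync(join(tmpdir(), "sync-check-"));
const livePath = join(demoDir, "live-hook.ts");
const submodulePath = join(demoDir, "submodule-hook.ts");
const driftedPath = join(demoDir, "drifted-hook.ts");
writeFileSync(livePath, "export const hook = 1;\n");
writeFileSync(submodulePath, "export const hook = 1;\n");
writeFileSync(driftedPath, "export const hook = 2;\n");

const identicalCopiesMatch = filesMatch(livePath, submodulePath);
const driftedCopyMatches = filesMatch(livePath, driftedPath);
console.log({ identicalCopiesMatch, driftedCopyMatches });
```

Hashing raw bytes rather than decoded text matches the byte-identical requirement stated earlier: a changed line ending is treated as drift, not normalized away.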

What a single sync-check run will do (planned)

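  1. Load .dos-sync-manifest.json and validate its shape (pairs[], aliases{}, live_only[], fix_direction).
  2. For each declared pair: resolve both paths, mapping names through aliases where a file is named differently on each side; if either side is missing and is not listed in live_only, flag it and continue.
  3. Compute the sha256 hash of each side. Compare.
  4. If the hashes match → SYNCED.
  5. If the two sides differ only in file name and that mapping is declared in aliases → still SYNCED. An alias excuses the name difference, never a content difference.
  6. If the hashes differ → DRIFT. Record the non-canonical side (the submodule copy under the default fix_direction) as the stale path.
  7. After all pairs are walked: emit a summary (counts) or a per-file table (--full).
  8. If --fix is set: for each DRIFT entry, copy the canonical side over the stale side (live → submodule by default). With --dry-run, print the planned copies without writing anything.
  9. Exit 0 if no drift, 1 if drift detected, 2 if the manifest is invalid.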
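The walk above can be sketched as pure functions over an injected hash lookup, which keeps the logic testable without touching the filesystem. Everything here is a sketch under stated assumptions: all type and function names are hypothetical, `a` is taken to be the live side and `b` the submodule side, the stale copy is taken to be the non-canonical side for the chosen fix direction, alias resolution is assumed to have happened before the pairs reach this code, and a missing file is counted as non-synced for the exit code.

```typescript
type Status = "SYNCED" | "DRIFT" | "MISSING";
type FixDirection = "live_to_submodule" | "submodule_to_live";

interface Pair { a: string; b: string } // assumption: a = live, b = submodule
interface PairResult { pair: Pair; status: Status; stale?: string }
interface CopyOp { from: string; to: string }

// `hashOf` returns a content hash, or null when the path does not exist.
function checkPairs(
  pairs: Pair[],
  direction: FixDirection,
  hashOf: (path: string) => string | null,
): PairResult[] {
  return pairs.map((pair): PairResult => {
    const hashA = hashOf(pair.a);
    const hashB = hashOf(pair.b);
    if (hashA === null || hashB === null) return { pair, status: "MISSING" };
    if (hashA === hashB) return { pair, status: "SYNCED" };
    // The stale copy is whichever side the fix direction would overwrite.
    const stale = direction === "live_to_submodule" ? pair.b : pair.a;
    return { pair, status: "DRIFT", stale };
  });
}

// Summary counts for the default output mode.
function summarize(results: PairResult[]): Record<Status, number> {
  const counts: Record<Status, number> = { SYNCED: 0, DRIFT: 0, MISSING: 0 };
  for (const result of results) counts[result.status] += 1;
  return counts;
}

// What --fix --dry-run would print: planned copies, nothing written.
function planFix(results: PairResult[], direction: FixDirection): CopyOp[] {
  return results
    .filter((result) => result.status === "DRIFT")
    .map((result): CopyOp =>
      direction === "live_to_submodule"
        ? { from: result.pair.a, to: result.pair.b }
        : { from: result.pair.b, to: result.pair.a });
}

// Assumption: anything other than all-SYNCED exits 1.
function exitCode(results: PairResult[]): 0 | 1 {
  return results.every((result) => result.status === "SYNCED") ? 0 : 1;
}

// Demo with in-memory hashes: one synced, one drifted, one missing.
const fakeHashes = new Map<string, string>([
  ["live/hook.ts", "h1"], ["sub/hook.ts", "h1"],
  ["live/skill.md", "h2"], ["sub/skill.md", "h3"],
  ["live/tool.ts", "h4"],
]);
const demoResults = checkPairs(
  [
    { a: "live/hook.ts", b: "sub/hook.ts" },
    { a: "live/skill.md", b: "sub/skill.md" },
    { a: "live/tool.ts", b: "sub/tool.ts" },
  ],
  "live_to_submodule",
  (path) => fakeHashes.get(path) ?? null,
);
console.log(summarize(demoResults));
console.log(planFix(demoResults, "live_to_submodule"));
console.log(exitCode(demoResults));
```

Separating the plan from the copy step means --dry-run and --fix share one code path, and only the final write differs.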

I expect the tool to come in around 300-400 lines of TypeScript. The manifest probably 100-200 lines of JSON to start. Together they should replace the future "did I remember to copy that?" question with a single command that exits non-zero if I forgot.

I plan to write this in the next two to four weeks, before there is enough multi-copy state to make manual sync painful. The bet is that I will get it wrong on first draft, but the cost of getting it wrong before I have ten skills in three locations is much lower than the cost of getting it wrong after.

What I think this teaches about duplication, in advance

Three lessons I want to commit to, before I have fully earned them:

What the manifest should contain (drafted)

For operators who want to build something analogous, here is the schema I am committing to:

| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| pairs | array of {a, b} | yes | [] | Each pair is two paths that should hash-match. Walked left-to-right; if a is canonical, b should equal a. |
| aliases | object {a_path: b_path} | no | {} | Declared naming differences. The two files have different names but the same content; sync still verifies content equality. |
| live_only | array of paths | no | [] | Files that intentionally exist only on the live side and should never trigger 'missing from submodule' alarms. |
| submodule_only | array of paths | no | [] | Mirror of live_only for the other side. Almost never used. |
| fix_direction | live_to_submodule or submodule_to_live | yes | live_to_submodule | Encoded canonicality. The --fix flag copies in this direction only. |
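A shape check for this schema might look like the following sketch. It returns a list of error strings, so a caller could print them and exit with code 2 when the list is non-empty. The function names are hypothetical, and it follows the table literally in treating pairs and fix_direction as required even though defaults are listed for both.

```typescript
const FIX_DIRECTIONS = ["live_to_submodule", "submodule_to_live"];

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Returns human-readable problems; an empty array means the shape is valid.
function validateManifest(raw: unknown): string[] {
  if (!isPlainObject(raw)) return ["manifest must be a JSON object"];
  const errors: string[] = [];

  const pairs = raw["pairs"];
  if (!Array.isArray(pairs)) {
    errors.push("pairs: required, must be an array of {a, b}");
  } else {
    pairs.forEach((pair: unknown, index: number) => {
      if (!isPlainObject(pair) || typeof pair["a"] !== "string" || typeof pair["b"] !== "string") {
        errors.push(`pairs[${index}]: expected {a: string, b: string}`);
      }
    });
  }

  const aliases = raw["aliases"];
  if (aliases !== undefined) {
    const valid = isPlainObject(aliases) &&
      Object.values(aliases).every((target) => typeof target === "string");
    if (!valid) errors.push("aliases: must map path strings to path strings");
  }

  for (const key of ["live_only", "submodule_only"]) {
    const value = raw[key];
    if (value !== undefined && !isStringArray(value)) {
      errors.push(`${key}: must be an array of paths`);
    }
  }

  const direction = raw["fix_direction"];
  if (typeof direction !== "string" || !FIX_DIRECTIONS.includes(direction)) {
    errors.push(`fix_direction: required, one of ${FIX_DIRECTIONS.join(" | ")}`);
  }
  return errors;
}

// Demo: a manifest that parses as JSON but is missing fix_direction.
const parsed: unknown = JSON.parse('{"pairs": [{"a": "x", "b": "y"}], "live_only": "oops"}');
console.log(validateManifest(parsed));
```

Collecting every problem instead of stopping at the first means one run reports everything wrong with a hand-edited manifest.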

What this implies if you have multi-copy artifacts (or will)

You probably already have them, even if you do not call them that. Examples I have seen in other codebases:

  • A monorepo with packages/foo/dist/ checked into git alongside packages/foo/src/ — two copies, one canonical
  • A backend service with API types duplicated in backend/types.ts and frontend/types.ts — two copies, one canonical
  • An internal SDK with three language bindings (TypeScript, Python, Go) generated from a single OpenAPI spec — three copies, one canonical
  • A documentation site that ingests README files from multiple package roots — N copies, N canonical

In every case, the question is the same: is the canonicality mechanically enforced or aspirationally requested? If it is aspirational ("we should remember to update both"), it will drift. If it is mechanical (a script that checks and exits non-zero), it will not.

The quiet lesson I want to commit to is the durable one: make the invariant mechanical, then forget about it. The bytes on disk can multiply freely as long as the relationship between them is enforced by something other than human attention.

I have not lost a hook to multi-copy drift yet. I would like to keep it that way by writing the tool before I have copies to lose.

— Lucas

