I am writing this post the week before I start building the memory layer of DOS.
Worth saying explicitly at the top, because the temptation in design essays is to write as if the system already exists. It does not. As of today, November 3, 2025, my private DOS experiment is roughly eight weeks old (first commit September 9, 2025). It loads my CLAUDE.md at session start, runs a handful of small skills, and that is approximately the entire scaffolding. There is no memory layer yet. There is a directory in the repo called MEMORY/ that I have started filling with markdown notes from my own work, but the queryable layer — the one this post is about — has not been written.
This post is the design essay. I am writing it now so I can come back in six months and tell you what I got right and what I got wrong.
The first version of every SaaS pitch deck I have read in the last decade contains some variant of this slide:
Our data moat compounds. The longer customers use us, the more painful it becomes to leave.
This was true. I do not think it is the right shape of moat anymore.
It was true when "your data" meant a normalized PostgreSQL schema that competitors could not parse without writing a custom ETL. It is no longer true when a frontier model can read your entire export — invoices, conversation logs, internal wiki, Slack archive — and synthesize a working theory of your business in an afternoon. Export friction has approached zero. The moat that depended on it is gone.
What I want to design in its place — what DOS is being built around — is something I am calling the memory substrate.
The reframe in one line
The moat is not the data you hold. It is the co-experience the agent has accumulated alongside the operator. That co-experience is portable in principle and almost worthless to copy in practice — if you build the substrate that captures it well.
What "memory" is going to mean in DOS
The word "memory" is doing a lot of work in this thesis. Let me unpack it before I commit to building it.
When DOS finally has a memory layer (target: late November / early December 2025), I want it to be four distinct layers, each with a different lifetime, schema, and access pattern.
| Layer | What it stores | Lifetime | Backed by (planned) |
|---|---|---|---|
| Working memory | The current conversation turn | Single response | Context window |
| Session memory | The current conversation thread | Hours to days | Claude Code session log |
| Episodic memory | Past conversations, decisions, corrections | Years | Markdown files in MEMORY/WORK, MEMORY/LEARNING |
| Semantic memory | Distilled facts, relationships, beliefs | Permanent | A SQLite knowledge graph I have not yet written |
The first two layers exist in every Claude Code session today, for free. The third I have started, by hand, this past week — every PRD I write goes into MEMORY/WORK/{YYYYMMDD}-{slug}/PRD.md. The fourth is the one this design essay is about. It is the moat candidate.
The Eric Evans angle: a Bounded Context for the operator
I have spent the last six months reading and re-reading the Domain-Driven Design Blue Book. Eric Evans's central claim — that the unit of useful modeling is the Bounded Context, with its own ubiquitous language and consistent meaning of terms — is the right frame for what I am about to build, even though Evans himself was talking about cross-team systems and not personal AI.
Each operator is a Bounded Context. The terms "client," "deal," "ship," "review," "blocker" mean specific things inside the context of a single operator's working life. A "client" in my context might be a Brazilian SMB owner. A "client" in another operator's context might be a Fortune 500 CISO. The agent has to learn the operator's ubiquitous language — not generic English — to be useful.
Generic LLM memory vs. the Bounded-Context memory I want to build
| Generic LLM memory | Bounded-Context memory |
|---|---|
| Stores facts as flat text snippets ("user mentioned client called Acme") | Stores facts as triples in a knowledge graph (`Acme -- is_a -- client`, `Acme -- pays -- the founding-tier price`, `Acme -- decided_on -- 2025-11-12`) |
| Term meanings are inferred per query, inconsistently | Term meanings are pinned to the operator's lexicon |
| Cross-session facts have no relational structure | Relations are first-class: graph queries return connected facts |
| Context window pollution with low-relevance facts | Only contextually relevant subgraph loaded per request |
The implementation I am converging on: a SQLite file in ~/.claude/MEMORY/, a triple schema, a small bridge tool to query it, and a hook that fires after every tool result to capture new facts. I have not picked a final name. MemPalace is the working name in my notebook.
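A minimal sketch of what a depth-limited subgraph query could look like, using an in-memory array in place of the SQLite file. The `Triple` type, the `kgQuery` name, and the Globex rows are assumptions for illustration; nothing here is the final bridge tool.

```typescript
// One edge in the graph. The planned schema also carries timestamp,
// confidence, and source; they are omitted here to keep the walk readable.
type Triple = { subject: string; predicate: string; object: string };

// Collect every triple within `depth` hops of `entity`, treating edges as
// undirected. Breadth-first: each hop expands from the nodes found in the last.
function kgQuery(triples: Triple[], entity: string, depth = 2): Triple[] {
  const visited = new Set<string>([entity]);
  const found = new Set<Triple>();
  let frontier = new Set<string>([entity]);

  for (let hop = 0; hop < depth && frontier.size > 0; hop++) {
    const next = new Set<string>();
    for (const t of triples) {
      if (!frontier.has(t.subject) && !frontier.has(t.object)) continue;
      found.add(t);
      for (const node of [t.subject, t.object]) {
        if (!visited.has(node)) {
          visited.add(node);
          next.add(node);
        }
      }
    }
    frontier = next;
  }
  return Array.from(found);
}

const triples: Triple[] = [
  { subject: "Acme", predicate: "is_a", object: "client" },
  { subject: "Acme", predicate: "pays", object: "the founding-tier price" },
  { subject: "Acme", predicate: "decided_on", object: "2025-11-12" },
  { subject: "Globex", predicate: "is_a", object: "client" }, // hypothetical second client
  { subject: "Globex", predicate: "pays", object: "a standard price" }, // hypothetical
];

const nearAcme = kgQuery(triples, "Acme", 1); // only the three Acme triples
const widerView = kgQuery(triples, "Acme", 2); // also reaches "Globex is_a client"
```

Note what depth 2 does here: the shared `client` node pulls a second client into Acme's subgraph. A real version would probably need to stop expanding through type nodes like this, or the "only contextually relevant subgraph" goal is lost.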
The Greg Young angle: memory should be an event log, not a snapshot
Greg Young's argument for event sourcing — that the source of truth is the immutable append-only log of events, and current state is a left-fold of those events — is the second half of the architecture I want.
Most CRMs and project management tools store the current state. The Acme deal is in stage "Proposal Sent." The agreed price is the founding-tier price. The next action is "follow up Friday." When you query, you get the snapshot.
I want DOS to store the events. The Acme deal moved into "Proposal Sent" on a specific date, after a 47-minute call where the operator and the client discussed the substrate-vs-app distinction. The price was negotiated down from a higher initial figure to the founding-tier price after the operator agreed to deferred onboarding. The next action was set to "follow up Friday" because the client mentioned a board meeting on Thursday.
When you query the snapshot you get a status. When you query the event log you get the reasoning that produced the status. The latter is what makes the agent useful for tomorrow's call.
This is CQRS, applied to a personal context. The write side is the append-only log of every conversation, decision, correction, and outcome. The read side is the knowledge graph projection, optimized for semantic recall. The two are reconciled by replay on the rare occasions the projection drifts.
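The write side and read side can be sketched in a few lines. This is an illustration under stated assumptions, not the planned implementation: the event kinds, field names, and timestamps are hypothetical, and the projection is a plain `Map` rather than the knowledge graph.

```typescript
// Event shapes and field names are assumptions for illustration.
type DealEvent =
  | { kind: "stage_changed"; deal: string; at: string; stage: string; reason: string }
  | { kind: "price_agreed"; deal: string; at: string; price: string; reason: string }
  | { kind: "next_action_set"; deal: string; at: string; action: string; reason: string };

type DealSnapshot = { stage?: string; price?: string; nextAction?: string };

// Write side: events are only ever appended, never edited or removed.
class EventLog {
  private readonly events: DealEvent[] = [];
  append(event: DealEvent): void {
    this.events.push(event);
  }
  all(): readonly DealEvent[] {
    return this.events;
  }
}

// Read side: the snapshot per deal is a left fold over the whole log.
function project(events: readonly DealEvent[]): Map<string, DealSnapshot> {
  return events.reduce((state, e) => {
    const snap: DealSnapshot = { ...(state.get(e.deal) ?? {}) };
    if (e.kind === "stage_changed") snap.stage = e.stage;
    else if (e.kind === "price_agreed") snap.price = e.price;
    else snap.nextAction = e.action;
    state.set(e.deal, snap);
    return state;
  }, new Map<string, DealSnapshot>());
}

// The log, unlike the snapshot, can answer "why": return the reason attached
// to the most recent event of a given kind for a deal.
function latestReason(
  events: readonly DealEvent[],
  deal: string,
  kind: DealEvent["kind"],
): string | undefined {
  for (let i = events.length - 1; i >= 0; i--) {
    const e = events[i];
    if (e && e.deal === deal && e.kind === kind) return e.reason;
  }
  return undefined;
}

// Timestamps below are hypothetical.
const log = new EventLog();
log.append({ kind: "stage_changed", deal: "Acme", at: "2025-10-14T15:00:00Z",
  stage: "Proposal Sent", reason: "47-minute call on the substrate-vs-app distinction" });
log.append({ kind: "price_agreed", deal: "Acme", at: "2025-10-14T15:40:00Z",
  price: "the founding-tier price", reason: "operator agreed to deferred onboarding" });
log.append({ kind: "next_action_set", deal: "Acme", at: "2025-10-14T15:45:00Z",
  action: "follow up Friday", reason: "client has a board meeting on Thursday" });

const acme = project(log.all()).get("Acme");
const whyFriday = latestReason(log.all(), "Acme", "next_action_set");
```

Replay is just calling `project` again on the full log, which is why a drifted projection is cheap to rebuild as long as the log itself is intact.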
It is also, I am aware, about three abstractions stacked on top of one another for what is currently a one-person system. I am committing to the architecture in advance because I think it is the right shape if and when DOS scales beyond me — and I would rather pay the up-front cost than retrofit it later. That commitment may be wrong.
What this should give the operator that nothing else does
Three properties I am designing toward, in increasing order of strategic significance.
One: temporal queries. The operator should be able to ask "what did I decide about Acme's pricing last month, and why?" and get a faithful answer with the call transcript citation, the reasoning at the time, and the subsequent revisions. No CRM does this because no CRM stores the events that produced the state.
Two: behavioral pattern recognition. Across many months and many facts, the agent should be able to learn that I tend to over-promise on timelines in week three of a sprint, that I correct AI-generated copy three times before approving it, that I refuse meetings before 10am. These patterns are not stated anywhere. They should be induced from the event log.
Three: portable identity. This is the moat candidate. If I switch substrate models in 18 months — Anthropic to OpenAI to a self-hosted Llama — the new agent should inherit the knowledge graph and the event log. Within minutes it knows everything the old agent knew. Within a week of operation it has updated its working theory. The substrate is replaceable. The memory is the operator's permanent asset.
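The temporal query in the first property is the easiest of the three to make concrete. A minimal sketch, assuming decisions are logged with an entity, a topic, a timestamp, a reason, and a source session ID; the `Decision` type, the `decisionsBetween` name, and all dates and session IDs are hypothetical.

```typescript
// A decision event as it might sit in the log. Field names are assumptions.
type Decision = {
  entity: string;
  topic: string;
  at: string; // ISO 8601 timestamp
  decision: string;
  reason: string;
  source: string; // session ID, so an answer can cite where it came from
};

// Return decisions about one entity and topic inside [from, to), oldest first,
// so later entries read as revisions of earlier ones.
function decisionsBetween(
  log: readonly Decision[],
  entity: string,
  topic: string,
  from: string,
  to: string,
): Decision[] {
  const start = Date.parse(from);
  const end = Date.parse(to);
  return log
    .filter((d) => d.entity === entity && d.topic === topic)
    .filter((d) => {
      const t = Date.parse(d.at);
      return t >= start && t < end;
    })
    .sort((a, b) => Date.parse(a.at) - Date.parse(b.at));
}

// Hypothetical log entries; the dates and session IDs are invented.
const decisionLog: Decision[] = [
  { entity: "Acme", topic: "pricing", at: "2025-10-21T16:00:00Z",
    decision: "settle on the founding-tier price",
    reason: "operator agreed to deferred onboarding", source: "session-0142" },
  { entity: "Acme", topic: "pricing", at: "2025-10-07T14:00:00Z",
    decision: "open with a higher initial figure",
    reason: "first proposal", source: "session-0118" },
  { entity: "Acme", topic: "pricing", at: "2025-09-15T09:00:00Z",
    decision: "hold off on quoting", reason: "scope unclear", source: "session-0097" },
  { entity: "Acme", topic: "onboarding", at: "2025-10-21T16:05:00Z",
    decision: "defer onboarding", reason: "part of the price agreement", source: "session-0142" },
];

// "What did I decide about Acme's pricing last month, and why?"
const october = decisionsBetween(
  decisionLog, "Acme", "pricing", "2025-10-01T00:00:00Z", "2025-11-01T00:00:00Z",
);
```

Each returned entry carries its own `reason` and `source`, which is what lets the answer include the reasoning at the time and a citation rather than only the final state.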
That is the bet. I have not yet earned the right to say it works.
Why I think this will be hard to copy
A competitor cannot replicate the moat by buying my export. They would need to run the same months of conversations, with me, again. The moat is temporal, not informational. Time is the one resource a competitor cannot recover. If the design works.
The implementation, planned
For operators following along — and for myself, in six months when I want to remember what I committed to — here is the design as I see it today.
| Component | Mechanism | Required | Default | Description |
|---|---|---|---|---|
| Storage engine | SQLite | yes | ~/.claude/MEMORY/knowledge_graph.sqlite3 | Local-first. No network round-trip. The simplest queryable substrate I trust. |
| Schema | Triples + temporal | yes | (subject, predicate, object, timestamp, confidence, source) | Six-tuple. Source is the session ID for traceability. |
| Query interface | Bridge tool | yes | kg_query(entity, depth=2) | Returns subgraph at depth N around the named entity. Working name only. |
| Capture trigger | Hook | yes | PostToolUse + UserPromptSubmit | Hooks fire on every tool result and prompt; capture worker batches and writes. |
| Operator override | CLAUDE.md rules | yes | User-owned constitution | Operator can declare what to remember, forget, or always-recall. |
The total moving parts I expect: one SQLite file, two hooks, one bridge tool, one capture worker. Probably under 800 lines of TypeScript. The operational complexity should be low. The strategic significance — if the bet is right — is high.
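Of those parts, the capture worker is the one whose behavior is least obvious from the table. A minimal sketch of the batching idea, assuming facts arrive one at a time and are written in groups: the `Fact` field types, the `CaptureWorker` class, and the `write` callback (standing in for the SQLite insert) are all hypothetical, and how the hooks hand payloads to the worker is left out.

```typescript
// The six-tuple from the schema row. Field types are assumptions.
type Fact = {
  subject: string;
  predicate: string;
  object: string;
  timestamp: string; // ISO 8601
  confidence: number; // assumed to be 0..1
  source: string; // session ID
};

// Buffers captured facts and hands them to `write` in batches, so a burst of
// hook events produces a few writes instead of one write per event.
class CaptureWorker {
  private buffer: Fact[] = [];
  private readonly write: (batch: Fact[]) => void;
  private readonly batchSize: number;

  constructor(write: (batch: Fact[]) => void, batchSize = 10) {
    this.write = write;
    this.batchSize = batchSize;
  }

  // Called once per extracted fact; flushes when the buffer is full.
  capture(fact: Fact): void {
    this.buffer.push(fact);
    if (this.buffer.length >= this.batchSize) this.flush();
  }

  // Writes whatever is buffered. Call at session end so nothing is dropped.
  flush(): void {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.write(batch);
  }
}

// Usage with an in-memory sink in place of the database.
const written: Fact[][] = [];
const worker = new CaptureWorker((batch) => { written.push(batch); }, 2);

const makeFact = (object: string): Fact => ({
  subject: "Acme", predicate: "mentioned", object,
  timestamp: "2025-11-03T10:00:00Z", confidence: 0.8, source: "session-0001",
});

worker.capture(makeFact("board meeting on Thursday"));
worker.capture(makeFact("deferred onboarding")); // buffer full: first batch written
worker.capture(makeFact("follow up Friday"));
worker.flush(); // remaining fact written as a second batch
```

The explicit `flush` matters: with size-only batching, the last few facts of a session would sit in memory and be lost if the process exits.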
The four objections I already see coming
Writing these down now, before I lose objectivity by being inside the build:
Open problems I have not yet solved
The four hardest design questions, named honestly before the build.
- Forgetting. The graph will grow. Some facts will decay in relevance (a client who churns) but never auto-expire. Manual pruning is possible; automatic pruning risks losing facts that matter. I do not yet know the right policy.
- Privacy boundaries. Facts about other people will enter the graph through normal conversation ("Marina is the CTO at Acme"). The operator owns the graph but third parties have not consented. Local-first storage is the partial answer; a stronger one will require thinking I have not yet done.
- Confidence decay. Facts captured many months ago may be obsolete (a client's pricing tier changed) without the agent knowing to refresh. I plan to attach `last_confirmed` timestamps and prompt re-verification on facts older than ~90 days. That heuristic is untested.
- Cross-context bleed. Facts from one operational context (DOS work) may appear in answers about a different context (Altyaa work). I plan to tag facts with project context at capture time. Whether that solves it cleanly is an open question.
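The last two items are concrete enough to sketch. This is a minimal illustration, assuming each fact carries a last-confirmed timestamp and a single project tag; the `StoredFact` type, the `isStale` and `recall` names, and the sample facts are hypothetical, and the 90-day threshold is the untested heuristic from the list.

```typescript
type StoredFact = {
  subject: string;
  predicate: string;
  object: string;
  lastConfirmed: string; // ISO 8601; mirrors the planned last_confirmed field
  context: string; // project tag assigned at capture time
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// A fact is stale when it was last confirmed more than `maxAgeDays` ago.
function isStale(fact: StoredFact, now: Date, maxAgeDays = 90): boolean {
  const ageMs = now.getTime() - Date.parse(fact.lastConfirmed);
  return ageMs > maxAgeDays * MS_PER_DAY;
}

// Keep only facts tagged with the requested context, then split them into
// those safe to use and those the agent should ask the operator to re-verify.
function recall(facts: readonly StoredFact[], context: string, now: Date) {
  const inContext = facts.filter((f) => f.context === context);
  return {
    fresh: inContext.filter((f) => !isStale(f, now)),
    needsReverification: inContext.filter((f) => isStale(f, now)),
  };
}

// Hypothetical facts; dates are invented.
const stored: StoredFact[] = [
  { subject: "Acme", predicate: "pays", object: "the founding-tier price",
    lastConfirmed: "2025-07-01T00:00:00Z", context: "DOS" },
  { subject: "Marina", predicate: "is_cto_at", object: "Acme",
    lastConfirmed: "2025-10-20T00:00:00Z", context: "DOS" },
  { subject: "landing page", predicate: "status", object: "in review",
    lastConfirmed: "2025-10-30T00:00:00Z", context: "Altyaa" },
];

const recalled = recall(stored, "DOS", new Date("2025-11-03T00:00:00Z"));
```

A single tag per fact is the simplest version and probably too strict: a fact like "Marina is the CTO at Acme" could be relevant to more than one context, which a hard filter like this would hide.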
None of these will block shipping the first version. All of them will require iteration. I name them now so that future-me has a baseline for what I expected vs. what surprised me.
What this implies for the next generation of products
If memory is the moat — if — the design implications are sharp.
Build for memory from day one. Treat session history as a first-class strategic asset. Make the export trivially easy — because an export is no longer the moat. The moat is the months of co-experience that no export can package.
If you do this right, customer "lock-in" should stop feeling adversarial. Operators stay because leaving means losing time, not because leaving is artificially expensive. That is a healthier business shape. It also happens to be the only one I think survives when frontier models commoditize the underlying capability — and as I write this on November 3, 2025, the substrate is commoditizing visibly week by week.
Memory is the substrate I want to build. Everything else — the dashboards, the integrations, the workflow automations — is replaceable. I am building for the substrate.
I will report back when the first version of this layer is real. Per current planning, that is sometime in the next four to six weeks. The post will tell you what worked and what I had to throw away.
— Lucas