27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.

Mar 17, 2026

MCP in Production: What Works When the Protocol Becomes the Boundary

Studio has been routing MCP-shaped tool calls in production for sixty-seven days. The protocol holds where I expected it to and leaks where I did not. This is the evergreen I want on the record before the next quarter of building, with Cockburn's hexagonal lens and Fowler's anti-corruption layer holding the framing. What MCP buys you in production, what it does not, and the three places I have already had to write provider-specific code despite the protocol's promise.

I am writing this on a Tuesday in mid-March, twenty-seven weeks and roughly two hundred and seventy-five commits into building DuranteOS. Studio has been live for sixty-seven days. The Hexagonal migration of the gateway is essentially complete. The first MCP-shaped tool calls were running through Studio inside the first two weeks after launch; by today the gateway has handled enough MCP traffic that I can speak honestly about what the protocol delivers in production and what I had hoped it would deliver and was wrong about.

This is the first new evergreen in the series: the prior twenty rewrote existing posts, and the next seven write to the timeline as it actually develops. I am starting with MCP because the recurring lesson of the last eight dispatches has been that the protocol is the boundary. Microsoft defaulting to Claude. ServiceNow Build Agent. Cursor's first-class /mcp slash command. Figma shipping parallel Claude Code and Codex MCP integrations eight days apart. The Anthropic Skills directory. Each of those moves only works if MCP is genuinely the contract, an abstraction stable enough that integrations on either side of it survive vendor-by-vendor variation.

Sixty-seven days of production traffic later, my answer is: mostly yes, with three named leaks that an indie founder should plan for in advance.

The thesis in one sentence

MCP works in production as a hexagonal boundary in the strict Cockburn sense — but only when paired with a Fowler-style anti-corruption layer per provider. Without the ACL, the protocol's abstractions leak in three predictable places. With the ACL, MCP delivers most of what its proponents claim.

The Cockburn angle: MCP as a port, every server as a driven adapter

Alistair Cockburn's 2005 article on Hexagonal Architecture is twenty-one years old this year. The framing it offers is the cleanest lens I know on what MCP actually is: a driven port on the agent runtime side, with each MCP server playing the role of a driven adapter that supplies tools, prompts, and resources through the port's defined contract. The runtime calls out through the port and the server answers; that direction of control is what makes the adapters driven rather than driving.

The shape of the protocol matches the pattern almost exactly. The agent runtime declares a stable interface (tools/list, tools/call, resources/read, prompts/get). Each MCP server implements the interface. The runtime does not need to know whether the server is talking to GitHub, Figma, a SQLite database, or a custom internal tool. From the runtime's perspective, every server is the same shape.
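A minimal sketch of that port in Python, assuming a hypothetical `ToolPort` protocol and two toy in-memory servers. The two methods mirror `tools/list` and `tools/call`; the class names and the canned responses are illustrative stand-ins, not Studio's code.

```python
from typing import Any, Protocol


class ToolPort(Protocol):
    """Hypothetical driven port: the only shape the runtime depends on."""

    def list_tools(self) -> list[dict[str, Any]]:
        """Mirrors tools/list."""
        ...

    def call_tool(self, name: str, arguments: dict[str, Any]) -> dict[str, Any]:
        """Mirrors tools/call."""
        ...


class FakeIssueServer:
    """Toy adapter standing in for an issue-tracker MCP server."""

    def list_tools(self) -> list[dict[str, Any]]:
        return [{"name": "list_issues", "inputSchema": {"type": "object"}}]

    def call_tool(self, name: str, arguments: dict[str, Any]) -> dict[str, Any]:
        return {"content": [{"type": "text", "text": "2 open issues"}], "isError": False}


class FakeSqliteServer:
    """Toy adapter standing in for a database MCP server."""

    def list_tools(self) -> list[dict[str, Any]]:
        return [{"name": "run_query", "inputSchema": {"type": "object"}}]

    def call_tool(self, name: str, arguments: dict[str, Any]) -> dict[str, Any]:
        return {"content": [{"type": "text", "text": "3 rows"}], "isError": False}


def tool_names(server: ToolPort) -> list[str]:
    """Runtime-side code: works for any adapter that satisfies the port."""
    return [tool["name"] for tool in server.list_tools()]


def first_text(server: ToolPort, tool: str) -> str:
    """Calls a tool and reads the first text item, without knowing the upstream."""
    result = server.call_tool(tool, {})
    return result["content"][0]["text"]
```

Neither `tool_names` nor `first_text` mentions an issue tracker or a database, which is the property the pattern is after.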

That is exactly what Cockburn urged designers toward twenty years ago: an application that does not know what is on the other side of its ports.

The practical payoff in Studio is a single concrete property. Adding a new MCP server does not require any change to the agent runtime, the routing layer, or the credit-metering logic. It is a registry edit and a startup-time discovery call. Sixty-seven days of production has confirmed this property holds — I have added five MCP servers since launch with zero changes to the gateway core. That is the cleanest possible evidence that the abstraction holds under load.
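As a rough sketch of "a registry edit and a startup-time discovery call", assuming a hypothetical `Registry` class and invented server and tool names: adding a server is one `register()` line, and the catalog is rebuilt at startup without touching anything else.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Registry:
    """Hypothetical adapter registry; the gateway core never names a server."""

    _listers: dict[str, Callable[[], list[str]]] = field(default_factory=dict)

    def register(self, server_name: str, list_tools: Callable[[], list[str]]) -> None:
        """The 'registry edit': one call per server."""
        self._listers[server_name] = list_tools

    def discover(self) -> dict[str, str]:
        """Startup-time discovery: map each namespaced tool to its owning server."""
        catalog: dict[str, str] = {}
        for server_name, list_tools in self._listers.items():
            for tool in list_tools():
                catalog[f"{server_name}.{tool}"] = server_name
        return catalog


registry = Registry()
# Each lambda stands in for a real tools/list round trip.
registry.register("github", lambda: ["list_issues", "create_issue"])
registry.register("notion", lambda: ["read_block"])
catalog = registry.discover()
```

Namespacing tool names by server is a design choice made for this sketch; it keeps two servers that both expose a `search` tool from colliding.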

The Fowler angle: every adapter needs an anti-corruption layer

Where MCP fails as a strict Cockburn-style port is at the adapter edge of each MCP server. That is where I needed the anti-corruption layer, the pattern Eric Evans named in Domain-Driven Design (2003) and a close cousin of the Gateway pattern in Martin Fowler's Patterns of Enterprise Application Architecture (2002).

The protocol abstracts the transport and the contract shape. It does not abstract the behavior of any specific upstream service. A GitHub MCP server returns issues in a shape MCP defines. The shape is consistent. The behavior (when the issues are stale, which fields are reliable, what rate limits apply, what authorization scopes the token actually grants) is GitHub-specific and leaks through the protocol no matter how clean the shape is.

The pattern puts a small translation layer inside each adapter that converts between the upstream's actual semantics and the application's clean abstraction. In MCP-land, that means each MCP server I run benefits from a thin ACL between the raw provider behavior and the protocol-shape responses.

Layer	What MCP gives you	What you have to add
Transport	Stable JSON-RPC framing	Nothing
Contract shape	Stable tool schema	Nothing
Tool semantics	Loose suggestions only	Per-provider ACL
Error semantics	Some standardization	Per-provider error mapping
Auth scope semantics	Out of scope	Per-provider scope translation

The discipline is that you write the ACL once per provider and it pays back across every tool call that touches that provider. Without it, the agent runtime ends up handling provider-specific error shapes, provider-specific auth quirks, and provider-specific staleness assumptions inside what should be vendor-neutral logic.
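A minimal sketch of that translation layer, assuming two invented upstream payload shapes (labelled in the comments) and a hypothetical `Issue` type on the application side. The real payloads of any given provider will differ; the point is only that provider vocabulary stops at the translator.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Issue:
    """The application's clean shape; no provider vocabulary leaks past here."""

    id: str
    title: str
    is_open: bool


def from_provider_a(raw: dict[str, Any]) -> Issue:
    # Assumed payload: {"number": 12, "title": "...", "state": "open" | "closed"}
    return Issue(id=str(raw["number"]), title=raw["title"], is_open=raw["state"] == "open")


def from_provider_b(raw: dict[str, Any]) -> Issue:
    # Assumed payload: {"identifier": "ENG-7", "name": "...", "completedAt": str | None}
    return Issue(
        id=raw["identifier"],
        title=raw["name"],
        is_open=raw.get("completedAt") is None,
    )


issue_a = from_provider_a({"number": 12, "title": "Fix login", "state": "open"})
issue_b = from_provider_b({"identifier": "ENG-7", "name": "Fix login", "completedAt": None})
```

Both results answer `is_open` the same way even though one upstream encodes it as a state string and the other as the absence of a timestamp.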

The three leaks I have hit in sixty-seven days

Three concrete places where MCP's promise of vendor-portability genuinely leaked, and what the ACL pattern looks like for each.

Three named MCP leaks and the ACL response

Tool-result staleness varies by upstream. The GitHub MCP server returns issues fresh on every call. The Linear MCP server returns issues from a 30-second cache. The Notion MCP server returns blocks that may be eventually consistent for up to two minutes after a write. None of this is in the protocol contract; all of it matters at the application layer. ACL response: each adapter declares its staleness contract in metadata, and Studio's agent runtime degrades gracefully when an operator's "verify the change landed" query falls inside an upstream's staleness window.
Authorization-scope semantics are provider-specific. GitHub's repo scope grants different permissions than Linear's Issues: Read+Write. The protocol does not surface this; both servers report success on tool installation. The leak shows up at the first failed write. ACL response: each adapter exposes a required_scopes() introspection method that Studio uses at startup to detect insufficient grants before they fail at runtime — pre-flighting what MCP itself does not.
Error semantics are not standardized below the surface. The protocol defines a few error codes. The actual error meaning is provider-specific. A 400 Bad Request from one provider means "your input was malformed"; from another it means "the resource you referenced no longer exists"; from a third it means "the rate limit for this token expired forty seconds ago and the provider has not communicated that elsewhere." ACL response: per-adapter error-mapping tables that translate the provider's idiosyncratic error shape into a small set of agent-runtime-meaningful categories (transient, permanent, auth-failed, malformed).
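The three ACL responses above can be sketched as three small functions. This is an illustration under assumptions: the provider names, error codes, and scope strings are invented, and the four categories are the ones named in the text.

```python
from enum import Enum


class ErrorCategory(Enum):
    TRANSIENT = "transient"
    PERMANENT = "permanent"
    AUTH_FAILED = "auth-failed"
    MALFORMED = "malformed"


# Hypothetical per-adapter tables keyed by (HTTP status, provider error code).
ERROR_MAPS: dict[str, dict[tuple[int, str], ErrorCategory]] = {
    "provider_a": {
        (400, "invalid_input"): ErrorCategory.MALFORMED,
        (400, "rate_limited"): ErrorCategory.TRANSIENT,
    },
    "provider_b": {
        (400, "not_found"): ErrorCategory.PERMANENT,
        (401, "bad_token"): ErrorCategory.AUTH_FAILED,
    },
}


def categorize(provider: str, status: int, code: str) -> ErrorCategory:
    """Translate a provider error; anything unmapped is treated as PERMANENT."""
    return ERROR_MAPS.get(provider, {}).get((status, code), ErrorCategory.PERMANENT)


def missing_scopes(required: set[str], granted: set[str]) -> set[str]:
    """Startup pre-flight: scopes the adapter needs that the token lacks."""
    return required - granted


def inside_staleness_window(seconds_since_write: float, staleness_seconds: float) -> bool:
    """True when a read-after-write may still return pre-write data."""
    return seconds_since_write < staleness_seconds
```

Defaulting unmapped errors to `PERMANENT` is a fail-fast choice made for this sketch; a runtime that prefers to retry unknown errors would default to `TRANSIENT` instead.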

These three leaks do not invalidate the protocol. They define the shape of work an indie founder needs to do on top of the protocol to make production deployments hold up. Every named refactoring in the hook-pipeline walkthrough had a similar shape: the abstraction was right, and the abstraction needed a small concrete companion to actually work.

What MCP actually buys you in production

Five properties I would not have without the protocol, named honestly, alongside five things it does not cover.

What MCP delivers vs. what it does not

What MCP does not deliver

Provider-specific behavioral semantics (staleness, auth scope, error meaning)
A stable abstraction over real-world upstream availability
Zero-cost vendor swaps (you still need the ACL per provider)
Automatic horizontal scaling (you still own the runtime)
A community-maintained registry of canonical tools (each MCP server is its own ecosystem decision)

What MCP genuinely delivers

A stable transport contract — adding a server does not require touching the runtime
A stable shape contract — every server's tools look the same to the agent
Vendor-portability across coding-agent surfaces — the same MCP server works for Claude Code, Cursor, Codex, Aider with the same configuration
A discoverable runtime — agents can introspect what tools are available without prior knowledge
A composable surface — one operator can run twenty MCP servers and the agent treats them as one unified toolset

The ratio matters. Five things delivered, five things not delivered. Both lists are long. The right framing — and the framing that took me sixty-seven days to internalize — is that MCP is a baseline, not a complete solution. Below the baseline, the protocol holds and saves you significant work. Above the baseline, you write the ACL per provider and the tooling that connects MCP semantics to your application's needs.

The two production patterns that emerged

A little over two months in, two patterns have repeatedly justified their cost.

Pattern one: thin adapters with declared contracts. Each MCP server I add gets a small wrapper file declaring its staleness, auth-scope, and error contracts as adapter-level metadata. The runtime reads this metadata at startup and adjusts behavior accordingly: for example, retrying once on a transient error from a server that declares 30-second staleness, but failing fast on a permanent error from the same server. The cost is one file per provider. The benefit is that the runtime's vendor-neutrality holds without losing the per-vendor nuances.
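A minimal sketch of pattern one, assuming a hypothetical `AdapterContract` dataclass and two illustrative exception types. The field names and the retry budget are invented for the example; only the behavior (retry once on transient, fail fast on permanent) comes from the description above.

```python
from dataclasses import dataclass
from typing import Callable


class TransientError(Exception):
    """Upstream hiccup that is worth retrying."""


class PermanentError(Exception):
    """Upstream failure that retrying will not fix."""


@dataclass(frozen=True)
class AdapterContract:
    """Hypothetical per-provider metadata, read once at startup."""

    staleness_seconds: float
    required_scopes: frozenset[str]
    max_transient_retries: int = 1


def call_with_contract(contract: AdapterContract, call: Callable[[], str]) -> str:
    """Retry transient failures up to the declared budget; let everything else propagate."""
    retries_left = contract.max_transient_retries
    while True:
        try:
            return call()
        except TransientError:
            if retries_left == 0:
                raise
            retries_left -= 1


# Illustrative contract for a tracker that serves reads from a 30-second cache.
CACHED_TRACKER = AdapterContract(
    staleness_seconds=30.0,
    required_scopes=frozenset({"issues:write"}),
)

calls = {"n": 0}


def flaky() -> str:
    """Fails once with a transient error, then succeeds."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise TransientError("upstream hiccup")
    return "ok"


result = call_with_contract(CACHED_TRACKER, flaky)
```

The runtime function never branches on a provider name; everything provider-specific arrives through the contract object.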

Pattern two: per-server eval suites for the ACL. Every MCP server gets a small eval suite — five to ten characterization tests — that exercises its tools against the agent runtime through the protocol. The suite catches ACL drift when an upstream's behavior changes. Without these suites, the leaks listed above would have surprised me at the worst possible moment (a customer call) instead of at a Tuesday-morning eval run.
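A sketch of what one such suite could look like with the standard `unittest` module. The `map_error` function is a toy stand-in for one adapter's ACL and its table is invented; a real suite would drive the tools through the protocol against recorded or live upstream responses rather than a pure function.

```python
import unittest


def map_error(status: int, code: str) -> str:
    """Toy ACL under test: a hypothetical error mapping for one provider."""
    table = {
        (400, "invalid_input"): "malformed",
        (404, "gone"): "permanent",
        (429, "slow_down"): "transient",
    }
    return table.get((status, code), "permanent")


class ErrorMappingCharacterization(unittest.TestCase):
    """Pins down today's observed behavior so drift shows up as a failing test."""

    def test_rate_limit_is_transient(self) -> None:
        self.assertEqual(map_error(429, "slow_down"), "transient")

    def test_bad_input_is_malformed(self) -> None:
        self.assertEqual(map_error(400, "invalid_input"), "malformed")

    def test_unknown_error_fails_fast(self) -> None:
        self.assertEqual(map_error(500, "never_seen"), "permanent")


def run_suite() -> unittest.TestResult:
    """Load and run the characterization tests, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ErrorMappingCharacterization)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Characterization tests assert what the upstream does today, not what it should do, so a red test means "the provider changed" before it means "the code is wrong".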

Both patterns are direct applications of the eval-suite essay from W16. Translated TDD applied to a protocol boundary is exactly the discipline MCP needs to hold in production. Without it, the boundary erodes silently the way every boundary erodes silently when no one tests it.

What this implies if you are building MCP servers

Three suggestions, in order of how much they have changed my own practice over sixty-seven days.

One. Treat the protocol as the floor, not the ceiling. MCP gives you transport and shape. You will need staleness, scope, and error semantics on top. Plan for it. Build the ACL pattern from the first server. The cost of doing it later, after the first leak surprises you in front of a customer, is much higher than doing it from day one.

Two. Write per-server eval suites. Each MCP server is a third-party integration with its own behavioral drift over time. The suite catches drift. Without the suite you will discover drift only when something breaks. Translated-TDD discipline applies here exactly.

Three. Bias toward writing your own MCP servers, not consuming third-party ones, until the third-party ecosystem matures. The ones I have run from third-party authors have averaged 3-4x the operational variance of the ones I have written myself. Some of that is inexperience on my side; most of it is that the third-party servers ship with implicit assumptions about their host environment that do not match mine. The ratio will improve as the protocol matures; for now, write what you can.

Why this evergreen is not a triumphalist post

MCP has been over-sold in some quarters, partly by Anthropic itself and partly by the broader ecosystem, as solving the vendor-portability problem completely. It does not. It solves the transport and shape problem completely, which is a meaningful and underrated win. The remaining work (staleness, scope, errors, eval discipline) is real and is what determines whether your production deployment holds. Anyone telling you MCP is the whole story has not run it in production. Anyone telling you MCP has no value is wrong in the other direction. The truthful version is the boring one: it is a strong floor that needs application-specific scaffolding above it.

What this means for Studio

Two design decisions hardened across sixty-seven days of production:

One. The Studio gateway treats every MCP server as a driven adapter in the Hexagonal sense. The Provider port from the W15 hexagonal essay is not different in shape from the MCP-server port — both are driven adapters supplying capability through a uniform contract. This made the adapter-pattern unification straightforward: a single registry of adapters, half of them frontier-model providers, half of them MCP servers. Each adapter declares its staleness, scope, and error contracts as metadata. The runtime treats them uniformly.

Two. The first DOS pack to ship as a public MCP server (Sentinel, sketched in W8) gets the full ACL treatment from day one, not retroactively. The lesson cost too much in Studio's first sixty-seven days to apply selectively going forward.

The three-layer model

MCP fits inside the Skills layer, exposed as Pack capability.

Ports + adapters in Studio

The Provider port that MCP integrates into.

The gateway architecture

Where MCP traffic gets metered.

Eval suites at the protocol boundary

The discipline that catches ACL drift.

The protocol-as-boundary thesis I have been sketching since November is now stress-tested by sixty-seven days of production. The thesis holds. The discipline that makes it hold — anti-corruption layers per provider, eval suites per server, treating the protocol as a floor rather than a ceiling — is the work the original Cockburn-and-Fowler reading list has been preparing me for since I picked it up two decades ago.

The protocol is the boundary. The ACL is the work. Both together are how MCP earns its keep in production.

The retrospective on the back half of the year ships at the end of June, after another quarter of MCP traffic has either confirmed or revised these lessons. The discipline, by then, will either be vindicated or replaced. Either result is more useful than continuing to operate without commitment to the shape.

The 27-week arc · A single body of work

Twenty-seven weeks. Two posts a week. Six months of writing while building.

Week	Tuesday evergreen	Friday dispatch
W01	2025-10-28 · The Agent Is the Product: Why Intelligence Replaces Interface	2025-10-31 · Wrapper Wars: The Week Cursor, Windsurf, and MiniMax All Went Vertical
W02	2025-11-04 · Memory Replaces Lock-In: Designing the Substrate of Personal AI	2025-11-07 · The Week the Substrate Caught Up: Kimi K2 Thinking Lands
W03	2025-11-11 · Why I Just Left a Steady Practice to Build a Personal AI Operating System	2025-11-14 · The Week the Harness Ate the Model
W04	2025-11-18 · The Decomposition Discipline I Am Trying to Codify Inside DOS	2025-11-21 · Antigravity, Gemini 3, Grok 4.1: The Week the Frontier Re-Rendered
W05	2025-11-25 · Four Copies, One Source of Truth: The Sync Pattern I Want to Commit to Before It Hurts	2025-11-28 · Opus 4.5 Lands on Thanksgiving Week: Anthropic's Quiet Counter-Punch
W06	2025-12-02 · Plugin Architecture for Hooks: The Pattern I Want Before the Hook Becomes a God-Function	2025-12-05 · The Runtime Is the Moat Now: Anthropic Buys Bun, Claude Code Hits a Billion
W07	2025-12-09 · Credit-Metered API Gateways: The Pattern I Want Before I Have Anything to Meter	2025-12-12 · Three Frontier Models in 23 Days: Stop Picking a Winner, Start Picking a Router
W08	2025-12-16 · Sentinel, Sketched: Convention-Driven Onboarding Before I Build It	2025-12-19 · The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint
W09	2025-12-23 · Memory, Sketched: The Knowledge Graph I Was Designing Before MemPalace Shipped	2025-12-26 · Christmas Wasn't Quiet: Open Weights Caught Up, Nvidia Bought Inference, Frontier Labs Bought Loyalty
W10	2025-12-30 · Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It	2026-01-02 · The Practitioners' Year-End: While the West Took PTO, DeepSeek Shipped Architecture
W11	2026-01-06 · Altyaa's Wedge: The Brazilian-Portuguese SMB Bet Studio Is Pointed At	2026-01-09 · Default Claude: Microsoft Flips the Switch as the Substrate Distributes by Default
W12	2026-01-13 · Builder's Compass: Two Years In, ~3,800 Subscribers, and What I've Learned Teaching Architecture	2026-01-16 · The Agent Surface Splits in Two: Anthropic Goes Vertical While China Goes Substrate
W13	2026-01-20 · Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes	2026-01-23 · The Rules Layer Solidifies: Constitution, Hardware, and Export Controls Land in One Week
W14	2026-01-27 · The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs	2026-01-30 · Vertical Integration Eats Horizontal AI: Maia, Meta's Capex, and Anthropic-ServiceNow
W15	2026-02-03 · Hexagonal in Practice: The Ports and Adapters I'm Pulling Studio Toward	2026-02-06 · Coding Becomes the Flagship: Opus 4.6 and GPT-5.3-Codex Land the Same Day
W16	2026-02-10 · TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists	2026-02-13 · The Bifurcation Hardens: Anthropic's $380B Meets China's Open-Weights Frontier
W17	2026-02-17 · Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into	2026-02-20 · The Sonnet Window: Free-Tier Skills, Code Security, and the Race for Agentic Primitives
W18	2026-02-24 · Skills, Packs, and Hooks: The Three-Layer Model I'm Pulling DOS Toward	2026-02-27 · The Moat Above the Model: Distillation, Mobile Coding, and MCP Battlegrounds
W19	2026-03-03 · The Knowledge Portfolio for an Indie Founder Building in Public, Audited at Week 25	2026-03-06 · Procurement Is the New Benchmark: OpenAI Wins the DoW While Anthropic Gets Banned
W20	2026-03-10 · Six Months of DOS: What Changed, What Endured, What Surprised	2026-03-13 · The Counter-Sovereignty Move: Anthropic Sues, Then Diversifies
W21	2026-03-17 · MCP in Production: What Works When the Protocol Becomes the Boundary	2026-03-20 · Cursor Becomes a Model Lab: Composer 2 and the Unbundling at the Agent Layer
W22	2026-03-24 · Routing by Sovereignty Class: The Architecture That Survives the Procurement Decade	2026-03-27 · The Injunction the Pentagon Won't Honor: Anthropic Wins, the State Defies
W23	2026-03-31 · Sentinel's First Scan: What the Convention Catalog Found Across Eleven Projects	2026-04-03 · The Agent Stack Swallows the IDE, the IPO, and the Courtroom
W24	2026-04-07 · Forking MemPalace: The 48-Hour Integration Retro	2026-04-10 · Project Glasswing: When the Frontier Bifurcates on Safety
W25	2026-04-14 · The 90-Day Open-Weights Bakeoff: What Actually Routes Where in DOS Today	2026-04-17 · Routines Eat the Workflow: The Harness Becomes the Product
W26	2026-04-21 · The Substrate Portability Pact: What DOS Refuses to Couple to, Even as the Harness Consolidates	2026-04-24 · The Substrate Signs a Ten-Year Lease: Gigawatts, Governance, and the Closing Window for Indie Posture
W27	2026-04-28 · After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next	2026-05-01 · The Week Exclusivity Died: Substrate Goes Plural, Rules Layer Becomes the Moat