27 weeks · 54 pieces · Written during the build

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I have learned while shipping DuranteOS. Every Friday, a dispatch on the week. About 108,000 words and counting, for builders who would rather watch the foundation get poured than read the press release.

Start here

If you are thinking about writing while you build

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

If you are considering the founding cohort

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

If you want the architectural spine

The Decomposition Discipline I Am Trying to Codify Inside DOS

Subscribe · Tuesday essay

About 3,800 builders read this every week.

Mar 20, 2026

Cursor Becomes a Model Lab: Composer 2 and the Unbundling at the Agent Layer

On Thursday Cursor shipped Composer 2 as its default agentic coding model — and within twenty-four hours sleuths confirmed via API headers that it is a fine-tune of Kimi K2.5. The IDE became a model lab in nine months. OpenAI made GPT-5.4 mini free on Tuesday. The DOD escalated the Anthropic case to 'unacceptable risk to national security' Wednesday, with a preliminary-injunction hearing fast-tracked to next week. Anthropic published an 81,000-person user study the same day. Perplexity made the $200/month Comet browser free on iPhone. The frontier is unbundling at the agent layer. The coding-tool moat just became a training run, not a wrapper.

Nine months ago Cursor was an IDE running on top of Claude. As of Thursday morning Cursor is also a model lab.

Composer 2, Cursor's new default agentic coding model, shipped Mar 19. It scores 61.7 on Terminal-Bench 2.0 and 73.7 on SWE-bench Multilingual. That is frontier-adjacent territory: it beats Claude Opus 4.6 on the multilingual benchmark and trails GPT-5.4 by 6-7 points. It costs $0.50 per million input tokens and $2.50 per million output tokens, roughly eighty-six percent cheaper than the Composer 1.5 it replaced. Within twenty-four hours of launch, independent observers pulled the model identifier from API headers and confirmed Composer 2 is a fine-tune of Kimi K2.5. Cursor had not disclosed the base.

That sequence is the entire story of the week, in microcosm. The unbundling happens vertically: Cursor moves down-stack into model training. The base model leaks horizontally: twenty-four hours of API traffic is enough for the ecosystem to see what is under the hood. The price collapses publicly: frontier-adjacent capability at roughly one-seventh the rate of the model it replaced, which is what eighty-six percent cheaper works out to. And the substrate vendor that used to supply the model in that seat (Anthropic) spends the same week getting escalated to "unacceptable risk to national security" by the federal government.

Five days, four labs, one signal: the coding-tool moat is now a training run, not a wrapper. Indie founders who built on top of frontier APIs as their entire product just had their architecture priced as a feature rather than a moat.

I am writing this on a Friday twenty-seven weeks into building DOS, with the Tuesday MCP-in-production essay live alongside this dispatch. The two posts read as the same argument from two angles — the Tuesday essay says the protocol boundary is what makes indie tooling vendor-portable, and the Friday dispatch records the week vendor-portability got publicly priced as the only credible architecture for tooling that does not own its own model.

The week's signal in one sentence

Coding-tool moats are now training runs, not wrappers. If your product runs on someone else's frontier API, the people you used to compete with are now training their own models — and your differentiation just became "we are the wrapper version of the same thing." Plan accordingly, or accept the commoditization curve.

The hook: Cursor's unbundling

The single most consequential thing of the five-day window landed Thursday morning.

On Thu Mar 19, Anysphere released Cursor Composer 2 (source). New default agentic coding model. Frontier-adjacent benchmarks: 61.7 on Terminal-Bench 2.0, 73.7 on SWE-bench Multilingual. Beats Claude Opus 4.6 on multilingual; trails GPT-5.4 by 6-7 points. Pricing $0.50 input / $2.50 output per million tokens — roughly eighty-six percent cheaper than Composer 1.5.
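The pricing claim can be checked with a few lines of arithmetic. The sketch below backs out what "roughly eighty-six percent cheaper" implies about the previous rate. It assumes the discount applies uniformly to input and output tokens, which the launch figures above do not state, so the implied numbers are an illustration and not Composer 1.5's published price list. The function and variable names are hypothetical.

```python
def implied_previous_price(new_price: float, fraction_cheaper: float) -> float:
    """Back out the old price from the new price and a fractional discount."""
    if not 0 <= fraction_cheaper < 1:
        raise ValueError("fraction_cheaper must be in [0, 1)")
    return new_price / (1 - fraction_cheaper)


# USD per million tokens, taken from the launch figures quoted above.
COMPOSER_2 = {"input": 0.50, "output": 2.50}

# "Roughly eighty-six percent cheaper", assumed uniform across both rates.
FRACTION_CHEAPER = 0.86

implied = {
    kind: round(implied_previous_price(price, FRACTION_CHEAPER), 2)
    for kind, price in COMPOSER_2.items()
}

# Share of the old rate that the new rate represents.
price_ratio = round(1 - FRACTION_CHEAPER, 2)

print(implied)      # {'input': 3.57, 'output': 17.86}
print(price_ratio)  # 0.14
```

Under that assumption the new rate is about 14 percent of the old one, which is the scale of drop the rest of this dispatch treats as a price collapse.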

Within twenty-four hours, ecosystem observers pulled the underlying model identifier from API response headers and confirmed Composer 2 is a fine-tune of Kimi K2.5, Moonshot's open-weights model. Cursor had not disclosed the base in the launch post. The disclosure happened anyway, in public, through the protocol's own affordances.

Read the strategic shape of this. Cursor was, nine months ago, an IDE running on top of frontier-vendor APIs. As of this week, Cursor is the IDE, the agent-runtime layer, the orchestration tier, and the fine-tuned model that runs by default for paying customers. That is vertical integration, executed in nine calendar months, with the base model selected from the open-weights tier (Kimi K2.5 is a Chinese open-weights model, from the same camp GLM-5 came from in W16).

For an indie founder building on Claude Code or any frontier-API product, the implication is sharp. The competitor whose product looked similar to yours nine months ago is now competing on a different stack. Their margins are different. Their distribution is different. Their roadmap has new options yours does not. The wrapper-tooling moat just became asymmetric, and the asymmetry runs against you.

The second-order lesson from the Composer 2 launch is the bigger one: the coding-tool space is unbundling at the model layer. The IDE used to be a delivery vehicle for someone else's frontier model. Now the IDE is the lab, with the open-weights model as starting clay.

The DOD escalation

The procurement story from W19 and W20 escalated this week.

On Wed Mar 18, the Department of Defense filed a forty-page response (source) declaring Anthropic an "unacceptable risk to national security." The filing's load-bearing argument is that Anthropic's published red lines (autonomous weapons, mass surveillance) mean the company might "disable its technology" mid-operation — a posture the DOD frames as incompatible with reliable defense procurement.

The interesting complication: OpenAI, Google, and Microsoft employees filed amicus briefs supporting Anthropic. Not the companies — the individual employees. The signal cuts across vendor lines: frontier-AI workers across multiple labs see the procurement framing as a precedent that hits all of them, not just the one being banned today. The preliminary-injunction hearing was fast-tracked to Tuesday Mar 24 — next week.

For an indie founder, the implication compounds W19's argument. The frontier vendor your product depends on is now in active legal conflict over the right to keep behavioral red lines on its model. The litigation is moving faster than the news cycle. By next Friday's dispatch, we will know whether the preliminary injunction holds — and the answer will materially shift the procurement-risk calculation for every vendor downstream of the same federal customer.

The price collapse on the small end

OpenAI did not stay quiet on capability either.

On Tue Mar 17, OpenAI shipped GPT-5.4 mini and nano (source). Mini lands on the ChatGPT free tier, so capability that previously sat behind the paywall is now the default for non-paying users. Nano stays API-only. Mini scores within striking distance of GPT-5.4 on SWE-Bench Pro and OSWorld-Verified while running inference more than twice as fast.

This is the mirror image of Cursor's move. Cursor went down-stack into model training. OpenAI went down-stack into pricing — the smaller model, frontier-adjacent on coding-agent benchmarks, costs zero for the consumer surface and pennies for the API. The two motions converge on the same outcome: the cost of competent agentic coding is collapsing toward zero from both sides simultaneously.

For an indie founder, the implication is direct. If your product's value lived in "we give you access to a smart coding agent," the floor of that value just dropped. The price arbitrage is no longer GLM-5-vs-Opus or M2.5-vs-Sonnet; the new arbitrage runs through Composer 2, GPT-5.4 mini, and the open-weights tier all at once.

The other two stories

Two more items rounded out the week.

Wed Mar 18 — Anthropic publishes 81,000-person user study (source). 81K respondents across 159 countries and 70 languages, conducted by an internal "Anthropic Interviewer" agent. Headline finding: the things people love most about AI are the same things they fear most. Beyond the qualitative data, the study itself is an artifact — Anthropic running the largest agent-conducted user research effort on record, the same week it is fighting the DOD.
Wed Mar 18 — Perplexity Comet free on iPhone (source). The $200/month browser drops its paywall on iOS. Hit #3 overall on the App Store within days. The pricing arc: Comet went from $200/month to free in nine months. The defensible price point for a single-purpose AI agent, in the consumer surface, is approaching zero.

Both items confirm the unbundling thesis at different layers. Anthropic's user study positions the company as a research peer, not a procurement vendor. Comet's zero-pricing demonstrates that distribution, not capability, is the contested territory in the consumer AI surface.

The pattern: unbundling everywhere

The five-day window in one frame

The 'frontier API as substrate' frame (closed Thursday morning)

Frontier vendor as the only credible base model
Coding-tool moats as well-engineered wrappers over frontier APIs
Pricing power on consumer agents at $20-$200/month tiers
Distribution dependent on substrate vendor's enterprise channels
Indie tooling competes on quality of integration

The 'unbundling at every layer' frame (this week's evidence)

Cursor builds its own model on Kimi K2.5 — IDE becomes lab
GPT-5.4 mini free on consumer tier — small-model pricing collapsing toward zero
Comet $200 → $0 in nine months — consumer-agent pricing power evaporating
Anthropic running massive user studies + suing the federal government — substrate vendor crossing into research and litigation simultaneously
Indie tooling competes on whichever moat survives unbundling at every layer

If you wanted evidence for the year's prevailing narrative — "the substrate is plural, the protocols are public goods, the workflow layer is the moat, the rules layer is being codified, the hyperscalers are vertically integrating, the flagship is agentic-coding-shaped, the substrate has bifurcated, agentic primitives commoditize, the moat is above the model, procurement is the new benchmark, surface area beats concentration" — this week added the twelfth refinement: unbundling is happening at every layer simultaneously. Indie founders need to pick which layer's moat they intend to compete on, because the layers below them are commoditizing faster than any single architectural commitment can keep up with.

Two angles for an indie founder

What an indie founder building on the substrate should do this week

Coding-tool moats are training runs, not wrappers — plan for transparency. Cursor went from IDE-on-top-of-Claude to fine-tuning Kimi K2.5 in-house. If you sell an agent product that uses someone else's model, expect customers to ask "whose model is this, really?" — and expect base-model attribution to leak via API headers within twenty-four hours of any meaningful launch. Two paths forward. Path one: get explicit about what you do and do not train, what you fine-tune, what you wrap, what you orchestrate. Path two: actually train your own. Either is defensible. The murky middle — wrapping someone else's model and pretending it's your own capability — is the position the ecosystem will price down hardest in 2026.
Pricing power is collapsing on consumer agents. Comet went $200 to $0 in nine months. GPT-5.4 mini is free. The defensible price point for a single-purpose AI agent on the consumer surface is approaching zero. Durable revenue lives in vertical workflows, not horizontal access. If your product's value proposition is "we wrap a smart agent for you," the floor of that proposition just dropped under your feet. Reposition toward the workflow — the operator-context, the per-vertical conventions, the integration with the customer's actual process — because that is the only piece neither substrate vendors nor the IDE-turned-labs can vertically commoditize.
Watch the Tuesday hearing. The DOD's Mar 24 preliminary-injunction hearing is the single most important data point of the next ten days for anyone building on Claude. If the injunction holds, Anthropic's federal procurement story stays viable through the trial. If it does not, the federal-procurement market closes for Anthropic indefinitely, and the W19/W20 procurement-risk argument acquires teeth across enterprise sectors that watch federal precedents closely. Either result reshapes the substrate-availability calculation for everyone downstream. Make sure you are not learning the outcome from a customer call.
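The first recommendation above, getting explicit about what you train, fine-tune, wrap, and orchestrate, can be made concrete as a small disclosure manifest. This is a minimal sketch under stated assumptions: the four relationship categories come from the text, while the class names, component names, and the placeholder base-model string are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Relationship(Enum):
    TRAINED = "trained in-house"
    FINE_TUNED = "fine-tuned from a third-party base"
    WRAPPED = "third-party model called as-is"
    ORCHESTRATED = "several models routed or chained"


@dataclass(frozen=True)
class ModelDisclosure:
    component: str                # product surface the customer sees (hypothetical)
    relationship: Relationship    # what you actually do with the model
    base_model: Optional[str]     # upstream model name; None means "not stated"


def undisclosed_components(manifest: List[ModelDisclosure]) -> List[str]:
    """Return components that rely on someone else's model without naming it."""
    return [
        entry.component
        for entry in manifest
        if entry.relationship is not Relationship.TRAINED and entry.base_model is None
    ]


# Hypothetical product with one honest entry and one murky one.
MANIFEST = [
    ModelDisclosure("code-agent", Relationship.FINE_TUNED, "example-open-weights-base"),
    ModelDisclosure("chat-sidebar", Relationship.WRAPPED, None),
]

print(undisclosed_components(MANIFEST))  # ['chat-sidebar']
```

Anything the check returns sits in the murky middle described above, and is worth disclosing before the ecosystem does it for you.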

What this changes for DOS

Two design decisions hardened this week, both of them coming out of the unbundling framing rather than any single news item.

One. Studio's Provider adapter table grows by two: one for Cursor-API-compatible Composer 2 access (since Cursor is now an upstream provider in addition to being an IDE), and one for direct Kimi K2.5 access via OpenRouter (since the base model that Composer 2 was fine-tuned from is itself a viable inner-loop call). The change costs about half a day of adapter work. The benefit is that DOS users routing through Studio can choose Composer 2 for high-cost coding tasks and Kimi K2.5 for inner-loop calls, with the capability difference between them visible to the routing logic. The unbundling at the model layer becomes routing optionality at the gateway layer.
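A rough sketch of what two extra rows in an adapter table, plus capability-aware routing, might look like. This is not Studio's actual code: the class names, gateway labels, model strings, and the ordinal `capability_tier` are hypothetical stand-ins for however the real table encodes the capability difference.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class TaskClass(Enum):
    HIGH_COST_CODING = "high_cost_coding"
    INNER_LOOP = "inner_loop"


@dataclass(frozen=True)
class ProviderAdapter:
    name: str             # key in the adapter table
    gateway: str          # where calls are sent (hypothetical label)
    model: str            # model identifier (hypothetical string)
    capability_tier: int  # hypothetical ordinal, higher means more capable


# The two new rows described above.
ADAPTERS: Dict[str, ProviderAdapter] = {
    "composer-2": ProviderAdapter("composer-2", "cursor-api", "composer-2", 2),
    "kimi-k2.5": ProviderAdapter("kimi-k2.5", "openrouter", "kimi-k2.5", 1),
}

# Minimum tier each task class is assumed to need.
MIN_TIER = {TaskClass.HIGH_COST_CODING: 2, TaskClass.INNER_LOOP: 1}


def route(task: TaskClass, adapters: Dict[str, ProviderAdapter] = ADAPTERS) -> ProviderAdapter:
    """Pick the least capable adapter that still meets the task's minimum tier."""
    eligible = [a for a in adapters.values() if a.capability_tier >= MIN_TIER[task]]
    if not eligible:
        raise LookupError(f"no adapter meets the minimum tier for {task.value}")
    return min(eligible, key=lambda a: a.capability_tier)


print(route(TaskClass.HIGH_COST_CODING).name)  # composer-2
print(route(TaskClass.INNER_LOOP).name)        # kimi-k2.5
```

Choosing the least capable eligible adapter keeps inner-loop calls on the base model by default, and adding a provider stays a one-row change to the table.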

Two. The Studio pitch deck — currently being prepared for the founding-customer conversations the W14 essay walked through — now leads with "we route across substrates including the labs, the IDEs-turned-labs, and the open-weights tier" rather than "we route across substrates." Naming Cursor as a substrate (not a competitor or an alternative product) inverts the conventional positioning and matches the actual market shape after Thursday.

That is the kind of forcing function an unbundling-everywhere week produces. The coding-tool space, the consumer-agent space, the substrate-procurement space, and the user-research space all moved at once, and the indie founder who reads them as one signal positions correctly for the next quarter.

What I am watching for next week

The thread that runs through the week

Cursor became a model lab. OpenAI made smart agentic coding free on consumer. The DOD escalated Anthropic's procurement designation. Anthropic published the largest agent-conducted user study on record. Comet zeroed out its consumer pricing. Five days. Five vectors. One pattern: the layers underneath indie tooling are commoditizing in parallel, and indie founders need to pick which layer's moat they intend to compete on, deliberately, before the layer below them prices their product down.

For an indie founder, the playbook reduces. Build to the protocol boundary (MCP, AGENTS.md, Skills) so vendor-portability is real. Choose your moat layer deliberately — workflow, memory, conventions, evaluation, sovereignty-aware routing — and double down. Do not build a wrapper whose differentiation can be replicated in nine months by the IDE that hosted you.

That is what Studio is becoming. The unbundling-everywhere week made the architecture argument operational at four layers simultaneously, and the moat — the operator loop on top — is now the only piece I am betting won't be commoditized by the layer below.

— Lucas

Sources verified the week of Mar 16-20, 2026: Cursor Composer 2 launch (Mar 19) · 9to5Mac on GPT-5.4 mini & nano (Mar 17) · TechCrunch on DOD escalation (Mar 18) · Anthropic 81k-interviews study (Mar 18) · MacRumors on Comet free on iPhone (Mar 18)


The 27-week arc · A body of work

Twenty-seven weeks. Two pieces per week. Six months of writing during the build.

Week · Tuesday essay · Friday dispatch

W01 · 2025-10-28 The Agent Is the Product: Why Intelligence Replaces Interface · 2025-10-31 Wrapper Wars: The Week Cursor, Windsurf, and MiniMax All Went Vertical
W02 · 2025-11-04 Memory Replaces Lock-In: Designing the Substrate of Personal AI · 2025-11-07 The Week the Substrate Caught Up: Kimi K2 Thinking Lands
W03 · 2025-11-11 Why I Just Left a Steady Practice to Build a Personal AI Operating System · 2025-11-14 The Week the Harness Ate the Model
W04 · 2025-11-18 The Decomposition Discipline I Am Trying to Codify Inside DOS · 2025-11-21 Antigravity, Gemini 3, Grok 4.1: The Week the Frontier Re-Rendered
W05 · 2025-11-25 Four Copies, One Source of Truth: The Sync Pattern I Want to Commit to Before It Hurts · 2025-11-28 Opus 4.5 Lands on Thanksgiving Week: Anthropic's Quiet Counter-Punch
W06 · 2025-12-02 Plugin Architecture for Hooks: The Pattern I Want Before the Hook Becomes a God-Function · 2025-12-05 The Runtime Is the Moat Now: Anthropic Buys Bun, Claude Code Hits a Billion
W07 · 2025-12-09 Credit-Metered API Gateways: The Pattern I Want Before I Have Anything to Meter · 2025-12-12 Three Frontier Models in 23 Days: Stop Picking a Winner, Start Picking a Router
W08 · 2025-12-16 Sentinel, Sketched: Convention-Driven Onboarding Before I Build It · 2025-12-19 The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint
W09 · 2025-12-23 Memory, Sketched: The Knowledge Graph I Was Designing Before MemPalace Shipped · 2025-12-26 Christmas Wasn't Quiet: Open Weights Caught Up, Nvidia Bought Inference, Frontier Labs Bought Loyalty
W10 · 2025-12-30 Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It · 2026-01-02 The Practitioners' Year-End: While the West Took PTO, DeepSeek Shipped Architecture
W11 · 2026-01-06 Altyaa's Wedge: The Brazilian-Portuguese SMB Bet Studio Is Pointed At · 2026-01-09 Default Claude: Microsoft Flips the Switch as the Substrate Distributes by Default
W12 · 2026-01-13 Builder's Compass: Two Years In, ~3,800 Subscribers, and What I've Learned Teaching Architecture · 2026-01-16 The Agent Surface Splits in Two: Anthropic Goes Vertical While China Goes Substrate
W13 · 2026-01-20 Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes · 2026-01-23 The Rules Layer Solidifies: Constitution, Hardware, and Export Controls Land in One Week
W14 · 2026-01-27 The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs · 2026-01-30 Vertical Integration Eats Horizontal AI: Maia, Meta's Capex, and Anthropic-ServiceNow
W15 · 2026-02-03 Hexagonal in Practice: The Ports and Adapters I'm Pulling Studio Toward · 2026-02-06 Coding Becomes the Flagship: Opus 4.6 and GPT-5.3-Codex Land the Same Day
W16 · 2026-02-10 TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists · 2026-02-13 The Bifurcation Hardens: Anthropic's $380B Meets China's Open-Weights Frontier
W17 · 2026-02-17 Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into · 2026-02-20 The Sonnet Window: Free-Tier Skills, Code Security, and the Race for Agentic Primitives
W18 · 2026-02-24 Skills, Packs, and Hooks: The Three-Layer Model I'm Pulling DOS Toward · 2026-02-27 The Moat Above the Model: Distillation, Mobile Coding, and MCP Battlegrounds
W19 · 2026-03-03 The Knowledge Portfolio for an Indie Founder Building in Public, Audited at Week 25 · 2026-03-06 Procurement Is the New Benchmark: OpenAI Wins the DoW While Anthropic Gets Banned
W20 · 2026-03-10 Six Months of DOS: What Changed, What Endured, What Surprised · 2026-03-13 The Counter-Sovereignty Move: Anthropic Sues, Then Diversifies
W21 · 2026-03-17 MCP in Production: What Works When the Protocol Becomes the Boundary · 2026-03-20 Cursor Becomes a Model Lab: Composer 2 and the Unbundling at the Agent Layer
W22 · 2026-03-24 Routing by Sovereignty Class: The Architecture That Survives the Procurement Decade · 2026-03-27 The Injunction the Pentagon Won't Honor: Anthropic Wins, the State Defies
W23 · 2026-03-31 Sentinel's First Scan: What the Convention Catalog Found Across Eleven Projects · 2026-04-03 The Agent Stack Swallows the IDE, the IPO, and the Courtroom
W24 · 2026-04-07 Forking MemPalace: The 48-Hour Integration Retro · 2026-04-10 Project Glasswing: When the Frontier Bifurcates on Safety
W25 · 2026-04-14 The 90-Day Open-Weights Bakeoff: What Actually Routes Where in DOS Today · 2026-04-17 Routines Eat the Workflow: The Harness Becomes the Product
W26 · 2026-04-21 The Substrate Portability Pact: What DOS Refuses to Couple to, Even as the Harness Consolidates · 2026-04-24 The Substrate Signs a Ten-Year Lease: Gigawatts, Governance, and the Closing Window for Indie Posture
W27 · 2026-04-28 After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next · 2026-05-01 The Week Exclusivity Died: Substrate Goes Plural, Rules Layer Becomes the Moat