Pular para o conteúdo
DuranteDurante
ALL SYSTEMSGet Access

27 semanas · 54 textos · Escritos durante a construção

Notas de campo de um SO de IA pessoal em voo

Toda terça-feira, um ensaio perene sobre o que aprendi enquanto envio o DuranteOS. Toda sexta, um boletim da semana. Cerca de 108 mil palavras e contando — para construtores que preferem ver a fundação ser lançada a ler o press release.

Assinar · Ensaio de terça

Cerca de 3.800 construtores leem isso toda semana.

Cursor Becomes a Model Lab: Composer 2 and the Unbundling at the Agent Layer

On Thursday Cursor shipped Composer 2 as its default agentic coding model — and within twenty-four hours sleuths confirmed via API headers that it is a fine-tune of Kimi K2.5. The IDE became a model lab in nine months. OpenAI made GPT-5.4 mini free on Tuesday. The DOD escalated the Anthropic case to 'unacceptable risk to national security' Wednesday, with a preliminary-injunction hearing fast-tracked to next week. Anthropic published an 81,000-person user study the same day. Perplexity made the $200/month Comet browser free on iPhone. The frontier is unbundling at the agent layer. The coding-tool moat just became a training run, not a wrapper.

Two months ago Cursor was an IDE running on top of Claude. As of Thursday morning Cursor is also a model lab.

Composer 2, Cursor's new default agentic coding model, shipped Mar 19. It scores 61.7 on Terminal-Bench 2.0 and 73.7 on SWE-bench Multilingual — frontier-adjacent territory, beats Claude Opus 4.6, trails GPT-5.4 by a few points. It costs $0.50 per million input tokens and $2.50 per million output — roughly eighty-six percent cheaper than the Composer 1.5 it replaced. Within twenty-four hours of launch, independent observers pulled the model identifier from API headers and confirmed Composer 2 is a fine-tune of Kimi K2.5. Cursor had not disclosed the base.

That sequence is the entire story of the week, in microcosm. The unbundling happens vertically — Cursor moves down-stack into model training. The base model leaks horizontally — twenty-four hours of API traffic is enough for the ecosystem to see what is under the hood. The price collapses publicly — frontier-adjacent capability at one-fifth the rate. And the substrate vendor whose seat was being supplied here (Anthropic) spends the same week getting escalated to "unacceptable risk to national security" by the federal government.

Five days, four labs, one signal: the coding-tool moat is now a training run, not a wrapper. Indie founders who built on top of frontier APIs as their entire product just had their architecture priced as a feature rather than a moat.

I am writing this on a Friday twenty-seven weeks into building DOS, with the Tuesday MCP-in-production essay live alongside this dispatch. The two posts read as the same argument from two angles — the Tuesday essay says the protocol boundary is what makes indie tooling vendor-portable, and the Friday dispatch records the week vendor-portability got publicly priced as the only credible architecture for tooling that does not own its own model.

The week's signal in one sentence

Coding-tool moats are now training runs, not wrappers. If your product runs on someone else's frontier API, the people you used to compete with are now training their own models — and your differentiation just became "we are the wrapper version of the same thing." Plan accordingly, or accept the commoditization curve.

The hook: Cursor's unbundling

The single most consequential thing of the five-day window landed Thursday morning.

On Thu Mar 19, Anysphere released Cursor Composer 2 (source). New default agentic coding model. Frontier-adjacent benchmarks: 61.7 on Terminal-Bench 2.0, 73.7 on SWE-bench Multilingual. Beats Claude Opus 4.6 on multilingual; trails GPT-5.4 by 6-7 points. Pricing $0.50 input / $2.50 output per million tokens — roughly eighty-six percent cheaper than Composer 1.5.

Within twenty-four hours, ecosystem observers pulled the underlying model identifier from API response headers and confirmed Composer 2 is a fine-tune of Kimi K2.5 — Moonshot's open-weights model. Cursor had not disclosed the base in the launch post. The disclosure happened anyway, in public, by the protocol's own affordances.

Read the strategic shape of this. Cursor was, two months ago, an IDE running on top of frontier-vendor APIs. As of this week, Cursor is the IDE, the agent-runtime layer, the orchestration tier, and the fine-tuned model that runs by default for paying customers. Vertical integration, executed in nine calendar months, with the base model selected from the open-weights tier (Kimi K2.5 — Chinese open-weights, the same camp GLM-5 came from in W16).

For an indie founder building on Claude Code or any frontier-API product, the implication is sharp. The competitor whose product looked similar to yours two months ago is now competing on a different stack. Their margins are different. Their distribution is different. Their roadmap has new options yours does not. The wrapper-tooling moat just became asymmetric — and the asymmetry runs against you.

The two-bit lesson from the Composer 2 launch is the bigger one: the coding-tool space is unbundling at the model layer. The IDE used to be a delivery vehicle for someone else's frontier model. Now the IDE is the lab, with the open-weights model as starting clay.

The DOD escalation

The procurement story from W19 and W20 escalated this week.

On Wed Mar 18, the Department of Defense filed a forty-page response (source) declaring Anthropic an "unacceptable risk to national security." The filing's load-bearing argument is that Anthropic's published red lines (autonomous weapons, mass surveillance) mean the company might "disable its technology" mid-operation — a posture the DOD frames as incompatible with reliable defense procurement.

The interesting complication: OpenAI, Google, and Microsoft employees filed amicus briefs supporting Anthropic. Not the companies — the individual employees. The signal cuts across vendor lines: frontier-AI workers across multiple labs see the procurement framing as a precedent that hits all of them, not just the one being banned today. The preliminary-injunction hearing was fast-tracked to Tuesday Mar 24 — next week.

For an indie founder, the implication compounds W19's argument. The frontier vendor your product depends on is now in active legal conflict over the right to keep behavioral red lines on its model. The litigation is moving faster than the news cycle. By next Friday's dispatch, we will know whether the preliminary injunction holds — and the answer will materially shift the procurement-risk calculation for every vendor downstream of the same federal customer.

The price collapse on the small end

OpenAI did not stay quiet on capability either.

On Tue Mar 17, OpenAI shipped GPT-5.4 mini and nano (source). Mini hits the ChatGPT free tier — the previously-paywalled smarter-tier capability is now default for non-paying users. Nano stays API-only. The Mini scores within striking distance of GPT-5.4 on SWE-Bench Pro and OSWorld-Verified at more than two-times faster inference.

This is the mirror image of Cursor's move. Cursor went down-stack into model training. OpenAI went down-stack into pricing — the smaller model, frontier-adjacent on coding-agent benchmarks, costs zero for the consumer surface and pennies for the API. The two motions converge on the same outcome: the cost of competent agentic coding is collapsing toward zero from both sides simultaneously.

For an indie founder, the implication is direct. If your product's value lived in "we give you access to a smart coding agent," the floor of that value just dropped. The price arbitrage is no longer GLM-5-vs-Opus or M2.5-vs-Sonnet; the new arbitrage runs through Composer 2, GPT-5.4 mini, and the open-weights tier all at once.

The other two stories

Two more items rounded out the week.

  • Wed Mar 18 — Anthropic publishes 81,000-person user study (source). 81K respondents across 159 countries and 70 languages, conducted by an internal "Anthropic Interviewer" agent. Headline finding: the things people love most about AI are the same things they fear most. Beyond the qualitative data, the study itself is an artifact — Anthropic running the largest agent-conducted user research effort on record, the same week it is fighting the DOD.

  • Wed Mar 18 — Perplexity Comet free on iPhone (source). The $200/month browser drops its paywall on iOS. Hit #3 overall on the App Store within days. The pricing arc: Comet went from $200/month to free in nine months. The defensible price point for a single-purpose AI agent, in the consumer surface, is approaching zero.

Both items confirm the unbundling thesis at different layers. Anthropic's user study positions the company as a research peer, not a procurement vendor. Comet's zero-pricing demonstrates that distribution, not capability, is the contested territory in the consumer AI surface.

The pattern: unbundling everywhere

The five-day window in one frame

The 'frontier API as substrate' frame (closed Thursday morning)
  • Frontier vendor as the only credible base model
  • Coding-tool moats as well-engineered wrappers over frontier APIs
  • Pricing power on consumer agents at $20-$200/month tiers
  • Distribution dependent on substrate vendor's enterprise channels
  • Indie tooling competes on quality of integration
The 'unbundling at every layer' frame (this week's evidence)
  • Cursor builds its own model on Kimi K2.5 — IDE becomes lab
  • GPT-5.4 mini free on consumer tier — small-model pricing collapsing toward zero
  • Comet $200 → $0 in nine months — consumer-agent pricing power evaporating
  • Anthropic running massive user studies + suing the federal government — substrate vendor crossing into research and litigation simultaneously
  • Indie tooling competes on whichever moat survives unbundling at every layer

If you wanted evidence for the year's prevailing narrative — "the substrate is plural, the protocols are public goods, the workflow layer is the moat, the rules layer is being codified, the hyperscalers are vertically integrating, the flagship is agentic-coding-shaped, the substrate has bifurcated, agentic primitives commoditize, the moat is above the model, procurement is the new benchmark, surface area beats concentration" — this week added the twelfth refinement: unbundling is happening at every layer simultaneously. Indie founders need to pick which layer's moat they intend to compete on, because the layers below them are commoditizing faster than any single architectural commitment can keep up with.

Two angles for an indie founder

What an indie founder building on the substrate should do this week

  1. Coding-tool moats are training runs, not wrappers — plan for transparency. Cursor went from IDE-on-top-of-Claude to fine-tuning Kimi K2.5 in-house. If you sell an agent product that uses someone else's model, expect customers to ask "whose model is this, really?" — and expect base-model attribution to leak via API headers within twenty-four hours of any meaningful launch. Two paths forward. Path one: get explicit about what you do and do not train, what you fine-tune, what you wrap, what you orchestrate. Path two: actually train your own. Either is defensible. The murky middle — wrapping someone else's model and pretending it's your own capability — is the position the ecosystem will price down hardest in 2026.
  2. Pricing power is collapsing on consumer agents. Comet went $200 to $0 in nine months. GPT-5.4 mini is free. The defensible price point for a single-purpose AI agent on the consumer surface is approaching zero. Durable revenue lives in vertical workflows, not horizontal access. If your product's value proposition is "we wrap a smart agent for you," the floor of that proposition just dropped under your feet. Reposition toward the workflow — the operator-context, the per-vertical conventions, the integration with the customer's actual process — because that is the only piece neither substrate vendors nor the IDE-turned-labs can vertically commoditize.
  3. Watch the Tuesday hearing. The DOD's Mar 24 preliminary-injunction hearing is the single most important data point of the next ten days for anyone building on Claude. If the injunction holds, Anthropic's federal procurement story stays viable through the trial. If it does not, the federal-procurement market closes for Anthropic indefinitely, and the W19/W20 procurement-risk argument acquires teeth across enterprise sectors that watch federal precedents closely. Either result reshapes the substrate-availability calculation for everyone downstream. Make sure you are not learning the outcome from a customer call.

What this changes for DOS

Two design decisions hardened this week, both of them coming out of the unbundling framing rather than any single news item.

One. Studio's Provider adapter table grows by two: one for Cursor-API-compatible Composer 2 access (since Cursor is now an upstream provider in addition to being an IDE), and one for direct Kimi K2.5 access via OpenRouter (since the base model that Composer 2 fine-tuned from is itself a viable inner-loop call). The architecture costs about half a day of adapter work. The benefit is that DOS users routing through Studio can choose Composer 2 for high-cost coding tasks and Kimi K2.5 for inner-loop calls, with the capability-difference between them visible to the routing logic. The unbundling at the model layer becomes routing optionality at the gateway layer.

Two. The Studio pitch deck — currently being prepared for the founding-customer conversations the W14 essay walked through — now leads with "we route across substrates including the labs, the IDEs-turned-labs, and the open-weights tier" rather than "we route across substrates." Naming Cursor as a substrate (not a competitor or an alternative product) inverts the conventional positioning and matches the actual market shape after Thursday.

That is the kind of forcing function the unbundling-everywhere week is structurally designed to produce. The coding-tool space, the consumer-agent space, the substrate-procurement space, and the user-research space all moved at once — and the indie founder who reads them as one signal positions correctly for the next quarter.

What I am watching for next week

The thread that runs through the week

Cursor became a model lab. OpenAI made smart agentic coding free on consumer. The DOD escalated Anthropic's procurement designation. Anthropic published the largest agent-conducted user study on record. Comet zeroed out its consumer pricing. Five days. Five vectors. One pattern: the layers underneath indie tooling are commoditizing in parallel, and indie founders need to pick which layer's moat they intend to compete on, deliberately, before the layer below them prices their product down.

For an indie founder, the playbook reduces. Build to the protocol boundary (MCP, AGENTS.md, Skills) so vendor-portability is real. Choose your moat layer deliberately — workflow, memory, conventions, evaluation, sovereignty-aware routing — and double down. Do not build a wrapper whose differentiation can be replicated in nine months by the IDE that hosted you.

That is what Studio is becoming. The unbundling-everywhere week made the architecture argument operational at four layers simultaneously, and the moat — the operator loop on top — is now the only piece I am betting won't be commoditized by the layer below.

— Lucas


Sources verified the week of Mar 16-20, 2026: Cursor Composer 2 launch (Mar 19) · 9to5Mac on GPT-5.4 mini & nano (Mar 17) · TechCrunch on DOD escalation (Mar 18) · Anthropic 81k-interviews study (Mar 18) · MacRumors on Comet free on iPhone (Mar 18)

Was this page helpful?

O arco de 27 semanas · Um corpo de trabalho

Vinte e sete semanas. Dois textos por semana. Seis meses de escrita durante a construção.

Semana

Ensaio de terça

Boletim de sexta