27 semanas · 54 textos · Escritos durante a construção

Notas de campo de um SO de IA pessoal em voo

Toda terça-feira, um ensaio perene sobre o que aprendi enquanto envio o DuranteOS. Toda sexta, um boletim da semana. Cerca de 108 mil palavras e contando — para construtores que preferem ver a fundação ser lançada a ler o press release.

Comece por aqui

Se você está pensando em escrever enquanto constrói

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

Se você está considerando o grupo fundador

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

Se você quer a espinha arquitetônica

The Decomposition Discipline I Am Trying to Codify Inside DOS

Assinar · Ensaio de terça

Cerca de 3.800 construtores leem isso toda semana.

Nov 21, 2025

Antigravity, Gemini 3, Grok 4.1: The Week the Frontier Re-Rendered

In four days, the agentic-coding stack acquired two new harnesses (Antigravity, Grok Agent Tools) and one frontier-tier challenger (Gemini 3 Pro at half Sonnet's price). Anthropic answered not with a model but with $30B of Azure compute. The model layer is converging on parity. The differentiator is moving — visibly, this week — to the harness, the tool surface, and the compute contract.

If you blinked between Monday and Thursday this week, you missed an entire competitive re-rendering of the AI substrate.

On Monday Nov 17, xAI shipped Grok 4.1, taking #1 on LMArena Text Arena (1483 Elo) and beating its predecessor 64.78% in blind preference tests (xAI announcement). On Tuesday Nov 18, Google shipped Gemini 3 Pro simultaneously into Search, the Gemini app, AI Studio, Vertex AI, the Gemini CLI, and a new agent IDE called Antigravity — Google's first true "everywhere on day one" model launch (Google blog, Antigravity launch). Same day, Microsoft, NVIDIA, and Anthropic announced a strategic partnership in which Anthropic commits $30B of Azure spend, NVIDIA invests up to $10B, and Microsoft up to $5B — locking Claude as the only frontier model on AWS + Azure + GCP (Microsoft announcement). On Wednesday Nov 19, xAI followed with Grok 4.1 Fast plus an Agent Tools API — 2M-token context, server-side tools (web, X search, code exec, doc retrieval), aggressive pricing (xAI Fast announcement).

Four days. Two new agent harnesses. One frontier-tier challenger at half Sonnet's price. One $45B compute deal. The "wait for Anthropic to ship Opus 4.5" story everyone was tracking became, in retrospect, the least interesting story of the week.

The week's signal in one sentence

The model layer is converging on parity. The differentiator is moving — visibly, this week — to the harness layer, the tool surface, and the compute contract.

Antigravity is the warning shot

Of everything that happened this week, Google's Antigravity matters most for anyone building on top of frontier models.

Antigravity is a fork of VS Code optimized for agent-first workflows. What makes it strategically interesting is not the IDE itself — Cursor and Windsurf already exist — but that it ships with day-one support for Claude Sonnet 4.5 and OpenAI models alongside Gemini 3 Pro. Google is publicly saying: we will compete on the agent runtime, not on locking you to our model.

That is a structural admission identical to the one OpenAI made last week with GPT-5.1's apply_patch / shell tools and the named ecosystem co-developers. Two of the three frontier labs have now publicly admitted, in product launches, that the harness is the surface area they are competing on.

Lab	Move this week	Strategic admission
Google	Antigravity ships with Claude + OpenAI as peer models	"The harness is more important than the model exclusivity"
xAI	Grok 4.1 Fast + Agent Tools API at 2M context	"Long-horizon agents need server-side tools and we will price aggressively to win them"
Anthropic	$30B Azure deal, no new model	"We will compete on guaranteed compute and tri-cloud distribution, not weekly model drops"
OpenAI	(no shipped news in window — still riding GPT-5.1 launch from prior week)	(silent, by inference)

Antigravity is the warning shot specifically because Google had every incentive to make it Gemini-only. They chose multi-model. The reading is simple: even Google does not believe the model is the moat.

Gemini 3 Pro lands cheaper than Sonnet, matching it on coding

The numbers Google shipped Tuesday:

#1 on LMArena at 1501 Elo (above Grok 4.1's Monday score)
76.2% on SWE-bench Verified (matching Sonnet 4.5)
$2/$12 per million tokens in/out under 200k context — undercutting Sonnet meaningfully

Simon Willison's same-day review is the most useful senior-engineer take I have read on it: "Gemini 2.5 upgraded to match the leading rival models." He transcribed a 3.5-hour council meeting for $1.42 — but with timestamp drift across the long context. A calibration shot for indie founders breathlessly retweeting benchmarks. The model is good. It is not magic.

For an indie founder building a personal AI OS, the price point matters more than the benchmark. Sonnet 4.5 has been the default substrate for serious agentic work since September. Gemini 3 Pro just made the pricing math real for "what if I run my agentic loop on Gemini for the parts where Claude's strengths are not load-bearing?"

That question did not have a serious answer last week. It does this week.

Anthropic's quiet move was the most strategic

Anthropic shipped no new model in the Nov 17-20 window. Opus 4.5 did not land until Nov 24, four days outside the window. What Anthropic shipped instead was the $30B Azure / Microsoft / NVIDIA partnership.

That is not a model launch. It is a distribution and compute lock-in play that compounds for the next 24+ months.

What the Microsoft + NVIDIA + Anthropic deal actually does

Tri-cloud availability. Claude is now the only frontier model available on AWS, Azure, and GCP simultaneously. Enterprise procurement teams that were forced to pick one cloud just lost that constraint. That alone is worth multi-billions in expansion revenue.
Compute floor through 2027. $30B of Azure spend + 1GW compute commitment + $10B from NVIDIA + $5B from MSFT means Anthropic has guaranteed inference capacity through the GPT-6 era. No cloud-driven supply crunch can starve them.
De-risks the Claude bet. For an indie founder building on Claude Code (which is where I sit), this is the most important news of the week. The substrate I am committing to has just secured itself for the duration of my 12-month runway and beyond. I do not have to factor "what if Anthropic loses cloud capacity in mid-2026" into my planning anymore.

The deal will not generate headlines for another six months. It is, structurally, the more important news of the week.

What I am watching for the next week

What this changes for DOS

For the personal AI OS I am ten weeks into building, the week confirmed three planning assumptions:

Substrate-portability is now a forced architectural choice, not a hopeful one. Antigravity's multi-model default is the same admission GPT-5.1's named co-developers made last week. Build the harness; let the model float underneath.
The Anthropic bet is now safer than it was on Monday. The Azure deal is the most underrated news of the week. For DOS specifically — built on Claude Code — the substrate just secured itself through 2027.
Pricing pressure on inference is real and accelerating. Gemini 3 Pro at half Sonnet's price for matching coding capability means the credit-metered gateway architecture I am about to build (routing inference across providers based on cost + capability) is going to matter sooner than I thought.

That last one shifts a build priority. The gateway was on my Q1 2026 roadmap. After this week I am moving it forward.

— Lucas

Sources verified the week of Nov 17-20, 2025: xAI Grok 4.1 launch (Nov 17) · Google Gemini 3 launch (Nov 18) · Antigravity announcement (Nov 18) · CNBC on Gemini 3 (Nov 18) · Microsoft + NVIDIA + Anthropic partnership (Nov 18) · Simon Willison on Gemini 3 (Nov 18) · xAI Grok 4.1 Fast + Agent Tools (Nov 19)

Was this page helpful?

O arco de 27 semanas · Um corpo de trabalho

Vinte e sete semanas. Dois textos por semana. Seis meses de escrita durante a construção.

Semana

Ensaio de terça

Boletim de sexta

W01

2025-10-28

The Agent Is the Product: Why Intelligence Replaces Interface

2025-10-31

Wrapper Wars: The Week Cursor, Windsurf, and MiniMax All Went Vertical

W02

2025-11-04

Memory Replaces Lock-In: Designing the Substrate of Personal AI

2025-11-07

The Week the Substrate Caught Up: Kimi K2 Thinking Lands

W03

2025-11-11

Why I Just Left a Steady Practice to Build a Personal AI Operating System

2025-11-14

The Week the Harness Ate the Model

W04

2025-11-18

The Decomposition Discipline I Am Trying to Codify Inside DOS

2025-11-21

Antigravity, Gemini 3, Grok 4.1: The Week the Frontier Re-Rendered

W05

2025-11-25

Four Copies, One Source of Truth: The Sync Pattern I Want to Commit to Before It Hurts

2025-11-28

Opus 4.5 Lands on Thanksgiving Week: Anthropic's Quiet Counter-Punch

W06

2025-12-02

Plugin Architecture for Hooks: The Pattern I Want Before the Hook Becomes a God-Function

2025-12-05

The Runtime Is the Moat Now: Anthropic Buys Bun, Claude Code Hits a Billion

W07

2025-12-09

Credit-Metered API Gateways: The Pattern I Want Before I Have Anything to Meter

2025-12-12

Three Frontier Models in 23 Days: Stop Picking a Winner, Start Picking a Router

W08

2025-12-16

Sentinel, Sketched: Convention-Driven Onboarding Before I Build It

2025-12-19

The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint

W09

2025-12-23

Memory, Sketched: The Knowledge Graph I Was Designing Before MemPalace Shipped

2025-12-26

Christmas Wasn't Quiet: Open Weights Caught Up, Nvidia Bought Inference, Frontier Labs Bought Loyalty

W10

2025-12-30

Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It

2026-01-02

The Practitioners' Year-End: While the West Took PTO, DeepSeek Shipped Architecture

W11

2026-01-06

Altyaa's Wedge: The Brazilian-Portuguese SMB Bet Studio Is Pointed At

2026-01-09

Default Claude: Microsoft Flips the Switch as the Substrate Distributes by Default

W12

2026-01-13

Builder's Compass: Two Years In, ~3,800 Subscribers, and What I've Learned Teaching Architecture

2026-01-16

The Agent Surface Splits in Two: Anthropic Goes Vertical While China Goes Substrate

W13

2026-01-20

Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes

2026-01-23

The Rules Layer Solidifies: Constitution, Hardware, and Export Controls Land in One Week

W14

2026-01-27

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

2026-01-30

Vertical Integration Eats Horizontal AI: Maia, Meta's Capex, and Anthropic-ServiceNow

W15

2026-02-03

Hexagonal in Practice: The Ports and Adapters I'm Pulling Studio Toward

2026-02-06

Coding Becomes the Flagship: Opus 4.6 and GPT-5.3-Codex Land the Same Day

W16

2026-02-10

TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists

2026-02-13

The Bifurcation Hardens: Anthropic's $380B Meets China's Open-Weights Frontier

W17

2026-02-17

Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into

2026-02-20

The Sonnet Window: Free-Tier Skills, Code Security, and the Race for Agentic Primitives

W18

2026-02-24

Skills, Packs, and Hooks: The Three-Layer Model I'm Pulling DOS Toward

2026-02-27

The Moat Above the Model: Distillation, Mobile Coding, and MCP Battlegrounds

W19

2026-03-03

The Knowledge Portfolio for an Indie Founder Building in Public, Audited at Week 25

2026-03-06

Procurement Is the New Benchmark: OpenAI Wins the DoW While Anthropic Gets Banned

W20

2026-03-10

Six Months of DOS: What Changed, What Endured, What Surprised

2026-03-13

The Counter-Sovereignty Move: Anthropic Sues, Then Diversifies

W21

2026-03-17

MCP in Production: What Works When the Protocol Becomes the Boundary

2026-03-20

Cursor Becomes a Model Lab: Composer 2 and the Unbundling at the Agent Layer

W22

2026-03-24

Routing by Sovereignty Class: The Architecture That Survives the Procurement Decade

2026-03-27

The Injunction the Pentagon Won't Honor: Anthropic Wins, the State Defies

W23

2026-03-31

Sentinel's First Scan: What the Convention Catalog Found Across Eleven Projects

2026-04-03

The Agent Stack Swallows the IDE, the IPO, and the Courtroom

W24

2026-04-07

Forking MemPalace: The 48-Hour Integration Retro

2026-04-10

Project Glasswing: When the Frontier Bifurcates on Safety

W25

2026-04-14

The 90-Day Open-Weights Bakeoff: What Actually Routes Where in DOS Today

2026-04-17

Routines Eat the Workflow: The Harness Becomes the Product

W26

2026-04-21

The Substrate Portability Pact: What DOS Refuses to Couple to, Even as the Harness Consolidates

2026-04-24

The Substrate Signs a Ten-Year Lease: Gigawatts, Governance, and the Closing Window for Indie Posture

W27

2026-04-28

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

2026-05-01

The Week Exclusivity Died: Substrate Goes Plural, Rules Layer Becomes the Moat