27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.

Start here

If you're thinking about writing while building

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

If you're considering the founding cohort

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

If you want the architectural spine

The Decomposition Discipline I Am Trying to Codify Inside DOS

Subscribe · Tuesday essay

Around 3,800 builders read this weekly.

Nov 28, 2025

Opus 4.5 Lands on Thanksgiving Week: Anthropic's Quiet Counter-Punch

On Monday Nov 24, Anthropic shipped Claude Opus 4.5 with a 67% input-token price cut, an effort parameter, and Plan Mode for Claude Code. On Tuesday, PromptArmor demonstrated that Google's Antigravity will exfiltrate AWS credentials via prompt injection. On Thanksgiving Day, two open-weights heavyweights dropped while the US news cycle slept. Here is the week, ordered by what actually changes for an indie founder.

Anthropic broke its silence on Monday and the rest of the week was the aftermath.

On Monday November 24, Anthropic released Claude Opus 4.5 (anthropic.com/news/claude-opus-4-5). The headline numbers:

Dimension	Opus 4.1 (prior)	Opus 4.5 (now)	Delta
Input price (per M tokens)	$15	$5	-67%
Output price (per M tokens)	$75	$25	-67%
`effort` parameter (low/medium/high)	—	shipped	new primitive
Context compaction for long agentic runs	—	shipped	new primitive
Claude Code "Plan Mode" with editable plan files	—	shipped	new feature

The price cut alone resets the unit economics of "Opus-class as the daily driver" for any always-on agent loop. Anthropic also paired it with internal benchmarks claiming roughly 76% fewer output tokens for equivalent work — meaning the effective cost reduction is steeper than the headline percentage. That has been the gating factor on running Opus-tier reasoning continuously inside an agentic harness for the entire year.

Simon Willison's same-day take is the honest counterweight (simonwillison.net/2025/Nov/24/claude-opus): the new normal is that frontier capability gaps are now genuinely hard to detect on real work, and labs should publish "tasks that failed before, succeed now" instead of single-digit benchmark deltas. He is not wrong. But for the unit-economics question, Monday's news was the most consequential of the month.

The week's signal in one sentence

The Opus tax just got cut by two-thirds. The next bottleneck for personal-AI infrastructure is no longer model cost — it is agentic safety, which Tuesday's Antigravity exploit made impossible to ignore.

The Antigravity exploit reframes last week's launch

On Tuesday November 25, PromptArmor published a working indirect-prompt-injection chain against Google's Antigravity — the agent IDE that landed last week (source).

The chain: feed the agent a malicious comment in a tracked file → agent invokes the file-reading tool → bypasses .gitignore → cats the local .env → encodes the secret values into a URL on webhook.site (which is on the default allowlist) → ships the credentials out.

Google's response, as of Wednesday, was a disclaimer rather than a fix. That is a meaningful posture choice. It says "the user is responsible for vetting tool surfaces" rather than "the agent should not be able to do this even when prompted."

For anyone building on top of agentic IDEs — or, for what I am building, designing a personal AI OS with autonomous tool surfaces — this is a serious data point. The Antigravity launch last week looked like vertical integration. The exploit this week looks like a sandboxing problem the category has not yet collectively solved.

What shipped on Thanksgiving Day

While the US news cycle was eating turkey, two open-weights heavyweights dropped:

DeepSeek-Math-V2 (Hugging Face, WinBuzzer coverage) — first open-weights model to match o1/Gemini at IMO 2025 (5/6 problems, 35/42), with self-verifiable reasoning. Putnam 118/120.
Qwen3-VL technical report (arXiv) — vision-language model with near-perfect long-context video accuracy.

Two open-weights heavyweights in 24 hours, on a US holiday, with almost no English-language coverage. The pattern from the Kimi K2 Thinking dispatch three weeks ago is now established: the open-weights tier ships at frontier-adjacent quality on a cadence that is, structurally, faster than the proprietary-lab cadence — partly because the open-weights labs are not coordinating around US press calendars.

How an indie founder should read the week

Three observations, ordered by how much they actually change next week's planning.

What changed this week

Pre-Nov-24 assumption

Opus-class inference is a premium SKU you escalate to
Sonnet is the daily driver; Opus is the consultant
Antigravity is a serious threat to Cursor/Windsurf
Open-weights are interesting but six weeks behind

Post-Nov-27 assumption

Opus-class inference is now economically defensible as daily driver for agentic loops
The Sonnet/Opus split is a planning question for cost-sensitive workflows, not a defaulting one
Antigravity is a category threat (vertical IDEs) but with a documented sandboxing gap that competitors will exploit
Open-weights ship faster than proprietary on cadence terms; the technical gap continues to narrow

For DOS specifically — the personal AI OS I am eleven weeks into building — this week shifted two of my planning assumptions:

The two design choices the week pressure-tested

Substrate routing should default to Opus, not Sonnet, for any task tagged "decision-grade." Pre-Monday, the cost math made Sonnet the default and Opus the escalation. After Monday, that defaults flips for any work the operator considers decision-grade. The credit-metered routing layer I am about to build needs to have the Opus default baked in from day one rather than retrofitted as a "premium tier" later.
Tool sandboxing has to be a first-class architectural concern, not a Phase-2 feature. The Antigravity exploit is the proof. Whatever harness I build, the default tool allowlist and file-system access policy need to be tight from the first version. I had been planning to start permissive and tighten later. That ordering is wrong.

What I am watching for next week

The Anthropic-shaped admission

Three months ago, the consensus narrative on Anthropic was "best models, smallest moat." This week's Opus 4.5 price cut is, I think, Anthropic answering that narrative: the moat is not the model price tier; the moat is the harness ecosystem already built around Claude Code and the compute floor secured by the Microsoft / NVIDIA / Azure deal from the prior week.

Cutting the input price by 67% costs Anthropic margin. Costing them margin to keep developers from switching to GPT-5.1's apply_patch / shell tools or Gemini 3 / Antigravity is a reasonable trade if the harness ecosystem is the moat they are defending.

For an indie founder building on Claude Code, this week made the substrate bet safer in two ways: cheaper inference and a clearer signal that Anthropic will compete on retaining the developer ecosystem, not just on shipping the next model. Both directly de-risk a 12-month bet.

— Lucas

Sources verified the week of Nov 24-27, 2025: Anthropic Opus 4.5 announcement (Nov 24) · MacRumors coverage (Nov 24) · Simon Willison on Opus 4.5 (Nov 24) · Opus 4.5 system prompt analysis (Nov 24) · PromptArmor: Antigravity exfiltration (Nov 25) · DeepSeek-Math-V2 (Nov 27) · WinBuzzer on DeepSeek-Math-V2 IMO gold (Nov 27) · Qwen3-VL technical report (Nov 27)

Was this page helpful?

The 27-week arc · A single body of work

Twenty-seven weeks. Two posts a week. Six months of writing while building.

Week

Tuesday evergreen

Friday dispatch

W01

2025-10-28

The Agent Is the Product: Why Intelligence Replaces Interface

2025-10-31

Wrapper Wars: The Week Cursor, Windsurf, and MiniMax All Went Vertical

W02

2025-11-04

Memory Replaces Lock-In: Designing the Substrate of Personal AI

2025-11-07

The Week the Substrate Caught Up: Kimi K2 Thinking Lands

W03

2025-11-11

Why I Just Left a Steady Practice to Build a Personal AI Operating System

2025-11-14

The Week the Harness Ate the Model

W04

2025-11-18

The Decomposition Discipline I Am Trying to Codify Inside DOS

2025-11-21

Antigravity, Gemini 3, Grok 4.1: The Week the Frontier Re-Rendered

W05

2025-11-25

Four Copies, One Source of Truth: The Sync Pattern I Want to Commit to Before It Hurts

2025-11-28

Opus 4.5 Lands on Thanksgiving Week: Anthropic's Quiet Counter-Punch

W06

2025-12-02

Plugin Architecture for Hooks: The Pattern I Want Before the Hook Becomes a God-Function

2025-12-05

The Runtime Is the Moat Now: Anthropic Buys Bun, Claude Code Hits a Billion

W07

2025-12-09

Credit-Metered API Gateways: The Pattern I Want Before I Have Anything to Meter

2025-12-12

Three Frontier Models in 23 Days: Stop Picking a Winner, Start Picking a Router

W08

2025-12-16

Sentinel, Sketched: Convention-Driven Onboarding Before I Build It

2025-12-19

The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint

W09

2025-12-23

Memory, Sketched: The Knowledge Graph I Was Designing Before MemPalace Shipped

2025-12-26

Christmas Wasn't Quiet: Open Weights Caught Up, Nvidia Bought Inference, Frontier Labs Bought Loyalty

W10

2025-12-30

Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It

2026-01-02

The Practitioners' Year-End: While the West Took PTO, DeepSeek Shipped Architecture

W11

2026-01-06

Altyaa's Wedge: The Brazilian-Portuguese SMB Bet Studio Is Pointed At

2026-01-09

Default Claude: Microsoft Flips the Switch as the Substrate Distributes by Default

W12

2026-01-13

Builder's Compass: Two Years In, ~3,800 Subscribers, and What I've Learned Teaching Architecture

2026-01-16

The Agent Surface Splits in Two: Anthropic Goes Vertical While China Goes Substrate

W13

2026-01-20

Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes

2026-01-23

The Rules Layer Solidifies: Constitution, Hardware, and Export Controls Land in One Week

W14

2026-01-27

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

2026-01-30

Vertical Integration Eats Horizontal AI: Maia, Meta's Capex, and Anthropic-ServiceNow

W15

2026-02-03

Hexagonal in Practice: The Ports and Adapters I'm Pulling Studio Toward

2026-02-06

Coding Becomes the Flagship: Opus 4.6 and GPT-5.3-Codex Land the Same Day

W16

2026-02-10

TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists

2026-02-13

The Bifurcation Hardens: Anthropic's $380B Meets China's Open-Weights Frontier

W17

2026-02-17

Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into

2026-02-20

The Sonnet Window: Free-Tier Skills, Code Security, and the Race for Agentic Primitives

W18

2026-02-24

Skills, Packs, and Hooks: The Three-Layer Model I'm Pulling DOS Toward

2026-02-27

The Moat Above the Model: Distillation, Mobile Coding, and MCP Battlegrounds

W19

2026-03-03

The Knowledge Portfolio for an Indie Founder Building in Public, Audited at Week 25

2026-03-06

Procurement Is the New Benchmark: OpenAI Wins the DoW While Anthropic Gets Banned

W20

2026-03-10

Six Months of DOS: What Changed, What Endured, What Surprised

2026-03-13

The Counter-Sovereignty Move: Anthropic Sues, Then Diversifies

W21

2026-03-17

MCP in Production: What Works When the Protocol Becomes the Boundary

2026-03-20

Cursor Becomes a Model Lab: Composer 2 and the Unbundling at the Agent Layer

W22

2026-03-24

Routing by Sovereignty Class: The Architecture That Survives the Procurement Decade

2026-03-27

The Injunction the Pentagon Won't Honor: Anthropic Wins, the State Defies

W23

2026-03-31

Sentinel's First Scan: What the Convention Catalog Found Across Eleven Projects

2026-04-03

The Agent Stack Swallows the IDE, the IPO, and the Courtroom

W24

2026-04-07

Forking MemPalace: The 48-Hour Integration Retro

2026-04-10

Project Glasswing: When the Frontier Bifurcates on Safety

W25

2026-04-14

The 90-Day Open-Weights Bakeoff: What Actually Routes Where in DOS Today

2026-04-17

Routines Eat the Workflow: The Harness Becomes the Product

W26

2026-04-21

The Substrate Portability Pact: What DOS Refuses to Couple to, Even as the Harness Consolidates

2026-04-24

The Substrate Signs a Ten-Year Lease: Gigawatts, Governance, and the Closing Window for Indie Posture

W27

2026-04-28

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

2026-05-01

The Week Exclusivity Died: Substrate Goes Plural, Rules Layer Becomes the Moat