27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.

Start here

If you're thinking about writing while building

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

If you're considering the founding cohort

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

If you want the architectural spine

The Decomposition Discipline I Am Trying to Codify Inside DOS

Subscribe · Tuesday essay

Around 3,800 builders read this weekly.

Dec 19, 2025

The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint

On Thursday Anthropic opened Agent Skills as an open standard with a partner directory — Atlassian, Canva, Figma, Cloudflare, Sentry, Vercel, and Zapier joined day one — completing the protocol-commoditization arc that started last week with MCP and AGENTS.md moving under Linux Foundation governance. Underneath, four days of frontier sprinting: GPT-5.2-Codex on Thursday, Gemini 3 Flash on Wednesday, NVIDIA's open-weights Nemotron 3 family on Monday, and OpenAI fast-tracking GPT Image 1.5 mid-week. The pattern is clear: the spec is becoming a public good while capability is becoming the only moat.

Last week the Linux Foundation took stewardship of MCP, AGENTS.md, and goose under the new Agentic AI Foundation. I called it the agentic stack's "Apache moment." I thought the consolidation pace would slow for a week or two while the foundation's governance got organized.

It didn't. This week Anthropic graduated Agent Skills to the same pattern, with a partner directory live on day one — Atlassian, Canva, Figma, Cloudflare, Sentry, Vercel, Zapier — and explicit interoperability with Claude, ChatGPT, and Cursor surfaces (source). That is the same playbook MCP ran in October. Own the spec, hand it to the ecosystem, lock in distribution.

Underneath, four days of relentless capability sprinting. OpenAI shipped GPT-5.2-Codex Thursday morning. Google launched Gemini 3 Flash on Wednesday. NVIDIA debuted the Nemotron 3 open-weights family on Monday. OpenAI fast-tracked GPT Image 1.5 mid-week, pulled forward from January under Altman's "code red" memo.

For an indie founder fourteen weeks into building a personal AI OS, the load-bearing fact is the combination of those two stories. The protocol layer is being commoditized on purpose. The capability layer is being fought over with everything the labs have left. If you are building anywhere on the substrate, your dependency surface just split into two — one half you can rely on permanently, one half you have to be ready to swap weekly.

The week's signal in one sentence

The spec is becoming a public good. Capability is becoming the only moat. Build to the spec; route between the capabilities; do not bet your architecture on either lab winning.

Skills as the protocol-commoditization endgame

The Anthropic announcement on Thursday is structurally identical to what the Linux Foundation did with MCP last week, just one layer up.

Skills started as Claude-specific capability composition. A way for Claude to know which tools and prompts and workflows to load for a given task. As of this week, Skills are now an explicit open standard, with a partner directory listing third-party Skills authored by Atlassian (issue triage), Canva (design generation), Figma (handoff), Cloudflare (workers), Sentry (error analysis), Vercel (deployment), Zapier (automation), and others. The Skills format is portable across Claude, ChatGPT, and Cursor — any agent runtime that adopts the spec can consume them.

The strategic bet is the same one Anthropic made with MCP. Make the protocol open, the registry public, and the partner adoption frictionless. Lock yourself in by being the first home for the standard, not by being the only one. Then compete on capability above the line — which Anthropic has been doing aggressively, with Opus 4.5 in November and Sonnet routing now on the capability frontier.

The implication for an indie founder building a personal AI OS is one I had been hedging on for two months: it is now safe to author capability packs in the Claude Code idiom and ship them as Skills. They will ship to wherever the user lives next quarter. The lock-in argument is collapsing in real time.

The capability sprint underneath

Above the protocol line, the four-day window was a brutal cadence even for a year that has had nothing else.

Monday Dec 15 — NVIDIA debuts Nemotron 3 open-weights family (source). Hybrid latent-MoE architecture, built explicitly for multi-agent systems. Nano (30B with 3B active) shipping now; Super (~100B) and Ultra (~500B) planned for H1 2026. The headline numbers: 4× throughput versus Nemotron 2 Nano, ~60% reduction in reasoning tokens for the same task. Open-weights, with hardware-aligned tooling for NVIDIA's own runtime.
Tuesday Dec 16 — OpenAI fast-tracks GPT Image 1.5 (ChatGPT Images) (source). Pulled forward from January per Altman's "code red" memo. 4× faster generation, better edits, dedicated creative-studio surface. Tit-for-tat against Google's Gemini 3 / Nano Banana Pro from earlier in the month.
Wednesday Dec 17 — Google launches Gemini 3 Flash (source). Pro-grade reasoning at Flash latency and Flash pricing. Live on Vertex AI, the Gemini CLI, Antigravity, and AI Studio. Pitched explicitly as the first Flash-tier model that can power "the core loop of a coding agent." That phrase is the giveaway — Google is no longer trying to win the benchmark; they are trying to win the cost-per-agent-turn.
Thursday Dec 18 — OpenAI ships GPT-5.2-Codex (source). An agentic-coding-tuned variant of the GPT-5.2 family. Long-horizon context compaction, large-refactor strength, hardened cybersec posture, improved Windows performance. Codex surface only at launch; API access coming "in the coming weeks." Same playbook as Anthropic's Claude Code: ship the harness with the model.

Read end-to-end: NVIDIA arms the open-weights agent stack, Google undercuts the cost-per-turn at the Flash tier, OpenAI ships Codex-shaped agentic tuning, Anthropic graduates the protocol layer to a public good. Four labs, four days, four different shapes of "we are not going to be the one that gets left behind."

The pattern: protocols stabilize, capability fragments

What happened in parallel this week

At the protocol layer (now public)

Anthropic Skills opens with partner directory + interop across Claude/ChatGPT/Cursor
Linux Foundation's AAIF (last week) governs MCP, AGENTS.md, goose
The substrate of personal AI OS architectures is now under neutral or open standards
Authoring once and shipping everywhere is becoming the default expectation

At the capability layer (now fragmenting)

GPT-5.2-Codex hardens agentic coding for the OpenAI surface
Gemini 3 Flash undercuts the cost-per-turn at competent quality
NVIDIA Nemotron 3 Nano cuts reasoning tokens 60% on open weights
GPT Image 1.5 fast-tracked to defend creative-studio territory
The frontier model for any specific task is now a per-task question

Last week's framing was "stop picking a winner, start picking a router." This week is the reinforcement. The protocols are leaving the per-vendor strategy zone. The models are entering it more aggressively than ever.

If your code has a hardcoded SDK call to one provider, you are now writing future migrations into the present tense. If your code has an MCP_URI and a Skill manifest, you are writing forward compatibility.

Why Gemini 3 Flash matters more than Codex this week

The two coding-agent launches were Codex and Gemini 3 Flash. Codex got more press. Gemini 3 Flash is the more interesting story for an indie founder.

Codex tightens the seal on a workflow most builders are already inside. If you are running GPT-5.2 for coding, 5.2-Codex is a quality-of-life upgrade. If you are running Claude Code, 5.2-Codex changes nothing about your daily driver. The story is incremental.

Gemini 3 Flash is structural. Google's pitch is that Flash-tier latency and cost can now drive "the core loop of a coding agent." If that pitch holds up under workload — and the early independent reports suggest it does for most non-frontier tasks — then the cost of running an agent's most common call just dropped by an order of magnitude versus running Pro-tier. The frontier model is for the hard turns. Flash is for the ninety percent.

The architectural implication: any agent runtime that routes its inner-loop calls to Flash and its hard-reasoning calls to Pro now has an asymmetric cost advantage over an agent runtime that runs everything on Pro. That is not a "premium model wins" market anymore. That is a routing-engineering market.

Two angles for an indie founder

What an indie founder building on the substrate should do this week

Author your packs to the open spec, not the lab. Skills are now portable across Claude, ChatGPT, and Cursor. MCP is under foundation governance. AGENTS.md is under foundation governance. If you are building capability composition this week, write it as Skills + MCP. The portability is real; the lock-in is gone; the distribution surface is multiplied.
Build the routing layer before you need it. Coding-agent economics just inverted at the bottom. Gemini 3 Flash gives you competent inner-loop calls at Flash pricing. Nemotron 3 Nano gives you open-weights inner-loop calls with 60% fewer reasoning tokens. The differentiator stops being "which model" and becomes "how efficiently you route between them." That is the gateway architecture I keep sketching — and the case for building it just got sharper.
Pick the moat that survives the substrate. If the protocols are public goods and the models are interchangeable, the moat is the operator loop on top — context engineering, memory, the convention catalog the agent reads at session start, the failure-mode reflections that accumulate week over week. That is exactly DOS territory. It is exactly the territory the labs cannot commoditize, because it is per-operator and not per-protocol.

What this changes for DOS

Two design decisions hardened for me this week, both of them earlier than I had been planning.

One. I was going to author my next batch of capability packs as Claude-specific. After Thursday, I am authoring them as Skills. The portability story is no longer aspirational. The cost of forward-compatibility is one Skills manifest per pack. The cost of not having forward-compatibility is rewriting the pack for whatever surface my users are on in March.

Two. I was going to defer the routing-layer design until Q1. After Wednesday, the cost asymmetry between Flash-tier inner-loop calls and Pro-tier inner-loop calls is too large to ignore for another six weeks. I am moving the gateway design into the next two-week window. I will not have it built. But I will have the contract sketched, the provider list locked, and the credit-metering math written down before January.

That is the kind of small re-prioritization that, accumulated week over week, eventually produces an architecture that survives the next frontier-model launch — and the one after that, and the one after that.

What I am watching for next week

The thread that runs through both halves of the week

The substrate is hardening into open standards on purpose, and the capability layer is sprinting on purpose. Both are deliberate. Both serve the same long-term strategy: lock the ecosystem to a protocol you steward, then compete on capability above the line.

For an indie founder, the playbook reduces to three commitments.

Commit to the open spec — Skills, MCP, AGENTS.md, the trio that as of this week is unambiguously a public good. Commit to a routing layer that abstracts capability behind one entry point so you can swap the substrate weekly without rewriting your code. Commit to the operator loop — the convention catalog, the memory, the reflections, the per-project distillations — because that is the only piece of the stack the labs cannot commoditize.

Three commitments. None of them require predicting which lab wins. All of them survive whatever the next quarter looks like.

That is what I am building toward.

— Lucas

Sources verified the week of Dec 15-18, 2025: Anthropic Skills for organizations + partner directory (Dec 18) · OpenAI GPT-5.2-Codex (Dec 18) · Google Gemini 3 Flash for Enterprises (Dec 17) · NVIDIA Nemotron 3 family (Dec 15) · TechCrunch on GPT Image 1.5 / "code red" (Dec 16)

Was this page helpful?

The 27-week arc · A single body of work

Twenty-seven weeks. Two posts a week. Six months of writing while building.

Week

Tuesday evergreen

Friday dispatch

W01

2025-10-28

The Agent Is the Product: Why Intelligence Replaces Interface

2025-10-31

Wrapper Wars: The Week Cursor, Windsurf, and MiniMax All Went Vertical

W02

2025-11-04

Memory Replaces Lock-In: Designing the Substrate of Personal AI

2025-11-07

The Week the Substrate Caught Up: Kimi K2 Thinking Lands

W03

2025-11-11

Why I Just Left a Steady Practice to Build a Personal AI Operating System

2025-11-14

The Week the Harness Ate the Model

W04

2025-11-18

The Decomposition Discipline I Am Trying to Codify Inside DOS

2025-11-21

Antigravity, Gemini 3, Grok 4.1: The Week the Frontier Re-Rendered

W05

2025-11-25

Four Copies, One Source of Truth: The Sync Pattern I Want to Commit to Before It Hurts

2025-11-28

Opus 4.5 Lands on Thanksgiving Week: Anthropic's Quiet Counter-Punch

W06

2025-12-02

Plugin Architecture for Hooks: The Pattern I Want Before the Hook Becomes a God-Function

2025-12-05

The Runtime Is the Moat Now: Anthropic Buys Bun, Claude Code Hits a Billion

W07

2025-12-09

Credit-Metered API Gateways: The Pattern I Want Before I Have Anything to Meter

2025-12-12

Three Frontier Models in 23 Days: Stop Picking a Winner, Start Picking a Router

W08

2025-12-16

Sentinel, Sketched: Convention-Driven Onboarding Before I Build It

2025-12-19

The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint

W09

2025-12-23

Memory, Sketched: The Knowledge Graph I Was Designing Before MemPalace Shipped

2025-12-26

Christmas Wasn't Quiet: Open Weights Caught Up, Nvidia Bought Inference, Frontier Labs Bought Loyalty

W10

2025-12-30

Council, Sketched: The Three-Round Debate Protocol I Want to Formalize Before I Build It

2026-01-02

The Practitioners' Year-End: While the West Took PTO, DeepSeek Shipped Architecture

W11

2026-01-06

Altyaa's Wedge: The Brazilian-Portuguese SMB Bet Studio Is Pointed At

2026-01-09

Default Claude: Microsoft Flips the Switch as the Substrate Distributes by Default

W12

2026-01-13

Builder's Compass: Two Years In, ~3,800 Subscribers, and What I've Learned Teaching Architecture

2026-01-16

The Agent Surface Splits in Two: Anthropic Goes Vertical While China Goes Substrate

W13

2026-01-20

Failure Patterns, Sketched: The Bookkeeping Discipline I Want for the Agent's Mistakes

2026-01-23

The Rules Layer Solidifies: Constitution, Hardware, and Export Controls Land in One Week

W14

2026-01-27

The One Reference Customer Strategy: GTM for a Personal AI OS, Sketched Before the Customer Signs

2026-01-30

Vertical Integration Eats Horizontal AI: Maia, Meta's Capex, and Anthropic-ServiceNow

W15

2026-02-03

Hexagonal in Practice: The Ports and Adapters I'm Pulling Studio Toward

2026-02-06

Coding Becomes the Flagship: Opus 4.6 and GPT-5.3-Codex Land the Same Day

W16

2026-02-10

TDD for AI Agents, Sketched: The Translation I Want to Commit to Before the Eval Suite Exists

2026-02-13

The Bifurcation Hardens: Anthropic's $380B Meets China's Open-Weights Frontier

W17

2026-02-17

Refactoring the Hook Pipeline: The Fowler Walkthrough I'm Three Refactorings Into

2026-02-20

The Sonnet Window: Free-Tier Skills, Code Security, and the Race for Agentic Primitives

W18

2026-02-24

Skills, Packs, and Hooks: The Three-Layer Model I'm Pulling DOS Toward

2026-02-27

The Moat Above the Model: Distillation, Mobile Coding, and MCP Battlegrounds

W19

2026-03-03

The Knowledge Portfolio for an Indie Founder Building in Public, Audited at Week 25

2026-03-06

Procurement Is the New Benchmark: OpenAI Wins the DoW While Anthropic Gets Banned

W20

2026-03-10

Six Months of DOS: What Changed, What Endured, What Surprised

2026-03-13

The Counter-Sovereignty Move: Anthropic Sues, Then Diversifies

W21

2026-03-17

MCP in Production: What Works When the Protocol Becomes the Boundary

2026-03-20

Cursor Becomes a Model Lab: Composer 2 and the Unbundling at the Agent Layer

W22

2026-03-24

Routing by Sovereignty Class: The Architecture That Survives the Procurement Decade

2026-03-27

The Injunction the Pentagon Won't Honor: Anthropic Wins, the State Defies

W23

2026-03-31

Sentinel's First Scan: What the Convention Catalog Found Across Eleven Projects

2026-04-03

The Agent Stack Swallows the IDE, the IPO, and the Courtroom

W24

2026-04-07

Forking MemPalace: The 48-Hour Integration Retro

2026-04-10

Project Glasswing: When the Frontier Bifurcates on Safety

W25

2026-04-14

The 90-Day Open-Weights Bakeoff: What Actually Routes Where in DOS Today

2026-04-17

Routines Eat the Workflow: The Harness Becomes the Product

W26

2026-04-21

The Substrate Portability Pact: What DOS Refuses to Couple to, Even as the Harness Consolidates

2026-04-24

The Substrate Signs a Ten-Year Lease: Gigawatts, Governance, and the Closing Window for Indie Posture

W27

2026-04-28

After Twenty-Seven Weeks: What the Series Was For, and What I'm Building Next

2026-05-01

The Week Exclusivity Died: Substrate Goes Plural, Rules Layer Becomes the Moat