Skip to content
DuranteDurante
ALL SYSTEMSGet Access

27 weeks · 54 posts · Written while building

Field notes from a personal AI OS in flight

Every Tuesday, an evergreen essay on what I'm learning while shipping DuranteOS. Every Friday, a dispatch from the week. Roughly 108,000 words and counting — for builders who'd rather watch the foundation get poured than read the press release.

Subscribe · Tuesday essay

Around 3,800 builders read this weekly.

The Spec Goes Public, the Substrate Goes to War: Skills Opens While Codex, Gemini Flash, and Nemotron Sprint

On Thursday Anthropic opened Agent Skills as an open standard with a partner directory — Atlassian, Canva, Figma, Cloudflare, Sentry, Vercel, and Zapier joined day one — completing the protocol-commoditization arc that started last week with MCP and AGENTS.md moving under Linux Foundation governance. Underneath, four days of frontier sprinting: GPT-5.2-Codex on Thursday, Gemini 3 Flash on Wednesday, NVIDIA's open-weights Nemotron 3 family on Monday, and OpenAI fast-tracking GPT Image 1.5 mid-week. The pattern is clear: the spec is becoming a public good while capability is becoming the only moat.

Last week the Linux Foundation took stewardship of MCP, AGENTS.md, and goose under the new Agentic AI Foundation. I called it the agentic stack's "Apache moment." I thought the consolidation pace would slow for a week or two while the foundation's governance got organized.

It didn't. This week Anthropic graduated Agent Skills to the same pattern, with a partner directory live on day one — Atlassian, Canva, Figma, Cloudflare, Sentry, Vercel, Zapier — and explicit interoperability with Claude, ChatGPT, and Cursor surfaces (source). That is the same playbook MCP ran in October. Own the spec, hand it to the ecosystem, lock in distribution.

Underneath, four days of relentless capability sprinting. OpenAI shipped GPT-5.2-Codex Thursday morning. Google launched Gemini 3 Flash on Wednesday. NVIDIA debuted the Nemotron 3 open-weights family on Monday. OpenAI fast-tracked GPT Image 1.5 mid-week, pulled forward from January under Altman's "code red" memo.

For an indie founder fourteen weeks into building a personal AI OS, the load-bearing fact is the combination of those two stories. The protocol layer is being commoditized on purpose. The capability layer is being fought over with everything the labs have left. If you are building anywhere on the substrate, your dependency surface just split into two — one half you can rely on permanently, one half you have to be ready to swap weekly.

The week's signal in one sentence

The spec is becoming a public good. Capability is becoming the only moat. Build to the spec; route between the capabilities; do not bet your architecture on either lab winning.

Skills as the protocol-commoditization endgame

The Anthropic announcement on Thursday is structurally identical to what the Linux Foundation did with MCP last week, just one layer up.

Skills started as Claude-specific capability composition. A way for Claude to know which tools and prompts and workflows to load for a given task. As of this week, Skills are now an explicit open standard, with a partner directory listing third-party Skills authored by Atlassian (issue triage), Canva (design generation), Figma (handoff), Cloudflare (workers), Sentry (error analysis), Vercel (deployment), Zapier (automation), and others. The Skills format is portable across Claude, ChatGPT, and Cursor — any agent runtime that adopts the spec can consume them.

The strategic bet is the same one Anthropic made with MCP. Make the protocol open, the registry public, and the partner adoption frictionless. Lock yourself in by being the first home for the standard, not by being the only one. Then compete on capability above the line — which Anthropic has been doing aggressively, with Opus 4.5 in November and Sonnet routing now on the capability frontier.

The implication for an indie founder building a personal AI OS is one I had been hedging on for two months: it is now safe to author capability packs in the Claude Code idiom and ship them as Skills. They will ship to wherever the user lives next quarter. The lock-in argument is collapsing in real time.

The capability sprint underneath

Above the protocol line, the four-day window was a brutal cadence even for a year that has had nothing else.

  • Monday Dec 15 — NVIDIA debuts Nemotron 3 open-weights family (source). Hybrid latent-MoE architecture, built explicitly for multi-agent systems. Nano (30B with 3B active) shipping now; Super (~100B) and Ultra (~500B) planned for H1 2026. The headline numbers: 4× throughput versus Nemotron 2 Nano, ~60% reduction in reasoning tokens for the same task. Open-weights, with hardware-aligned tooling for NVIDIA's own runtime.

  • Tuesday Dec 16 — OpenAI fast-tracks GPT Image 1.5 (ChatGPT Images) (source). Pulled forward from January per Altman's "code red" memo. 4× faster generation, better edits, dedicated creative-studio surface. Tit-for-tat against Google's Gemini 3 / Nano Banana Pro from earlier in the month.

  • Wednesday Dec 17 — Google launches Gemini 3 Flash (source). Pro-grade reasoning at Flash latency and Flash pricing. Live on Vertex AI, the Gemini CLI, Antigravity, and AI Studio. Pitched explicitly as the first Flash-tier model that can power "the core loop of a coding agent." That phrase is the giveaway — Google is no longer trying to win the benchmark; they are trying to win the cost-per-agent-turn.

  • Thursday Dec 18 — OpenAI ships GPT-5.2-Codex (source). An agentic-coding-tuned variant of the GPT-5.2 family. Long-horizon context compaction, large-refactor strength, hardened cybersec posture, improved Windows performance. Codex surface only at launch; API access coming "in the coming weeks." Same playbook as Anthropic's Claude Code: ship the harness with the model.

Read end-to-end: NVIDIA arms the open-weights agent stack, Google undercuts the cost-per-turn at the Flash tier, OpenAI ships Codex-shaped agentic tuning, Anthropic graduates the protocol layer to a public good. Four labs, four days, four different shapes of "we are not going to be the one that gets left behind."

The pattern: protocols stabilize, capability fragments

What happened in parallel this week

At the protocol layer (now public)
  • Anthropic Skills opens with partner directory + interop across Claude/ChatGPT/Cursor
  • Linux Foundation's AAIF (last week) governs MCP, AGENTS.md, goose
  • The substrate of personal AI OS architectures is now under neutral or open standards
  • Authoring once and shipping everywhere is becoming the default expectation
At the capability layer (now fragmenting)
  • GPT-5.2-Codex hardens agentic coding for the OpenAI surface
  • Gemini 3 Flash undercuts the cost-per-turn at competent quality
  • NVIDIA Nemotron 3 Nano cuts reasoning tokens 60% on open weights
  • GPT Image 1.5 fast-tracked to defend creative-studio territory
  • The frontier model for any specific task is now a per-task question

Last week's framing was "stop picking a winner, start picking a router." This week is the reinforcement. The protocols are leaving the per-vendor strategy zone. The models are entering it more aggressively than ever.

If your code has a hardcoded SDK call to one provider, you are now writing future migrations into the present tense. If your code has an MCP_URI and a Skill manifest, you are writing forward compatibility.

Why Gemini 3 Flash matters more than Codex this week

The two coding-agent launches were Codex and Gemini 3 Flash. Codex got more press. Gemini 3 Flash is the more interesting story for an indie founder.

Codex tightens the seal on a workflow most builders are already inside. If you are running GPT-5.2 for coding, 5.2-Codex is a quality-of-life upgrade. If you are running Claude Code, 5.2-Codex changes nothing about your daily driver. The story is incremental.

Gemini 3 Flash is structural. Google's pitch is that Flash-tier latency and cost can now drive "the core loop of a coding agent." If that pitch holds up under workload — and the early independent reports suggest it does for most non-frontier tasks — then the cost of running an agent's most common call just dropped by an order of magnitude versus running Pro-tier. The frontier model is for the hard turns. Flash is for the ninety percent.

The architectural implication: any agent runtime that routes its inner-loop calls to Flash and its hard-reasoning calls to Pro now has an asymmetric cost advantage over an agent runtime that runs everything on Pro. That is not a "premium model wins" market anymore. That is a routing-engineering market.

Two angles for an indie founder

What an indie founder building on the substrate should do this week

  1. Author your packs to the open spec, not the lab. Skills are now portable across Claude, ChatGPT, and Cursor. MCP is under foundation governance. AGENTS.md is under foundation governance. If you are building capability composition this week, write it as Skills + MCP. The portability is real; the lock-in is gone; the distribution surface is multiplied.
  2. Build the routing layer before you need it. Coding-agent economics just inverted at the bottom. Gemini 3 Flash gives you competent inner-loop calls at Flash pricing. Nemotron 3 Nano gives you open-weights inner-loop calls with 60% fewer reasoning tokens. The differentiator stops being "which model" and becomes "how efficiently you route between them." That is the gateway architecture I keep sketching — and the case for building it just got sharper.
  3. Pick the moat that survives the substrate. If the protocols are public goods and the models are interchangeable, the moat is the operator loop on top — context engineering, memory, the convention catalog the agent reads at session start, the failure-mode reflections that accumulate week over week. That is exactly DOS territory. It is exactly the territory the labs cannot commoditize, because it is per-operator and not per-protocol.

What this changes for DOS

Two design decisions hardened for me this week, both of them earlier than I had been planning.

One. I was going to author my next batch of capability packs as Claude-specific. After Thursday, I am authoring them as Skills. The portability story is no longer aspirational. The cost of forward-compatibility is one Skills manifest per pack. The cost of not having forward-compatibility is rewriting the pack for whatever surface my users are on in March.

Two. I was going to defer the routing-layer design until Q1. After Wednesday, the cost asymmetry between Flash-tier inner-loop calls and Pro-tier inner-loop calls is too large to ignore for another six weeks. I am moving the gateway design into the next two-week window. I will not have it built. But I will have the contract sketched, the provider list locked, and the credit-metering math written down before January.

That is the kind of small re-prioritization that, accumulated week over week, eventually produces an architecture that survives the next frontier-model launch — and the one after that, and the one after that.

What I am watching for next week

The thread that runs through both halves of the week

The substrate is hardening into open standards on purpose, and the capability layer is sprinting on purpose. Both are deliberate. Both serve the same long-term strategy: lock the ecosystem to a protocol you steward, then compete on capability above the line.

For an indie founder, the playbook reduces to three commitments.

Commit to the open spec — Skills, MCP, AGENTS.md, the trio that as of this week is unambiguously a public good. Commit to a routing layer that abstracts capability behind one entry point so you can swap the substrate weekly without rewriting your code. Commit to the operator loop — the convention catalog, the memory, the reflections, the per-project distillations — because that is the only piece of the stack the labs cannot commoditize.

Three commitments. None of them require predicting which lab wins. All of them survive whatever the next quarter looks like.

That is what I am building toward.

— Lucas


Sources verified the week of Dec 15-18, 2025: Anthropic Skills for organizations + partner directory (Dec 18) · OpenAI GPT-5.2-Codex (Dec 18) · Google Gemini 3 Flash for Enterprises (Dec 17) · NVIDIA Nemotron 3 family (Dec 15) · TechCrunch on GPT Image 1.5 / "code red" (Dec 16)

Was this page helpful?

The 27-week arc · A single body of work

Twenty-seven weeks. Two posts a week. Six months of writing while building.

Week

Tuesday evergreen

Friday dispatch