Topic

A cost-control inversion: instead of choosing a smart model for hard work, commit to the cheapest model and use the smart model once, to write the prompt that lets the cheap model perform close to the expensive one.

Target Reader

Knowledge entrepreneurs and small operators feeling the squeeze as AI subscriptions tighten and API bills climb. They run the same tasks repeatedly and want to cut per-run cost without losing quality.

The Fear / Frustration / Want / Aspiration

Fear: “AI is getting expensive and I can’t afford to run everything on the top model.” Want: a way to keep quality while paying cheap-model rates on the work they repeat.

Before State

They reach for the premium model whenever quality matters and eat the cost on every single run — even for repetitive tasks where the reasoning never changes.

After State

They pay the premium model once to author a cheap-model-optimized prompt, then run the volume on Haiku-tier inference — capturing most of the quality at a fraction of the cost.

Narrative Arc

The intuitive setup — smart model for hard, cheap model for easy — leaves money on the table. Flip it: fix the tier at the bottom and move the intelligence upstream into the prompt. You pay for reasoning once and reuse it forever.

Core Argument

Reasoning is a cost you can amortize: a top model that knows a cheap model’s limits can bake the strategy into instructions, so the cheap model executes a great plan instead of reasoning from scratch.
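
The amortization claim can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical and the prices in the usage note are placeholder numbers in cents, not real vendor rates. It compares paying the premium model on every run against paying it once to author the prompt and then running the volume on the cheap model.

```python
def premium_every_run(premium_cost_per_run: int, runs: int) -> int:
    """Total cost (in cents) when every run uses the premium model."""
    return premium_cost_per_run * runs


def author_once_then_cheap(authoring_cost: int, cheap_cost_per_run: int, runs: int) -> int:
    """Total cost (in cents) when the premium model writes the prompt once
    and the cheap model handles every run."""
    return authoring_cost + cheap_cost_per_run * runs


def break_even_runs(authoring_cost: int, premium_cost_per_run: int, cheap_cost_per_run: int) -> int:
    """Smallest run count at which authoring once is no more expensive
    than using the premium model on every run."""
    saving_per_run = premium_cost_per_run - cheap_cost_per_run
    if saving_per_run <= 0:
        raise ValueError("the cheap model must cost less per run than the premium model")
    # Integer ceiling division avoids floating-point rounding.
    return -(-authoring_cost // saving_per_run)
```

With placeholder numbers (200 cents to author the prompt, 15 cents per premium run, 1 cent per cheap run), `break_even_runs(200, 15, 1)` is 15 runs. Every run after that keeps the per-run saving, which is the amortization the argument relies on. The calculation assumes the cheap model's output is acceptable; it says nothing about quality.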

Key Evidence / Examples

  • “I’m gonna use the cheapest model available, but the smartest model available to prompt that cheap model.” — Lou, 2026-06-11
  • Reported 20–75% improvement in the cheap model’s performance; the gain compounds with DSPy-style automatic prompt optimization.
  • Distinct from dynamic downgrade (Nate Herk): there the model choice carries the intelligence; here the prompt does, with the tier fixed at the bottom on purpose.
  • The subsidy backdrop that makes cost salient: 43M tokens ≈ $20 plan (see: Insight - Model Altitude — Route Model and Effort by Workflow Step, Not by Whole Artifact).

Proposed Structure (5–7 beats)

  1. The cost squeeze everyone’s feeling right now (open with the subsidy math).
  2. The intuitive (and wasteful) way to assign models.
  3. The inversion: fix the cheap tier, hire the smart model to write the prompt.
  4. Why it works — the smart model knows the cheap model’s constraints.
  5. “Pay once for reasoning, reuse it forever” — the amortization principle.
  6. The exact prompt to copy, plus where to stack DSPy.
  7. Close: this is how small operators stay in the game as prices rise.
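
For beats 3 to 6, the shape of the workflow can be sketched in a few lines. This is a minimal illustration, not the prompt the piece will publish. `ModelFn`, `META_PROMPT`, `author_prompt`, and `run_batch` are hypothetical names, the meta-prompt wording is a placeholder, and the two models are stand-in functions so the sketch runs without any API. In real use each stand-in would be replaced by a call to the chosen provider.

```python
from typing import Callable, List

# Hypothetical interface: any function that takes prompt text and returns text.
ModelFn = Callable[[str], str]

# Placeholder wording, not the final published prompt.
META_PROMPT = (
    "You are writing instructions for a smaller, cheaper model.\n"
    "Task: {task}\n"
    "Write a complete prompt the smaller model can follow step by step. "
    "Spell out the strategy, the output format, and the edge cases, so the "
    "smaller model executes the plan instead of reasoning from scratch."
)


def author_prompt(smart_model: ModelFn, task: str) -> str:
    """Pay the smart model once: it writes the prompt for the cheap model."""
    return smart_model(META_PROMPT.format(task=task))


def run_batch(cheap_model: ModelFn, authored_prompt: str, inputs: List[str]) -> List[str]:
    """Reuse the authored prompt on every input, using only the cheap model."""
    return [cheap_model(f"{authored_prompt}\n\nInput:\n{item}") for item in inputs]


# Stand-in models that count calls, so the cost pattern is visible.
calls = {"smart": 0, "cheap": 0}


def fake_smart_model(prompt: str) -> str:
    calls["smart"] += 1
    return "Summarize the input in one sentence. Output plain text only."


def fake_cheap_model(prompt: str) -> str:
    calls["cheap"] += 1
    return "summary of: " + prompt.rsplit("Input:\n", 1)[-1]


authored = author_prompt(fake_smart_model, "Summarize customer emails")
results = run_batch(fake_cheap_model, authored, ["email A", "email B", "email C"])
```

After this runs, `calls` shows one smart-model call and three cheap-model calls: the expensive reasoning happens once and the volume runs at the cheap tier. A DSPy-style optimizer, if used, would slot in between `author_prompt` and `run_batch` to refine the authored prompt.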

Editorial Notes

Strongest of the three June 11 briefs — highest emotional charge (cost fear) plus a copy-paste action. Lead with the subsidy math to make the stakes visceral. Give readers the literal prompt. Keep the DSPy mention light — link, don’t teach it inline.

Next Step

  • Approved for drafting
  • Needs revision
  • Deprioritised