Insight: Prompt the Cheap Model With the Smart Model — Pay Once for Reasoning, Reuse It Forever

“I’m gonna use the cheapest model available, but I’m gonna use the smartest model available to prompt that cheap model so that it performs as well as it can.” — Lou

Session context: 2026-06-11_Mastermind — Lou shared an idea he’d read that morning and unpacked it three times for the group because the inversion is easy to miss.

Core Idea

The instinct is to use a smart model when the work is hard and a cheap model when it’s easy. This flips that. You commit upfront to running the task on the cheap model — say Haiku at high effort — and then you hire the smart model for one job only: write the prompt that lets the cheap model perform like the expensive one.

The reasoning matters. The premise is that a model like Opus is familiar enough with Haiku’s capabilities and constraints to write instructions that compensate for them: telling the cheap model exactly what strategy to use, when to think step-by-step, what to watch for, and how much scaffolding it needs. You’re not asking Haiku to reason like Opus on its own. You’re letting Opus pre-compute the reasoning and bake it into the instructions, so Haiku just executes a very detailed plan. Pay once for the intelligence; reuse it on every cheap inference after.
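The pay-once, reuse-many shape can be sketched in a few lines of Python. This is a minimal sketch, not a real client: `ModelFn`, `AmortizedPrompt`, and `META_PROMPT` are hypothetical names introduced here for illustration, and each model is just a function that takes text and returns text, so you would wire in whatever API wrapper you actually use.

```python
from typing import Callable, Optional

# Hypothetical type: any function that sends text to a model and returns its reply.
ModelFn = Callable[[str], str]

# Wording adapted from the request described in Practical Application.
META_PROMPT = (
    "I'm going to run this task on {cheap_name}. Knowing that model's specific "
    "capabilities and limits, write a prompt that gets it to perform this task "
    "as well as you would. Include the strategy, when to think, and anything "
    "it's likely to get wrong without being told.\n\nTask: {task}"
)


class AmortizedPrompt:
    """Ask the smart model for a prompt once, then reuse it on every cheap call."""

    def __init__(self, task: str, cheap_name: str,
                 smart_model: ModelFn, cheap_model: ModelFn) -> None:
        self.task = task
        self.cheap_name = cheap_name
        self.smart_model = smart_model
        self.cheap_model = cheap_model
        self._prompt: Optional[str] = None  # filled in on first use

    def prompt(self) -> str:
        # The only place the smart (expensive) model is ever called.
        if self._prompt is None:
            request = META_PROMPT.format(cheap_name=self.cheap_name, task=self.task)
            self._prompt = self.smart_model(request)
        return self._prompt

    def run(self, item: str) -> str:
        # Every per-item call goes to the cheap model with the stored prompt.
        return self.cheap_model(f"{self.prompt()}\n\nInput:\n{item}")
```

In practice you would probably also persist the generated prompt to disk so it survives restarts; the in-memory cache here only shows the call pattern.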

This is distinct from dynamic model routing, where the system decides in real time which tier a step needs. Here the tier is fixed at the bottom on purpose, and the smart model’s role is moved upstream into prompt construction. Lou relayed reported gains of 20–75% in the cheap model’s performance, and the technique pairs naturally with DSPy-style automatic prompt optimization, which can compound the lift.

The deeper principle: reasoning is a cost you can amortize. Most workflows pay the premium-model tax on every single call. If the reasoning is stable, you can spend it once — in the prompt — and then run the volume on the cheapest model that can follow a good plan.
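The amortization claim is just arithmetic, and it helps to see the break-even point. The sketch below is illustrative only: the function names are hypothetical and every dollar figure in the example is a made-up placeholder, not a real price for any model.

```python
import math
from typing import Optional


def amortized_cost_per_run(one_time_cost: float,
                           cheap_cost_per_run: float,
                           runs: int) -> float:
    """Average cost per run when the prompt-writing cost is spread over `runs`."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return one_time_cost / runs + cheap_cost_per_run


def break_even_runs(one_time_cost: float,
                    premium_cost_per_run: float,
                    cheap_cost_per_run: float) -> Optional[int]:
    """Smallest number of runs at which cheap-plus-prompt is no more expensive
    than running every call on the premium model. Returns None when the cheap
    model is not actually cheaper per run."""
    saving_per_run = premium_cost_per_run - cheap_cost_per_run
    if saving_per_run <= 0:
        return None
    return math.ceil(one_time_cost / saving_per_run)


# Placeholder numbers (assumptions, not real pricing):
# $0.50 to have the smart model write the prompt once,
# $0.05 per run on the premium model, $0.005 per run on the cheap model.
runs_needed = break_even_runs(0.50, 0.05, 0.005)
avg_at_1000 = amortized_cost_per_run(0.50, 0.005, 1000)
```

The point of the calculation is that the one-time cost shrinks toward zero per run as volume grows, so the pattern pays off fastest on high-volume, stable tasks.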

Practical Application

Pick a repetitive task you currently run on a premium model. Then hand the premium model this job instead:

“I’m going to run this task on [Haiku 4.5 / your cheapest viable model]. Knowing that model’s specific capabilities and limits, write a prompt that gets it to perform this task as well as you would — include the strategy, when to think, and anything it’s likely to get wrong without being told.”

Test the cheap model with that generated prompt against your old premium-model output. Where it holds up, the per-run cost of that task drops to the cheap tier’s price, which can be an order of magnitude lower depending on the models involved. The pattern is especially powerful for forked or spawned sub-steps where you already specify the child’s model: optimize the prompt for that tier and the savings stack.
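One way to make "where it holds up" concrete is a small comparison harness over saved premium-model outputs. This is a sketch under stated assumptions: `agreement_rate` and `normalized_equal` are hypothetical helpers, the cheap model is any text-in, text-out function, and exact match after normalization is a deliberately crude stand-in for whatever quality check fits your task.

```python
from typing import Callable, Iterable, Tuple

ModelFn = Callable[[str], str]  # hypothetical: text in, model reply out


def normalized_equal(a: str, b: str) -> bool:
    """Crude comparison: ignore case and whitespace differences."""
    return " ".join(a.lower().split()) == " ".join(b.lower().split())


def agreement_rate(cases: Iterable[Tuple[str, str]],
                   cheap_model: ModelFn,
                   prompt: str,
                   matches: Callable[[str, str], bool] = normalized_equal) -> float:
    """Fraction of cases where the cheap model, given the generated prompt,
    matches the saved premium-model output for the same input."""
    total = 0
    agreed = 0
    for item, premium_output in cases:
        total += 1
        cheap_output = cheap_model(f"{prompt}\n\nInput:\n{item}")
        if matches(cheap_output, premium_output):
            agreed += 1
    if total == 0:
        raise ValueError("no test cases supplied")
    return agreed / total
```

For open-ended outputs you would swap `matches` for a rubric or a human spot-check; the structure stays the same, and the cases where the rate drops show you where the cheap tier stops being viable.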

Related Insights

Insight - The Model Underneath Is the Multiplier, Not the Interface: the model still matters; this just relocates the smart model’s contribution into the prompt.
Insight - Code Is for Computation, Inference Is for Judgment: same instinct (deploy expensive resources only where they earn their cost), applied to inference tiers.
Insight - Model Altitude — Route Model and Effort by Workflow Step, Not by Whole Artifact: routing decides the tier per step; this technique then squeezes the chosen tier harder.
Insight - The DSPy Cognitive Mirror — Teaching AI to Replicate Your Decision-Making: DSPy auto-optimizes prompts; combining it with cross-tier prompting was reported to produce “breathtaking” results.

Evolution Across Sessions

This is a counter-move to the assumption baked into Insight - The Model Underneath Is the Multiplier, Not the Interface (2025-08-07): that you reach for the strongest model when the work is hard. Here the work stays on the cheapest model, and the strongest model is repurposed as a one-time prompt author. This session establishes a baseline for “amortized reasoning” as a cost-control pattern; future sessions should test where Sonnet finally beats Haiku-with-a-great-prompt for a given task.

PowerUp Coaching — Living Knowledge Base
