PowerUp AI Mastermind — June 11, 2026
“I’m gonna use the cheapest model available, but I’m gonna use the smartest model available to prompt that cheap model so that it performs as well as it can.” — Lou
This Week in 30 Seconds
- The ambient-intelligence culmination — months of scattered pieces (context management, token use, model/effort optimization, skill sharing) snapped together this week into one architecture. Lou walked through it as a 25-minute narrated deck, then unpacked the load-bearing ideas live.
- Fork vs Spawn — both isolate context and return only a result; the difference is whether the child inherits what the parent decided. Fork for continuity, spawn for an uncontaminated read.
- Model altitude — stop asking “which model for this article?” Ask it per step: research, angle, draft, copy-edit, and fact-check each need different intelligence. Record a rationale so routing is auditable.
- Prompt the cheap model with the smart model — commit to Haiku, then have Opus write the prompt that makes Haiku perform like Opus. Pay once for the reasoning; reuse it. Reported 20–75% lifts, especially paired with DSPy.
- Plugins beat manifests and symlinks — Lou’s homegrown manifest and symlink approaches don’t integrate with native
/invocation. A marketplace plugin repo is the supported path — version-pinnable, and the clean way to share skills with clients. Targeted for Monday. - Capability-Architect — a compiler, not a “make me an agent” prompt: intake → DAG → classify → bundle → execute → route → contract → generate → eval → install.
- The deck itself was the demo — generated from three days of chat exports, scripted and slided by Claude, voiced by ElevenLabs. The forked writing team produced a comprehensive article while keeping ~350K tokens of intermediate work out of the main context.
- Token economics — Joanna’s Uber-pricing question opened the subsidy conversation: Lou ran 43M tokens last month on a 8,000 of API compute, still subsidized.
- The closing turned heavy — Joanna raised the AI-displacement fear; Lou gave an unguarded read on billionaire concentration of power, uncompensated training data, and where small operators stand. The group landed on community and leverage as the answer.
1 — Gears + Hub: The À La Carte Offer
Lou re-opened the Gears/Hub source-code offer after last call’s interest didn’t convert. Rather than the single 695** (“I want the source code out in the world”), a **2,500 — or the full bundle at 1,300 available). The page: coachlou.com/fix-that-roof.
The honest frame: the prices assume hundreds of buyers, not a handful, but Lou has no overhead and a literal $15,000 roof to cover. He’d like a cohort of 4–5 for the coaching track and would start next week if interest holds.
2 — The Ambient-Intelligence Culmination
This was the spine of the session. For weeks Lou has been working separate problems — context management, token cost, model/effort optimization, sharing skills across projects without duplication. This week they fused: “each one of them helped figure the other thing out.”
The originating pain: many projects, all needing the same skills. He didn’t want skills duplicated into every project, didn’t want all 50–60 skills’ front matter loaded into global context (he measured ~45–50K tokens of MCP-tools-plus-skills overhead paid on every query, even “hello”), and didn’t want to hand-maintain local copies. The destination is an architecture where intelligence lives in the environment and a project inherits only what it declares.
He presented it as a narrated 25-minute deck — “from chat to system” — built for three audiences: members new to the language, builders who care about where context lives, and operators asking “where in my business am I still manually carrying context the environment could carry for me?” The next four sections are the load-bearing pieces.
3 — Fork vs Spawn: Should the Child Inherit What the Parent Knows?
The session’s sharpest distinction. Fork and spawn both run work in an isolated context and return only the result — neither pollutes the main conversation. The difference is the starting state.
A fork inherits the parent’s context — useful when the child should know what’s already been decided (the parent picked the article’s angle; the drafter should honor it). A spawn starts cold — useful when inherited context would contaminate the work (an adversarial reviewer shouldn’t inherit the parent’s confidence in the current direction; you want a colder read). The whole decision reduces to one question: would the child do better seeing what the parent decided? Yes → fork. No → spawn.
💡 What This Means for You
Before delegating any step, ask out loud: “Would this step do better knowing what I’ve already decided, or would that bias it?” Then tell the child to return only its conclusion or the artifact path — nothing else.
Deep Dive: Insight - Fork vs Spawn — Decide Whether the Child Should Inherit What the Parent Knows
4 — Model Altitude: Route by Step, Not by Artifact
Lou’s correction to his own starting question. “Which model should I use?” is asked at the wrong altitude. The final artifact is one thing, but the process that builds it is many kinds of work — research, angle selection, drafting, copy-editing, fact-checking — each with different needs. Four questions decide each step: does it need inference at all (or is it just code)? what’s the consequence if it’s wrong? does it need grounding? will a cheaper model retry so much it gets expensive?
The router emits a small record per step — component, step-class, model, effort, and a rationale. The rationale is what makes a bad output debuggable: was the classification wrong, the model wrong, the effort wrong, or did the rationale miss the real risk? The routing logic lives once in a shared model-effort-routing.md so every workflow reuses it. “This is modularity applied to judgment.” The standing rule: assign the least excessive inference that still clears the bar — never default everything to the strongest model at high effort.
💡 What This Means for You
Don’t pick one model for the whole job. Score each step on consequence and grounding, then assign the cheapest model + effort that reliably clears the bar — and write down why.
Deep Dive: Insight - Model Altitude — Route Model and Effort by Workflow Step, Not by Whole Artifact
5 — Capability-Architect: A Compiler for Capabilities
Once a folder can inherit and activate capabilities, how do you create them repeatably? Lou’s answer is a compiler, not a “make me an agent” prompt. Capability-Architect takes a workflow, problem, or existing pipeline and walks a fixed compile path: intake (state the job) → DAG (artifact dependencies) → classification (code / judgment / hybrid) → bundling (which steps share internal outputs) → execution (inline / fork / spawn / parallel) → routing (model + effort, delegated to the shared inference-router) → contract (what each child returns — usually a JSON envelope plus an artifact path) → generation → evaluation (when it fires and when it doesn’t) → install (make it inheritable).
The output isn’t a prompt — it’s an inheritable skill bundle that drops into the ambient library. And because judgment is modular (routing lives in a shared reference, not inside the architect), a writing orchestrator, a course-design workflow, and a client-delivery agent all reuse the same routing process.
Deep Dive: Insight - Capability-Architect — Compile a Workflow Into an Inheritable Skill Bundle
6 — Plugins: How Skills Actually Get Shared
Lou narrated the dead ends so members don’t repeat them. A manifest (a JSON list of needed skills, with CLAUDE.md told to load their front matter) works but doesn’t surface under native / invocation — you have to call the skill as a file. Symlinks fail because Claude doesn’t reliably follow them. What works is plugins from a marketplace: a version-controlled Git repo with a marketplace manifest, publishing either one plugin per skill or bundles (a “writing team” plugin holding 5–6 skills). A project declares which plugins to install; a tightly-scoped plugin appears to load only front matter, keeping context cheap.
Two bonuses: version-pinning (if an upstream skill update breaks your packaging, specify the previous version) and clean client distribution (hand them a marketplace link). Open thread: making plugin skills appear in the / menu like native ones. Targeted for Monday.
Gotcha worth its own note: the skill-creator tool tends to fill the 1024-character description limit — 20 skills at ~1,000 chars is ~20K of wasted context. When generating a skill, tell it to keep the description tight: a couple of trigger keywords, a one-to-two-line description.
💡 What This Means for You
Stop copying skills between projects and stop reaching for symlinks. One version-controlled repo, a marketplace manifest, plugins declared per project. And cap your skill descriptions — the front matter is a context tax you pay on every query.
Deep Dive: Insight - Plugins Are How You Share Skills — Version-Controlled Capabilities From a Marketplace Repo
7 — The Deck Was the Demo: Forked Writing Team in Action
The 25-minute presentation was itself an artifact of the architecture. Lou fed Claude three days of chat exports, said “pull out the main ideas, fill in how they work and what problem they solve, make it a cohesive presentation,” and got a second-draft script (the first read too much like reading slides). ElevenLabs voiced it; Claude built the script and slides.
The forked writing team that produced a sample article showed the payoff concretely: an orchestrator with scan / architect / draft / review / polish stages, each forked, each returning only a summary plus an artifact path. Drafts moved by file path — never pasted into the parent — so 68K, 63K, 61K, 56K of intermediate work stayed out of the main conversation. Only the final draft entered context. When Lou disliked a result he just re-ran his last query and the whole pipeline re-executed cleanly, no scrolling back through stale state. (He accidentally proved it by losing the on-screen output mid-demo — the artifacts were safe on disk.)
This is the practical case for Insight - Use the LLM as the UI — Conversation as Interface for Internal Tools: no TypeScript, no UI scaffolding, “draw me the dashboard” on demand — and portable to a server later if you want to wrap an API around it.
8 — Prompt the Cheap Model With the Smart Model
An idea Lou read that morning and unpacked three times because the inversion is easy to miss. Instead of using a smart model for hard work and a cheap one for easy work, you commit to the cheap model (say Haiku at high effort) and hire the smart model for one job: write the prompt that lets the cheap model perform like the expensive one. Opus knows Haiku’s capabilities and limits intimately, so it bakes the reasoning, strategy, and “think here” cues into the instructions. Pay once for the intelligence; reuse it on every cheap inference after. Reported 20–75% gains, and it pairs with DSPy-style auto-optimization for a further lift.
Kasimir noted Nate Herk’s related approach — dynamically downgrading to a lower tier when the task doesn’t need the high one. Lou distinguished the two: routing puts intelligence in the model choice; this puts it in the prompt, with the tier fixed at the bottom on purpose. Scott and Lou agreed the open question is where Sonnet finally beats Haiku-with-a-great-prompt — and that effort level (low/medium/high) is a third dial on top of model choice.
💡 What This Means for You
Pick a task you run on a premium model. Tell that model: “I’ll run this on Haiku — knowing its limits, write a prompt that gets it to perform as well as you would.” Test it against your old output. Where it holds, you’ve cut the per-run cost by an order of magnitude.
Deep Dive: Insight - Prompt the Cheap Model With the Smart Model — Pay Once for Reasoning, Reuse It Forever
9 — Token Economics: Who’s Actually Paying
Joanna asked whether AI pricing follows the Uber playbook — lowball to capture the market, then raise prices once you have it. Lou: largely yes — give it away free to capture eyeballs (subsidy as marketing spend), convert the rabid fans to 20 plan; at API rates that’s roughly 300K, and the racks and rows are full.
Lou’s own optimization: 20 on Codex, alternating between them for different parts of the process (knowledge and strategy work on Claude; coding on either, often Codex for a different perspective). He’s also figured out how to stack a second Claude subscription to get 60 of capacity rather than jumping straight from 100. He’s been reaching for Haiku a lot lately for day-to-day computer tasks — “remarkable how efficient it is.”
10 — Don Back’s Production Workflow + Voice Cloning
Don Back has been producing university lecture material with Claude — slides, scripts, lab workbooks generated from the previous lecture’s transcript plus his objectives. “It generates the workbook, generates the slides that illustrate it. Just does a really good job.” (His one calibration note from chat: Claude is “a little optimistic about what participants can achieve in a given time — I have to calibrate its professorial enthusiasm.”)
Lou’s demo prompted Don’s realization: clone his voice in ElevenLabs ($22/month tier — the one that lets you upload an hour of audio for nuance, not the 2-minute instant clone) and let it narrate his explainer videos. The two compared production workflows: Lou records narration live in ScreenFlow; Don switched to Final Cut Pro, dropping exported Keynote slides as images and scrubbing (not playing) to mark slide changes — “about 3 hours out of my workflow.” He wrote the Final Cut SOP by asking Claude. Lou’s note on full automation: it’d take FFmpeg (image → video overlaid on the audio track) since neither editor exposes an API — or a tool like Pictory that syncs slides/b-roll to audio.
11 — Scott’s me.md: Staying Portable Across Vendors
Scott shared a portability pattern he saw online: rather than putting orchestrator content in CLAUDE.md, write everything in a me.md and have CLAUDE.md simply point to it — so any future harness (Claude, Codex, Gemini) can be aimed at the same me.md and its sub-files. A second-brain approach designed to survive a platform move. Lou strongly agreed — it’s the same reasoning behind keeping the global skills/agents repo as a neutral, vendor-independent implementation that the harness resolves against. (See Insight - Platform as Interface, Not Custodian — The Resolver Pattern for Portable AI Intelligence.)
12 — Don’s Chat-Export Problem (and the Backend Fix)
Don raised a friction everyone hits: long chats won’t copy-paste cleanly. ChatGPT (and Claude) “chapter” the display into windows, so selecting “everything” silently saves empty pages. He found commercial Chrome extensions that solve it by going through the backend — you share the chat, email yourself the link, open it in Chrome, and the extension pulls the full conversation rather than scraping the front-end display. One variant outputs a YAML front-matter file you can drop straight into Obsidian and index. (Caveat: artifacts don’t export — an open problem.)
The deeper point landed with the group: keeping you trapped inside the chat is part of the “attention-grabbing, token-consuming business model.” Lou’s adjacent idea: take his existing ChatGPT/Claude data exports (JSON → Markdown via a small Python script that splits each conversation into its own file), then mine two years of conversations into Obsidian — not dumped in raw, but with insights, processes, and takeaways extracted. Don’s framing: “I don’t want to keep it on someone else’s property.” (This is Insight - Your AI Conversation History Is a Knowledge Asset Worth Mining in practice.)
13 — Scott’s Simple Win: Find the Slide
Scott’s reminder that the small uses matter. At a high-performance-computing conference, three dozen talks’ slide decks and posters landed in one SharePoint folder. He needed one slide from one panel talk — couldn’t remember the day, and the speaker (James) wasn’t on the title slide. He downloaded everything and asked: find the slide where this speaker appears and where “DANT” sits side by side. “Cold. That slide deck, slide 17.” A year ago he’d have flipped through decks for an hour. “Really simple uses of the tools for things that are not programmatic.”
14 — Joanna: From Manual Stagnation to Systems
Joanna named where she is honestly: everything in her life and coaching business is manual, and “that’s the stagnation.” Five years of coaching without systems. She’s mid-shift — digitizing physical receipts for taxes (first time), planning to test it with Claude Code — and it’s “torturously slow” because urgent, low-ROI tasks keep piling on. But she sees the direction, energized by Scott’s slide example: a year ago he wouldn’t have thought to try it. “I am going to systemize my life and business.”
15 — The Heavy Close: AI, Power, and Who Pays
Joanna surfaced the fear under the optimism — quoting a departed Anthropic researcher and Steven Bartlett’s Diary of a CEO, where insiders reportedly say the opposite in private of what they say in public about AI.
Lou gave his most unguarded read of the year. He credited Anthropic’s CEO for transparency in admitting they hold back the most powerful models until precautions exist — “an acknowledgement that this is a dangerous technology.” But he can’t square the utopian “everyone will have money and free time” story with how power actually concentrates: “having people without money is a great way to control the population.” What boils his blood most: the technology was “developed on the back of humanity’s collective knowledge and didn’t pay for it” — copyright, music, books, art became fodder, and only the New York Times-scale players can sue or license; the individual artist gets nothing. He named the structural risk concretely (one person controlling communications, energy, satellites, AI, “more power than a nation-state”).
He pulled back from the brink — “forget I said that… let’s figure out how to make life wonderful with this” — and landed on the only answer he trusts: community and leverage. “When we have enough people wanting to do something, we do it together, we’ll figure out a way.” Donald closed it cleanly: “The more powerful a tool is, the more it cuts both ways — the more potential for good and harm.”
💡 What This Means for You
The defensive move isn’t to opt out — it’s to get financially strong with these tools while they’re cheap, and to build community. Leverage and relationships are the moat that concentration of power can’t easily take.
Community Corner
- Joanna J is publicly committing to systemizing her business after naming manual work as her five-year stagnation point — starting with receipts-to-taxes via Claude Code.
- Don Back is cloning his voice for explainer videos and shared a fully-worked Final Cut production SOP (written by Claude) plus the chat-export extension fix.
- Scott Delinger keeps surfacing the “simple, specific, non-programmatic” wins — this week, finding one slide across three dozen decks. Also wrote a tiny command to refresh his three local model weights daily.
- Donald Kihenja brought Pictory.ai (audio → synced slides/b-roll) and the session’s closing wisdom on double-edged tools.
- Dirk Ohlmeier issued the unofficial group challenge from chat: “Produce a complete product within 1 million tokens.”
- Kasimir connected the cheap-model technique to Nate Herk’s dynamic-downgrade approach.
Links Shared in Chat
- Gears/Hub à la carte offer: https://coachlou.com/fix-that-roof
- Teaching-block preview (here.now): https://presto-birch-k8xt.here.now
- Pictory.ai (Donald — audio-to-video slide/b-roll sync): https://pictory.ai
- Nate Herk (Kasimir/Elizabeth — YouTube AI workflows)
- DSPy (auto-recursive prompt optimization, paired with cross-tier prompting)
- Don’s “Chat MD Export — Download” SOP (shared in chat + Telegram — Chrome extension for full backend chat export, Obsidian-ready YAML)
Try This Before Next Session
Take one task you currently run on Opus (or your premium model) and try to move it down a tier. Hand the premium model this instead:
“I’m going to run this task on Haiku. Knowing that model’s specific capabilities and limits, write a prompt that gets it to perform this task as well as you would — include the strategy, when to think step-by-step, and anything it’s likely to get wrong without being told.”
Run Haiku with that prompt. Compare to your old output. Where it holds up, you’ve just made that task ~10× cheaper — and you’ve felt the “pay once for reasoning, reuse it forever” pattern in your own hands.
Open Threads
- Plugin marketplace — Lou aims to have the version-controlled skills/plugins repo ready Monday; open question is getting plugin skills to surface under
/like native ones. - Capability-Architect + inference-router — the compiler and the shared routing skill are early-days; queued for design review (PRD).
- Teaching blocks — the 8 teaching blocks behind the deck go up on the here.now hub (≈ next day), each showing the process behind one piece of the architecture.
- Gears coaching cohort — Lou wants 4–5 to greenlight the six-month track; decision this week, start next week.
- Artifact export — chat-export tools still don’t capture artifacts; unsolved.
Next session: 2026-06-18
Derived Artifacts
- cheap-model-prompt (Cheap-Model Prompt — use a strong model to write a prompt optimized for a cheaper model)
- capability-architect-prd (Capability-Architect PRD — a compiler that turns a workflow into an inheritable skill bundle; pending design review)