“The goal isn’t a chatbot that sounds like you. The goal is a system that decides like you. That’s the difference between building a golf cart and building a Ferrari.” — Lou

This Week in 30 Seconds

  • Anthropic’s Compute Crunch — Why Opus 4.7 feels worse, the hidden 50% tokenizer tax, and Lou’s case for staying the course instead of platform-hopping.
  • The Local-First Stack — Why your knowledge should live on your disk, not in a platform — and how CLAUDE.md + Obsidian + Claude Code form the right foundation.
  • The Cognitive Twin — Lou’s DSPy-powered system that mines your daily AI conversations to build a model that replicates your decision-making, not just your writing voice.
  • Code vs. Inference — The underestimated efficiency lever: Scott ran 90 million records down to 128K with Python + Gemini; Lou’s Gears build went from 6 hours to 15 minutes with the same principle.
  • GEARS Wins — Don’s site landed a discovery call from Ghana the day after launch, and multiple members are in the final stretch of integration.

Topic 1: Anthropic’s Compute Squeeze — and the Case for Staying the Course

Lou opened with a frank read on Anthropic’s current situation. When Dario Amodei declined to invest aggressively in compute infrastructure, it was a defensible bet. Then Claude Code became the de facto coding agent for professional developers, the “Department of War fiasco” drove ~2.5 million users from ChatGPT to Claude, and Anthropic found itself with demand it didn’t have the compute to serve.

The downstream effects: GPU capacity locked up by competitors, throttled API access, usage limits reduced without public announcement, and Claude Opus 4.7 shipping with a new tokenizer that generates ~50% more tokens for equivalent output — without a price change or a press release.

The takeaway isn’t to leave Claude. It’s to be deliberate about what you adopt and when. Opus 4.6 and Sonnet 4.6 still do everything this group needs. For tinkering, try everything. For a workflow you depend on, wait for stability.

The deeper principle: don’t platform-hop. Every move carries a switching cost — chats that don’t transfer, knowledge locked in one platform’s memory, habits built around one model’s quirks. The horses are leapfrogging. The one you’re on will catch up.

Deep Dive: Insight - The Platform Loyalty Principle — Don’t Platform-Hop When AI Models Are Leapfrogging

Resources: AI Finance & Models — Founder Reality (Scott Delinger)


Topic 2: The Local-First Knowledge Stack — Your Intelligence Shouldn’t Live in a Platform

The most important architecture decision right now isn’t which AI to use — it’s where your knowledge lives.

Chat history lives in the chat platform. Artifacts from Claude web live in a virtual sandbox that disappears when the session ends. Memory built up in one tool doesn’t transfer to another. Every platform switch resets intelligence you thought you owned.

Lou’s answer: keep everything that matters on your local disk. Use platforms as interfaces to that data rather than custodians of it. The architecture: Claude Code as engine, Obsidian as viewer, a folder-based wiki as the knowledge layer.

The resolver pattern makes it work: CLAUDE.md contains pointers, not intelligence. When you want to do X, check that file on disk. The intelligence travels with the file. If Claude becomes the wrong tool, any AI that can read files inherits your entire knowledge environment immediately.
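A minimal sketch of what such a pointer file might look like (the section names and wiki paths here are invented for illustration, not Lou's actual file):

```markdown
# CLAUDE.md — pointers, not intelligence

## When asked to draft client proposals
Read `wiki/frameworks/proposal-structure.md` before writing anything.

## When asked about pricing
Current pricing logic lives in `wiki/business/pricing-rules.md`.

## Writing voice
Style guide: `wiki/style/voice.md`. Follow it; do not improvise tone.
```

Because the file only points at documents on disk, any tool that can read files can resolve the same pointers; nothing is lost in a platform switch.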

Right surfaces for the right work: Chat for brainstorming and quick artifacts. Cowork when you want outputs saved to your machine. Claude Code for anything you want to maintain and build on over time. The key question before any AI interaction: Can I save this output?

Deep Dive: Insight - Platform as Interface, Not Custodian — The Resolver Pattern for Portable AI Intelligence


Topic 3: The Cognitive Twin — Teaching AI to Decide Like You

The centerpiece of the session: a 6-minute NotebookLM-narrated video summarizing Lou’s cognitive twin architecture, built on DSPy.

Every AI interaction generates high-quality training data — and most people discard it. Every correction you make, every suggestion you override: that’s a decision instance. A before-and-after pair. The cognitive mirror harvests this in two modes:

  • Mine mode — scans conversations for the structural principles behind decisions
  • Harvest mode — captures correction moments as clean before/after training pairs

At 40–50 decision instances, DSPy runs. Instead of writing the perfect prompt, it tests thousands of instruction variants against your decision data and finds the one that reliably reproduces your judgment. The optimizer replaces the guesswork. It makes prompt engineering empirical.
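The optimizer's core loop can be sketched in plain Python, independent of DSPy's actual API. The decision pairs, scoring rule, and instruction variants below are all invented for illustration, and `model_decide` is a trivial stand-in for a real LLM call:

```python
# Toy sketch of empirical prompt optimization: score each candidate
# instruction against recorded decision instances and keep the one
# that best reproduces the recorded judgments.

# Hypothetical decision instances: (situation, the decision you actually made)
decision_data = [
    ("Vendor offers 10% discount for annual prepay", "decline"),
    ("Client asks for scope change mid-project", "renegotiate"),
    ("New tool promises 2x speedup, unproven", "wait"),
]

# Candidate instruction variants the optimizer would test
variants = [
    "Always prefer the cheapest option.",
    "Prefer reversible choices; defer unproven commitments.",
    "Maximize short-term revenue.",
]

def model_decide(instruction: str, situation: str) -> str:
    """Stand-in for an LLM call: a trivial rule keyed off the instruction."""
    if "reversible" in instruction:
        # This variant happens to encode the user's real decision pattern.
        if "unproven" in situation:
            return "wait"
        if "scope change" in situation:
            return "renegotiate"
        return "decline"
    return "accept"

def score(instruction: str) -> float:
    """Fraction of recorded decisions the instruction reproduces."""
    hits = sum(
        model_decide(instruction, situation) == decision
        for situation, decision in decision_data
    )
    return hits / len(decision_data)

best = max(variants, key=score)
```

The point of the sketch: selection is driven by measured agreement with your recorded decisions, not by how good a prompt sounds.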

The three-layer architecture: DSPy modules at the base (expert reflexes), cognitive profile in the middle (how you reason), knowledge vault at the top (your IP and frameworks). Not an AI that sounds like you — one that decides like you.

Scott Delinger’s observation: this is Drucker’s feedback analysis method from Managing Oneself, automated. What Drucker practiced manually, the harvest mode executes on schedule.

The session-to-video pipeline: cognitive mirror skill scans the session → extracts decisions and architecture → writes narrated script → NotebookLM generates audio. A 6-to-8-hour session becomes a 6-minute video in ~30 minutes.

Deep Dive: Insight - The DSPy Cognitive Mirror — Teaching AI to Replicate Your Decision-Making


Topic 4: Code vs. Inference — The Efficiency Lever Most People Ignore

Two parallel stories from the same session, pointing at the same principle.

Scott’s story: 90,900,000 entries, some datasets reaching back to 1993, needing reduction to 128,000 usable records. He described the data structures to Gemini; Gemini wrote the Python. The script ran overnight on a server in Ontario: 10 hours 16 minutes, peaking at 500MB of RAM. “This would not have been possible without the collaboration with Gemini on writing Python.” Output: a portable CSV, feedable into anything.
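The pattern behind Scott's run (stream a huge input row by row, keep only qualifying records, write a portable CSV) fits in a few lines. The field names and filter rule below are invented; his real script was Gemini-authored for his own data structures:

```python
import csv
import io

# Synthetic stand-in for a huge source file. In a real run this would be
# an open file handle streamed row by row, never loaded fully into memory.
source = io.StringIO(
    "id,year,status\n"
    "1,1993,active\n"
    "2,2001,closed\n"
    "3,2015,active\n"
)

out = io.StringIO()
reader = csv.DictReader(source)
writer = csv.DictWriter(out, fieldnames=["id", "year", "status"])
writer.writeheader()

kept = 0
for row in reader:                    # constant memory: one row at a time
    if row["status"] == "active":     # hypothetical reduction rule
        writer.writerow(row)
        kept += 1
```

Because the work is a deterministic filter, it costs CPU time instead of inference tokens, and the output CSV is portable into any downstream tool.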

Lou’s parallel: Gears pipeline — content generation, website generation, ontology processing — all originally routed through Claude inference. Full ingest: 6 to 7 hours. The fix: Claude handles only routing and evaluation; Python handles computational work; Claude spawns the Python batch processor at the right point. Same job: 15 minutes.

The generalizable principle: inference is for judgment. Code is for computation. Scripts can live inside skill definitions — embedded utilities the skill spawns at the right moment, zero API cost for the computation.
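One way a skill can embed and spawn such a utility, sketched with the standard library (the script body and the sample records are invented; a real skill would more likely ship the script as a file next to the skill definition):

```python
import subprocess
import sys

# A deterministic computation embedded as a string. The skill spawns it
# as a subprocess, so no inference tokens are spent on the number-crunching.
batch_script = """
import json, sys
records = json.load(sys.stdin)
total = sum(r["amount"] for r in records)   # pure computation
print(json.dumps({"count": len(records), "total": total}))
"""

result = subprocess.run(
    [sys.executable, "-c", batch_script],   # run with the same interpreter
    input='[{"amount": 10}, {"amount": 32}]',
    capture_output=True,
    text=True,
    check=True,
)
summary = result.stdout.strip()
```

The model's only jobs are to decide when to spawn the utility and to interpret `summary` afterward; the computation itself never touches the API.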

The decision filter: Does this task have a deterministic correct answer? If yes, it belongs in code. If it requires judgment, that’s where inference earns its price. You don’t need to know Python — you need to describe the task clearly enough that AI can write it for you.

Deep Dive: Insight - Code Is for Computation, Inference Is for Judgment


Community Corner

Don’s GEARS site landed a discovery call in under 24 hours. Lou and Don went live Monday. Tuesday, a prospect from Ghana booked a strategy call — found Don through AI-assisted research on graduate studies topics. Not the right fit for Don’s current clientele, but the signal is unambiguous. “Apparently it’s working,” Don said, with appropriate understatement.

Scott ended a three-conversation standoff with two parallel documents. His client had sold a business to a buyer who assumed his engineering background made him internet-savvy. Three frustrated conversations in, Scott sat down with Claude during Game 1 of the Oilers playoffs. The result: two structurally parallel documents, one technical and one near-layman, covering the same domain-transfer process. The buyer’s reply: “Kathy and I like very much what you have created.” Scott’s takeaway: time to switch from hourly to outcome-based billing.

Happy birthday, Kasimir Hedström (April 22). Kasimir is building his LinkedIn presence in the AI space and could use some algorithmic help — likes and reshares matter at early stage. Connect: linkedin.com/in/kasimir-hedstrom

Elizabeth Stief is migrating everything from Claude Chat to Cowork + Code folder system. “Lots of work though.” The switching cost is real — but she’s paying it once.



⚡ Try This Before Next Session

Turn one important chat session from this week into a shareable summary, then run it through NotebookLM. Find a conversation where you built something, made decisions, or did substantial research. Copy the full chat text. Open a new Claude session, run the chat-to-script-extractor command below, and take the resulting script to NotebookLM. The whole exercise takes ~20 minutes. What you usually lose in 48 hours stays accessible indefinitely.


🧰 Commands Extracted This Session

  • dual-audience-explainer — Write the same explanation at two technical levels simultaneously, with parallel editing locked across both versions. From Scott’s business-sale client situation.
  • chat-to-script-extractor — Extract a shareable narrative script and key decisions from a long AI work session, ready for NotebookLM or slides. From Lou’s cognitive twin pipeline.

Open Threads

  • Lou is ~1–2 weeks from accumulating the 40–50 decision instances needed to run the first full DSPy optimization pass. Follow up in a future session on the initial results.
  • Elizabeth’s migration from Chat to Cowork/Code is in progress — her experience will be useful signal for other members considering the same move.
  • Several members approaching the GEARS integration finish line; more launch stories expected in coming sessions.

Next session: Thursday, April 30, 2026


[← Previous: 2026-04-16_Mastermind] · Sessions Index