“The deliverable could be the folder that I actually used to create this thing. You would have my entire experience in your hands.” — Lou


This Week in 30 Seconds

Lou opened with a new delivery model for AIM member assets: stop shipping polished articles, start shipping the entire R&D folder — chat exports, intermediate artifacts, the resulting skill, the teaching block, and a .claude configuration that turns the folder into a queryable agent. The shift collapses prep cost and gives recipients the messy middle, which is where most of the transferable learning lives. The session then went deep on three threads that supported the same architecture: Kasimir’s three-layer memory setup (local Obsidian → Notion → Pinecone), Lou’s walkthrough of how he derived the Modal Subtraction algorithm from Meta’s LCM paper, and a candid back-and-forth on MCP vs API vs Skill as patterns for exposing your IP to clients’ AI. Dirk’s frustration with looping AI iterations produced the cleanest copy-paste artifact of the session — a six-failure-mode audit phrase that Lou keeps loaded as a standing command, plus a complementary persona-bound audit pattern. Closed with model-tier comparisons (Opus 4.7 wins for production work) and Bally’s spin-off idea: bi-weekly content explaining the EU AI Act disclosure requirements for client websites.


1 — Ship the Folder: A New Delivery Model for AIM Assets

Lou opened the call describing a shift in how he plans to deliver member assets. The pattern most experts default to is: spend a week solving a problem, then spend another week packaging the solution as a polished article and a clean skill. Lou’s new pattern collapses both into one step. Every working session he runs already produces chat exports, intermediate artifacts, a final skill, and (via an automated teaching-block skill) a step-by-step narrative of what happened. The new move is to ship the whole folder — chats and all — with a .claude configuration that turns it into an interrogable knowledge object.

Scott named the value clearly: “for me, the messy middle is really part of the learning process right now.” A polished article hides the holes the author fell into. A shipped folder lets the recipient ask Claude any question about the actual journey — what was tried first, what was rejected, why the final approach won. Three different recipients can extract three different lessons from the same folder, because the folder is dense enough to support multiple readings.

Don Back picked up the implication for personal knowledge work: this same pattern solves the “chat-sprawl” problem most members face — where one inquiry forks into 4–6 chats that no one can later reconcile. Roll the whole exploration into a folder, attach a .claude, and the folder becomes navigable forever.

Deep Dive: Insight - Ship the Folder, Not the Polished Article — Your R&D Trail Is the Deliverable


2 — The Hot-Cache, Wiki, Semantic Memory Stack

Kasimir walked through his memory architecture: three skills, one per layer, each running at end-of-chat or near the context limit. Layer 1 lives in his local Obsidian vault. Layer 2 pushes selected content to Notion. Layer 3 pushes to Pinecone for semantic retrieval. “Three skills, and everything’s taken care of.”

Lou expanded this into the general pattern. A single store cannot serve all access patterns. Recent context wants to be grep-fast (hot cache). Medium-term knowledge wants to be link-traversable (wiki). Long-term archive wants to be semantically searchable (vector store). He named the wall he is hitting in his own Karpathy-style wiki — at ~800–900 entries, the in-line link-graph updates are becoming slow enough that he is now considering moving the knowledge graph off the document frontmatter and into a meta-layer that just points at documents. Pinecone, he noted, has just released a product that seems to fill exactly this gap (wiki-style retrieval over a semantic backend) — worth watching but not yet evaluated.

The non-obvious move he flagged: the hot cache is not just “recent files” — it is the file Claude reaches first, configured via the project’s CLAUDE.md so the agent knows where to look without being asked. Anything Claude does not find there, it escalates to the wiki. Anything not in the wiki, it escalates to the semantic archive. The escalation order is the architecture.
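The escalation order described above can be sketched in a few lines. This is a minimal illustration, not Lou's or Kasimir's actual setup: the three dictionaries are hypothetical stand-ins for a hot-cache file, a wiki, and a vector store, and the function name make_escalating_lookup is invented for the example.

```python
from typing import Callable, Optional

# A lookup takes a query and returns an answer, or None if the layer has nothing.
Lookup = Callable[[str], Optional[str]]


def make_escalating_lookup(
    layers: list[tuple[str, Lookup]],
) -> Callable[[str], tuple[Optional[str], Optional[str]]]:
    """Return a function that tries each layer in order and stops at the first hit."""

    def lookup(query: str) -> tuple[Optional[str], Optional[str]]:
        for name, layer in layers:
            answer = layer(query)
            if answer is not None:
                return name, answer  # first layer that answers wins
        return None, None  # nothing found in any layer

    return lookup


# Hypothetical stand-ins: plain dicts in place of a hot-cache file, a wiki, and a vector store.
hot_cache = {"current project": "notes on the current project"}
wiki = {"memory stack": "hot cache, wiki, semantic archive"}
archive = {"old client call": "archived call summary"}

lookup = make_escalating_lookup([
    ("hot_cache", hot_cache.get),
    ("wiki", wiki.get),
    ("semantic_archive", archive.get),
])
```

The order of the list is the only configuration: swapping two entries changes which store is consulted first, which is the sense in which the escalation order is the architecture.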

Deep Dive: Insight - The Hot Cache, Wiki, Semantic Memory Stack for AI-First Workflows


3 — Dunbar’s Number for Wikis

Scott surfaced an analogy that the group lingered on: is there a Dunbar’s number for wiki entries — a size beyond which the structure stops cohering? Don Back gave the org-design parallel: in business, when a unit grows past ~80–90 people, you split it and introduce structure. Same diagnostic for wikis. When the wiki gets too big to communicate with itself, the answer is not “find a better tool” — it is to split into smaller wikis joined by structure (cross-wiki links, a meta-index, or thematic grouping). Lou agreed in principle and noted this is mostly a function of the rules you give the AI that maintains the wiki — you can tell it to split when categories exceed N entries, in the same way you would tell an org to split when a team exceeds N people.
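A split rule of the kind Lou described ("split when categories exceed N entries") might look like the sketch below. The threshold value and the function name categories_to_split are assumptions for illustration; the session did not settle on an actual number.

```python
from collections import Counter


def categories_to_split(entry_categories: list[str], max_entries: int) -> list[str]:
    """Return categories whose entry count exceeds max_entries, largest first."""
    counts = Counter(entry_categories)
    oversized = [(cat, n) for cat, n in counts.items() if n > max_entries]
    # Sort by size descending, then by name, so the output is deterministic.
    oversized.sort(key=lambda item: (-item[1], item[0]))
    return [cat for cat, _ in oversized]
```

Each input item is the category of one wiki entry, so the maintaining agent could run a check like this after every ingest and propose a split only for the categories returned.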

No standalone insight created — this is structural advice that sits inside the broader hot-cache/wiki/semantic stack. Flagged for review in case it recurs and earns its own page.


4 — Modal Subtraction: From Meta’s LCM Paper to a Standalone Skill

Lou used the second half of the call to walk through the multi-day exploration that produced his Sakana LCM / OutsideTheModal skill. The thread started with Meta’s Large Concept Model paper — the claim that “LLMs predict words; LCMs predict ideas” — and ended with a clean three-stage algorithm that Lou now ships as a skill.

The exploration arc: Lou wanted “an idea generator.” He fed the paper to Claude. Claude pushed back: concept models are not yet competitive with frontier models, so treat the architecture as inspiration, not implementation. Lou asked Claude to read the LCM repo. Claude confirmed: research scaffolding, not deployable. Then came the pivotal move: Claude proposed extracting just one element (the SONAR embedding substrate) and overlaying it on Lou’s existing Sakana skill. From there the algorithm crystallised: generate ~20 candidates, surface what an LLM would typically say, subtract anything that overlaps the modal response, then run the survivors through a skeptic gauntlet to confirm they are defensible rather than just rare.
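The subtract-then-test steps can be sketched as follows. This is not the Sakana LCM skill itself: candidate generation and the modal responses are assumed to come from an LLM and are passed in as plain lists, word-set (Jaccard) overlap is swapped in for embedding similarity, and the names modal_subtraction, overlap, and max_overlap are hypothetical.

```python
import re
from typing import Callable


def _tokens(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def overlap(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two texts (0.0 to 1.0)."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def modal_subtraction(
    candidates: list[str],
    modal_responses: list[str],
    skeptic_checks: list[Callable[[str], bool]],
    max_overlap: float = 0.5,
) -> list[str]:
    """Drop candidates that resemble a modal response, then keep those passing every check."""
    off_modal = [
        c for c in candidates
        if all(overlap(c, m) < max_overlap for m in modal_responses)
    ]
    return [c for c in off_modal if all(check(c) for check in skeptic_checks)]
```

In practice each skeptic check would likely be another model call asking whether the idea is defensible; here it is any function that returns True or False, which keeps the sketch runnable without an API.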

What emerged is the first explicit median-strip step in the vault’s anti-modal toolkit. Insight - Latent Terrain Cartography — Navigating Off-Modal AI Responses to Find Non-Obvious Ideas established the discipline of leaving the modal on purpose; this session turned it into a runnable algorithm. The skill produced two artifacts — Sakana LCM (the algorithm itself) and Expert Mind (a five-component business architecture audit that uses the algorithm) — and both ship in the session’s R&D folder along with the chat exports and teaching block.

Deep Dive: Insight - Modal Subtraction — Generate, Strip the Median, Skeptic-Test the Outliers


5 — MCP vs API vs Skill: Choosing How to Expose Your IP

Dirk asked whether AIM members should be packaging their offerings as MCP servers that clients connect to their own Claude. Lou’s answer was the full decision framework. APIs are fastest for single-function transactions and best at hiding IP. An MCP server is a multi-service connector: slower per call than a raw API but vastly more convenient when you have many related services and a client who will use them often. Skills ship your workflow, not just your responses, and are readable by the client, which is sometimes a feature (trust signal) and sometimes a liability (methodology gets copied).

The mature pattern, Lou noted, is all three at once: an API holding the core IP, an MCP wrapping the API as a connector, and a Skill wrapping both in a workflow non-technical clients can install. Gears uses skill-over-API; Janssen uses MCP. The right choice is not about technology — it is about audience sophistication and IP-protection requirement.

Lou added the forward-looking framing: design your interfaces for agents, not humans. The consumers of your endpoints in 12–18 months will mostly be other agents. Build well-typed contracts, not pretty UIs.
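One way to read "well-typed contracts" is an endpoint whose inputs and outputs are explicit types, with validation errors an agent can parse. The sketch below is an assumption about what that could look like: AuditRequest, AuditResponse, and parse_request are hypothetical names, not part of any service discussed in the session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRequest:
    """Hypothetical input contract: what a calling agent must send."""
    document_text: str
    criteria: tuple[str, ...]


@dataclass(frozen=True)
class AuditResponse:
    """Hypothetical output contract: what the calling agent gets back."""
    findings: tuple[str, ...]
    passed: bool


def parse_request(payload: dict) -> AuditRequest:
    """Validate an untyped payload; raise ValueError naming the offending field."""
    text = payload.get("document_text")
    criteria = payload.get("criteria")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("document_text: expected non-empty string")
    if not isinstance(criteria, list) or not all(isinstance(c, str) for c in criteria):
        raise ValueError("criteria: expected list of strings")
    return AuditRequest(document_text=text, criteria=tuple(criteria))
```

The error messages lead with the field name so that a calling agent can repair its own request without a human reading the response.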

Deep Dive: Insight - MCP vs API vs Skill — Three Patterns for Exposing Your Knowledge to Clients’ AI


6 — Context Degradation, Handoffs, and the Sunk-Cost Trap

Dirk surfaced a frustration that several members shared: the long-conversation spiral — Claude Chat → Cowork → Claude Code, each pass producing new “improvements,” none ever finishing. Lou’s three-part response:

Models degrade past ~50% context. Reliability drops noticeably past the halfway point. Either use a hook/timer to auto-compact at 50%, or use the handoff skill (in the AIM GitHub), which summarises current context to a file, lets you start a fresh chat, and then reloads from the file. Cleaner than /compact.

Recognise the sunk-cost trap. When you are bent on making one chat work, the AI happily compounds your investment. Sometimes the highest-leverage move is to throw the chat away and start clean. The signal: if the conversation has gone in circles for an hour, it is no longer producing.

Use the new /goal slash command. Claude now ships with /goal, which loops iteratively until the goal is met or a budget is exhausted. The recommended pattern: have Claude first draft a goal specification (criteria for success), then feed that into /goal with an explicit turn limit. Without a budget, it will exhaust your tokens. With one, it produces tight, focused work.
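The 50% rule and the handoff idea from the first point can be sketched as two small functions. This is not the handoff skill itself, whose internals were not shown in the session: the token counts are assumed to be supplied by whatever tool you use, and should_hand_off and render_handoff are hypothetical names.

```python
def should_hand_off(tokens_used: int, context_window: int, threshold: float = 0.5) -> bool:
    """True once the conversation has consumed the given fraction of the context window."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return tokens_used / context_window >= threshold


def render_handoff(goal: str, decisions: list[str], open_items: list[str]) -> str:
    """Build a plain-text handoff note that a fresh chat can be pointed at."""
    lines = ["HANDOFF", f"Goal: {goal}", "Decisions so far:"]
    lines += [f"- {d}" for d in decisions]
    lines.append("Still open:")
    lines += [f"- {o}" for o in open_items]
    return "\n".join(lines) + "\n"
```

Saving the rendered note to a file and opening a new chat with it mirrors the summarise, restart, reload sequence described above.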

Kasimir added a complementary pattern: an ROI-bounded improvement loop. Tell the AI “iterate X rounds, or until ROI drops below N%.” It naturally halts when marginal returns flatten — which is often after fewer rounds than you would have set manually. He also recommended the open-source obra/superpowers skill repo as a spec-driven step-by-step development pattern worth trying. (Lou has not installed it; he has a similar pattern embedded in his own pipeline.)
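Kasimir's stopping rule can be expressed as a loop with two exits. The sketch assumes you can score a draft numerically, which in practice might be a rubric applied by the model; improve, score, and roi_bounded_loop are hypothetical names, and "ROI" is approximated here as the percentage gain in score per round.

```python
from typing import Callable


def roi_bounded_loop(
    improve: Callable[[str], str],
    score: Callable[[str], float],
    draft: str,
    max_rounds: int,
    min_gain_pct: float,
) -> tuple[str, int]:
    """Iterate until max_rounds is reached or the relative gain falls below min_gain_pct.

    Returns the best draft and the number of rounds that were kept.
    """
    best, best_score = draft, score(draft)
    kept = 0
    for _ in range(max_rounds):
        candidate = improve(best)
        candidate_score = score(candidate)
        # Relative gain in percent; guard against a zero baseline.
        baseline = abs(best_score) or 1.0
        gain_pct = (candidate_score - best_score) / baseline * 100
        if gain_pct < min_gain_pct:
            break  # marginal return has flattened, so stop early
        best, best_score = candidate, candidate_score
        kept += 1
    return best, kept
```

Because the gain check runs every round, the loop usually ends well before max_rounds, which matches the observation that it halts sooner than a manually chosen round count.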


7 — The Audit Pair: Universal + Avatar

Dirk’s iteration frustration produced the cleanest copy-paste artifact of the session. Lou shared the exact phrase he has programmed into his Claude:

“Audit this for errors, omissions, oversights, duplications, contradictions, and — if applicable — areas of improvement that meet our criteria.”

The phrase replaces improve (which is additive and produces longer, weaker work) with six named failure modes the model must actively search for. The “meet our criteria” clause on the improvements pass is load-bearing — it forces the model to check against an existing standard rather than inventing new ones.

He paired this with a second move: audit from the avatar’s eyes. Give Claude a constrained persona (solo coach, two years in, 30 minutes/day, has been burned by hype) and ask it to review the work as that person would. What’s useful? What’s motivating? What would they skip? What would they screenshot? The two audits cover orthogonal failure modes — Universal catches internal problems; Avatar catches fit problems with the actual reader.
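The two audits can be kept as a pair of prompt builders. This is a sketch under assumptions: the Avatar fields and function names are invented for illustration, and the universal prompt is assembled from the named failure modes rather than stored as Lou's exact string.

```python
from dataclasses import dataclass

FAILURE_MODES = ("errors", "omissions", "oversights", "duplications", "contradictions")
AVATAR_QUESTIONS = (
    "What is useful?",
    "What is motivating?",
    "What would you skip?",
    "What would you screenshot?",
)


@dataclass(frozen=True)
class Avatar:
    """Hypothetical persona fields, modelled on the example in the text."""
    role: str
    experience: str
    time_budget: str
    wariness: str


def universal_audit_prompt(work: str) -> str:
    """Prompt that names each failure mode and ties improvements to existing criteria."""
    modes = ", ".join(FAILURE_MODES)
    return (
        f"Audit this for {modes}, and, if applicable, "
        f"areas of improvement that meet our criteria.\n\n{work}"
    )


def avatar_audit_prompt(avatar: Avatar, work: str) -> str:
    """Prompt that asks for a review from the constrained persona's point of view."""
    persona = f"{avatar.role}, {avatar.experience}, {avatar.time_budget}, {avatar.wariness}"
    questions = "\n".join(f"- {q}" for q in AVATAR_QUESTIONS)
    return f"Review the work below as this person would: {persona}.\n{questions}\n\n{work}"
```

Running both prompts against the same draft gives one report on internal problems and one on reader fit, which keeps the two kinds of finding from blurring together.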

Deep Dive: Insight - The Universal Audit Phrase — Errors, Omissions, Oversights, Duplications, Contradictions
Deep Dive: Insight - Audit From the Avatar’s Eyes — Persona-Bound Quality Review


8 — AI for Tax & Multi-Year Workflow Inheritance

Joanna’s question on using AI for tax prep prompted a substantive aside from Lou. His experience: extraction from bank statements is excellent; categorisation needs to be taught. The workflow he settled on: feed the AI the prior year’s general ledger so it can infer category-to-vendor mappings from real history, then have it produce single-entry accrual records (since QuickBooks infers the contra-side). The non-obvious step: explicitly tell the AI to save what it learned to a file. Otherwise next year’s run loses everything. Lou’s setup keeps the learning notes and the JSON category-mapping DB at the accounting/ parent folder level, with year-specific subfolders inheriting the rules. Inherit-down CLAUDE.md as a multi-year discipline.
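The inherit-down arrangement for the category mapping can be sketched with plain JSON. The file layout is not shown here, so the functions take JSON text rather than paths; the rule that a year-specific entry overrides the parent entry is an assumption, and effective_mapping and learn are hypothetical names.

```python
import json


def effective_mapping(parent_json: str, year_json: str = "{}") -> dict[str, str]:
    """Merge vendor-to-category rules: year-specific entries override parent-level ones."""
    merged = dict(json.loads(parent_json))
    merged.update(json.loads(year_json))
    return merged


def learn(mapping: dict[str, str], vendor: str, category: str) -> str:
    """Record a newly learned rule and return the JSON text to save at the parent level."""
    updated = {**mapping, vendor: category}
    return json.dumps(updated, indent=2, sort_keys=True)
```

Writing the output of learn back to the parent folder is the step that keeps next year's run from starting cold.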

No standalone insight page — flagged for review as a candidate; recurring use across multiple domains (tax, client onboarding, content production) may earn its own page later.


9 — Bot Intrusion: The Coming Decade of “Is That Really You?”

At the start of the call, an unidentified bot joined and refused to identify itself. Lou kicked it. The reflection that followed landed cleanly: “this is going to be the funkiest time of all — when we don’t know if we’re talking to real people or bots. Remember the early mobile-phone decade of ‘can you hear me, can you hear me?’ This is going to be a decade of ‘is that really you, is that really you?’” No insight extracted but worth marking — the impromptu first-line authentication problem is now part of running meetings.


10 — Bally’s EU AI Act Bi-Weekly

Bally announced she is starting a bi-weekly content series on what businesses must declare on their websites under the EU AI Act, which comes into force in a few weeks. She has been doing dense research and building prompts to walk businesses through the disclosure requirements. The session shape — ship-the-folder, audit-pair, ROI-loops — fits her use case directly. Flagged as an asset she could spin into a member-facing skill.


11 — Model Tier Discussion: Opus 4.7 vs Sonnet vs Haiku

Closing thread. Lou downgraded from Max to Pro for a month to feel the limits — confirmed Max-100 is ~5x the Pro plan and Max-200 is ~20x. His preference: Opus 4.7 for anything that produces member-facing deliverables. Don uses Opus for creative/thinking and Sonnet for routine. Kasimir uses Opus for planning, Sonnet for execution, and notes that Claude’s adaptive mode now downgrades automatically when Opus is overkill — he watches the indicator switch between Opus and Sonnet mid-task. Lou’s punchline: “put it on Opus 4.7 and forget about it.” For $100–200/month, the smartest model on the planet is not a hard expense to justify.


Community Corner

  • Bally announced her EU AI Act bi-weekly content series.
  • Don Back offered to share his Rod Brooks / Roomba inventor dinner story in a future session (the reference came from Scott’s joke about Don needing a “Roomba.md agent” to corral his chats).
  • Donald Kihenja kept the chat lively with multi-language asides — Swahili (~30% Arabic vocabulary), Spanish, and the Alkamiya / Alchemy connection prompted by Don’s “lead into gold” reaction.
  • Scott had to drop early for another meeting.
  • Joanna observed without speaking this week.


Tools & Assets

This session shipped the first instance of the folder-as-deliverable pattern — Lou’s LCM-vs-LLM/ working folder, containing:

  • Chat exports from three sessions (the original LCM exploration, the skill-design pass, the teaching-block pass)
  • Two skills: Sakana LCM (the modal subtraction algorithm) and Expert Mind (a five-component business architecture audit that uses Sakana LCM internally)
  • A summary article on LCMs vs LLMs
  • A teaching-block walkthrough of the entire process
  • A README and .claude configuration for navigation

The folder will be zipped and posted to the AIM GitHub once Lou finishes the polish pass over the weekend.

No new vault-level commands or skills extracted this session — the prompts and patterns surfaced (Universal Audit Phrase, Avatar Audit, ROI-bounded loops) belong in CLAUDE.md as standing constraints rather than as new commands, since they are already covered by existing commands like /plan_audit, /change_audit, and /skeptic.


Try This Before Next Session

  1. Paste the Universal Audit Phrase into your CLAUDE.md today. Use it on your next AI-generated draft and notice whether the work tightens or just lengthens.
  2. Write one constrained avatar block in your ~/voice/ or .claude/avatars/ folder. Make it tight, specific, and resource-aware — not aspirational.
  3. Pick one chat that has gone in circles for >1 hour and start it over. Use the handoff skill or summarise into a fresh chat. Notice how much momentum you reclaim.
  4. Audit your current knowledge base for which layer it actually serves. If it is one tool trying to be hot cache, wiki, and semantic archive at the same time, sketch the split.

Open Threads

  • The Pinecone product Lou flagged (wiki-style retrieval over semantic backend) — worth a deeper evaluation; could obsolete the Karpathy-style LKB pattern if it works.
  • The “Dunbar’s number for wikis” question — what is the actual size threshold for splitting? Don Back’s instinct (80–90 in orgs) needs an analogue for knowledge bases.
  • The MCP-on-client’s-Claude pattern that Dirk raised — concrete examples of AIM members who have shipped one would inform the next discussion.
  • Bally’s EU AI Act content series — could it be spun into an installable skill that walks any business through their disclosure obligations?

← Previous: 2026-05-07_Mastermind

Derived Artifacts

  • mastermind-triage (/mastermind-triage — interactive triage command for flagged candidates in raw/review/; proposed by Lou during the post-ingest review of this session’s flagged items)