2025-07-17 AI Mastermind
Table of Contents
- Insight - Local RAG Plus Remote Inference - The Data Privacy Architecture for Coaches
- Insight - Prompt Length and Latent Space - Short Prompts Explore, Long Prompts Execute
Session Overview
Lou opened by sharing his discovery of Kimi, a Chinese AI model whose notably natural writing style surprised him even without customization. The group discussed how rapidly the AI tool landscape keeps shifting (Runway, Heygen, Synthesia, Flux, Kling, and Fal on the video side alone); the challenge is no longer finding good tools but choosing which ones are worth the time to learn.
The session’s technical centerpiece was Lou’s live demonstration of Open Web UI running a side-by-side comparison: Llama running locally on an M4 Mac Mini versus the same model running inference on Groq. The speed difference was stark and led to a clear architectural recommendation: for coaches and consultants who want data privacy without sacrificing AI performance, the local RAG + remote inference model is practical, affordable, and now accessible without deep technical expertise.
The second major thread was Dirk’s breakthrough moment: after weeks of struggling with company research prompts, he tried framing the AI as “an investor who hired Bain Capital to research companies for their portfolio” and received dramatically better results. Lou used this as the hook for an extended explanation of latent space navigation — how different role assignments activate different knowledge clusters in the model, and why prompt specificity is a filtering decision rather than a length decision.
Don Back also shared his evolving content workflow, which sparked a live coaching moment from Lou: take your before/after editing pairs and train the model on your voice, so the first draft arrives closer to publication-ready rather than requiring extensive revision.
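The "local RAG + remote inference" pattern from Lou's demo can be sketched in a few lines. This is an illustrative sketch, not Lou's actual setup: a toy keyword retriever stands in for a real local vector store, and the model name is an assumption; the endpoint shown is Groq's OpenAI-compatible chat completions URL. The key property is visible in the code: the document store lives in local memory, and only the few retrieved snippets are placed in the request body.

```python
# Sketch of local RAG + remote inference: documents stay local, only
# retrieved context is sent to the remote endpoint for fast inference.
import json
import urllib.request

GROQ_API_URL = "https://api.groq.com/openai/v1/chat/completions"  # OpenAI-compatible

# The "RAG database": these documents never leave this process.
LOCAL_DOCS = [
    "Client A: quarterly goals focus on leadership pipeline.",
    "Client B: prefers asynchronous coaching via recorded video.",
    "Internal note: renewal conversations start 60 days before expiry.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy local retrieval: rank docs by words shared with the query
    (a real setup would use a local embedding index instead)."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_request(query: str, context: list[str], api_key: str) -> urllib.request.Request:
    """Only the retrieved snippets (not the whole database) go over the wire."""
    payload = {
        "model": "llama-3.1-8b-instant",  # illustrative model name
        "messages": [
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user",
             "content": "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query},
        ],
    }
    return urllib.request.Request(
        GROQ_API_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

context = retrieve("When do renewal conversations start?", LOCAL_DOCS)
request = build_request("When do renewal conversations start?", context, "YOUR_KEY")
# urllib.request.urlopen(request) would send only `context`, nothing else.
```

In Open Web UI this wiring is handled by the interface itself; the sketch just makes the privacy boundary explicit: the provider sees the prompt plus a handful of snippets, processed ephemerally, never the underlying database.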
High-Signal Moments
- Lou’s live demo: Groq-accelerated inference versus a local M4 Mac Mini; Groq responded in seconds while the local model was still generating minutes later
- The architecture insight: data stays local (RAG database on your machine), only context is sent for inference — “ephemeral” processing with no storage at the provider end
- “Put it on a Digital Ocean droplet for $5-6/month and offer your clone to your clients” — the commercialization angle for the local AI stack
- Dirk’s “investor framing” breakthrough — role assignment changed his company research outputs dramatically
- Lou’s explanation of latent space: a short vague prompt searches broadly; a specific role assignment narrows the search to the relevant expertise cluster
- The six-section prompt framework: instructions, goals, context, constraints, examples, role — each adds a filtering layer
- “Write a haiku” versus “write a poem” — the same content request with radically different latent space footprint
- Anthropic’s system prompt is apparently 24 pages — the benchmark for what production-level prompt engineering looks like
- Don’s content workflow and Lou’s suggestion to train the model using 10-12 before/after editing pairs to generate a personal voice profile
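The six-section framework above can be captured as a simple template. This is a sketch of the idea, not a transcript of Lou's exact wording: the section names follow the session's list, and the example values (including the Bain Capital investor role from Dirk's breakthrough) are paraphrased for illustration. Each non-empty section adds another filtering layer over the model's latent space.

```python
# Sketch of the six-section prompt framework: each section narrows
# the model's search space before it generates anything.
def build_prompt(role, instructions, goals, context, constraints, examples):
    """Assemble a prompt from the six sections; empty sections are skipped."""
    sections = [
        ("Role", role),                  # activates the relevant expertise cluster
        ("Instructions", instructions),  # what to do
        ("Goals", goals),                # what "done" looks like
        ("Context", context),            # the situation at hand
        ("Constraints", constraints),    # what to avoid or require
        ("Examples", examples),          # the shape of a good answer
    ]
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections if body)

prompt = build_prompt(
    role="You are an investor who hired Bain Capital to research companies for a portfolio.",
    instructions="Research the company below and surface non-obvious risks.",
    goals="Produce a one-page brief suitable for an investment committee.",
    context="Company: a mid-market SaaS vendor in the HR space.",
    constraints="Flag speculation explicitly; do not pad with generic advice.",
    examples="Risk: revenue concentration, e.g. top 3 clients account for most ARR.",
)
```

The "write a haiku" versus "write a poem" contrast works the same way: the more specific request is not longer for its own sake, it simply excludes more of the latent space before generation starts.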
Open Questions
- How should coaches who work with highly sensitive client data (executives, therapy-adjacent work) evaluate their current AI tool data practices?
- As Open Web UI and similar tools mature, what’s the realistic path to offering a “personalized AI clone” as a coaching product extension?
- What is the right cadence for updating a personal voice profile as your writing evolves?
- At what prompt complexity does it make sense to invest in building a formal meta prompt versus continuing to prompt interactively?
- How does the “role assignment” principle extend to complex coaching conversations — what role should you assign AI when doing client strategy work?
Suggested Follow-Through
- Explore Open Web UI installation (or commission it via a hire-and-record approach): get a local RAG instance running with a Groq API connection
- Audit what client-sensitive data is currently flowing through commercial AI tools
- Apply the role assignment principle to your most-used AI workflows this week — test one short and one role-assigned prompt on the same topic and compare outputs
- Don: complete the voice profile training experiment (10-12 before/after pairs → style guide → test on new draft) and report back
- Explore Kimi (kimi.ai) for writing tasks — Lou’s recommendation for its unusually natural writing style
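Don's voice-profile experiment can be scaffolded with a small helper that formats before/after editing pairs into a single analysis prompt. This is a hedged sketch of the idea Lou suggested, not a tested recipe: the pair data and the analysis wording are invented for illustration, and in practice you would feed 10-12 real pairs and iterate on the resulting style guide.

```python
# Sketch of the voice-profile experiment: format draft/edit pairs into
# one prompt asking a model to derive a reusable personal style guide.
def voice_profile_prompt(pairs: list[tuple[str, str]]) -> str:
    """Turn (draft, edited) pairs into an analysis prompt for a model."""
    blocks = []
    for i, (draft, edited) in enumerate(pairs, start=1):
        blocks.append(f"Pair {i}\nDRAFT: {draft}\nEDITED: {edited}")
    return (
        "Compare each DRAFT with its EDITED version and describe the editing "
        "patterns as a style guide I can reuse: tone, sentence length, word "
        "choices, and structures I tend to remove or add.\n\n"
        + "\n\n".join(blocks)
    )

# Illustrative pairs only; real use calls for 10-12 of your own edits.
pairs = [
    ("Leveraging synergies is key to success.", "Work together on what matters."),
    ("It is important to note that results vary.", "Results vary."),
]
prompt = voice_profile_prompt(pairs)
```

The resulting style guide then becomes a reusable system prompt, so first drafts arrive closer to publication-ready, which is the payoff Lou described in Don's coaching moment.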
Additional Resources
Links & Tools Shared in Chat
- KlingAI — video generation tool, mentioned alongside Heygen and Runway as part of a discussion on AI video realism (mentioned by Bally Binning)
- OpenWebUI — local AI interface that can run models like Manus locally; Donald used it with Manus and found it excellent (mentioned by Donald Kihenja)
- Groq — accelerated AI inference platform; discussed in context of Lou’s live speed comparison with local M4 Mac Mini inference (mentioned by Donald Kihenja)
- Sidekick browser — Arc/Dia alternative for non-Mac users (mentioned by Donald Kihenja)
- Don Back’s dance video example — YouTube link to an AI-generated Fred Astaire-style deepfake video illustrating realistic AI video quality — https://www.youtube.com/watch?v=ELJhKli-dmk (shared by Don Back)
Ideas from Chat
- Julia McCoy and Julian Goldie — YouTube creators cited by Donald as examples of creators now using AI avatar versions of themselves with disclosure; relevant for members considering AI video as a content format (mentioned by Donald Kihenja)
- Donald Kihenja: “The surprising thing about AI is the new and unexpected need to catch up on catching up” — a precise observation about the compounding nature of AI learning debt; each week of not keeping up makes the next catch-up harder, not just incrementally but recursively
- Donald on “the infinite prompt” giving amazing results — reinforcing that the framework introduced in June 5 session is delivering real-world results for members actively using it
- Don Back on Dia vs. Arc: he replaced Arc with Dia but hasn’t used Dia much yet; both Lou and Don keep both browsers installed, a useful data point on the transition curve for members on the waitlist
Derived Artifacts
- content-pipeline (Content Pipeline — Don Back’s LinkedIn content machine)
- investor-lens (Investor Lens — role constraints and non-obvious analysis)
- voice-profile-builder (Voice Profile Builder — building voice profiles from draft/edit deltas)