2025-07-17 AI Mastermind
Table of Contents
- Insight - Local RAG Plus Remote Inference - The Data Privacy Architecture for Coaches
- Insight - Prompt Length and Latent Space - Short Prompts Explore, Long Prompts Execute
Session Overview
Lou opened by sharing his discovery of Kimi, a Chinese AI model whose notably natural writing style surprised him even without customization. The group discussed how rapidly the AI tool landscape keeps shifting (Runway, Heygen, Synthesia, Flux, Kling, and Fal on the video side alone); the challenge is no longer finding good tools but choosing which ones are worth the time to learn.
The session’s technical centerpiece was Lou’s live demonstration of Open Web UI running a side-by-side comparison: Llama running locally on an M4 Mac Mini versus the same model running inference on Groq. The speed difference was stark and led to a clear architectural recommendation: for coaches and consultants who want data privacy without sacrificing AI performance, the local RAG + remote inference model is practical, affordable, and now accessible without deep technical expertise.
The second major thread was Dirk’s breakthrough moment: after weeks of struggling with company research prompts, he tried framing the AI as “an investor who hired Bain Capital to research companies for their portfolio” and received dramatically better results. Lou used this as the hook for an extended explanation of latent space navigation — how different role assignments activate different knowledge clusters in the model, and why prompt specificity is a filtering decision rather than a length decision.
Don Back also shared his evolving content workflow, which sparked a live coaching moment from Lou: take your before/after editing pairs and train the model on your voice, so the first draft arrives closer to publication-ready rather than requiring extensive revision.
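The "local RAG + remote inference" pattern from Lou's demo can be sketched in a few lines. This is an illustrative sketch, not Lou's actual setup: a toy keyword retriever stands in for a real local vector store, and the model name is an assumption; the endpoint shown is Groq's OpenAI-compatible chat completions URL. The key property is visible in the code: the document store lives in local memory, and only the few retrieved snippets are placed in the request body.

```python
# Sketch of local RAG + remote inference: documents stay local, only
# retrieved context is sent to the remote endpoint for fast inference.
import json
import urllib.request

GROQ_API_URL = "https://api.groq.com/openai/v1/chat/completions"  # OpenAI-compatible

# The "RAG database": these documents never leave this process.
LOCAL_DOCS = [
    "Client A: quarterly goals focus on leadership pipeline.",
    "Client B: prefers asynchronous coaching via recorded video.",
    "Internal note: renewal conversations start 60 days before expiry.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy local retrieval: rank docs by words shared with the query
    (a real setup would use a local embedding index instead)."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_request(query: str, context: list[str], api_key: str) -> urllib.request.Request:
    """Only the retrieved snippets (not the whole database) go over the wire."""
    payload = {
        "model": "llama-3.1-8b-instant",  # illustrative model name
        "messages": [
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user",
             "content": "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query},
        ],
    }
    return urllib.request.Request(
        GROQ_API_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

context = retrieve("When do renewal conversations start?", LOCAL_DOCS)
request = build_request("When do renewal conversations start?", context, "YOUR_KEY")
# urllib.request.urlopen(request) would send only `context`, nothing else.
```

In Open Web UI this wiring is handled by the interface itself; the sketch just makes the privacy boundary explicit: the provider sees the prompt plus a handful of snippets, processed ephemerally, never the underlying database.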
High-Signal Moments
- Lou’s live demo: Groq-accelerated inference versus a local M4 Mac Mini; Groq responded in seconds while the local model was still generating minutes later
- The architecture insight: data stays local (RAG database on your machine), only context is sent for inference — “ephemeral” processing with no storage at the provider end
- “Put it on a Digital Ocean droplet for $5-6/month and offer your clone to your clients” — the commercialization angle for the local AI stack
- Dirk’s “investor framing” breakthrough — role assignment changed his company research outputs dramatically
- Lou’s explanation of latent space: a short vague prompt searches broadly; a specific role assignment narrows the search to the relevant expertise cluster
- The six-section prompt framework: instructions, goals, context, constraints, examples, role — each adds a filtering layer
- “Write a haiku” versus “write a poem” — the same content request with radically different latent space footprint
- Anthropic’s system prompt is apparently 24 pages — the benchmark for what production-level prompt engineering looks like
- Don’s content workflow and Lou’s suggestion to train the model using 10-12 before/after editing pairs to generate a personal voice profile
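The six-section framework above can be captured as a simple template. This is a sketch of the idea, not a transcript of Lou's exact wording: the section names follow the session's list, and the example values (including the Bain Capital investor role from Dirk's breakthrough) are paraphrased for illustration. Each non-empty section adds another filtering layer over the model's latent space.

```python
# Sketch of the six-section prompt framework: each section narrows
# the model's search space before it generates anything.
def build_prompt(role, instructions, goals, context, constraints, examples):
    """Assemble a prompt from the six sections; empty sections are skipped."""
    sections = [
        ("Role", role),                  # activates the relevant expertise cluster
        ("Instructions", instructions),  # what to do
        ("Goals", goals),                # what "done" looks like
        ("Context", context),            # the situation at hand
        ("Constraints", constraints),    # what to avoid or require
        ("Examples", examples),          # the shape of a good answer
    ]
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections if body)

prompt = build_prompt(
    role="You are an investor who hired Bain Capital to research companies for a portfolio.",
    instructions="Research the company below and surface non-obvious risks.",
    goals="Produce a one-page brief suitable for an investment committee.",
    context="Company: a mid-market SaaS vendor in the HR space.",
    constraints="Flag speculation explicitly; do not pad with generic advice.",
    examples="Risk: revenue concentration, e.g. top 3 clients account for most ARR.",
)
```

The "write a haiku" versus "write a poem" contrast works the same way: the more specific request is not longer for its own sake, it simply excludes more of the latent space before generation starts.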
Open Questions
- How should coaches who work with highly sensitive client data (executives, therapy-adjacent work) evaluate their current AI tool data practices?
- As Open Web UI and similar tools mature, what’s the realistic path to offering a “personalized AI clone” as a coaching product extension?
- What is the right cadence for updating a personal voice profile as your writing evolves?
- At what prompt complexity does it make sense to invest in building a formal meta prompt versus continuing to prompt interactively?
- How does the “role assignment” principle extend to complex coaching conversations — what role should you assign AI when doing client strategy work?
Suggested Follow-Through
- Explore Open Web UI installation (or commission it via a hire-and-record approach): get a local RAG instance running with a Groq API connection
- Audit what client-sensitive data is currently flowing through commercial AI tools
- Apply the role assignment principle to your most-used AI workflows this week — test one short and one role-assigned prompt on the same topic and compare outputs
- Don: complete the voice profile training experiment (10-12 before/after pairs → style guide → test on new draft) and report back
- Explore Kimi (kimi.ai) for writing tasks — Lou’s recommendation for its unusually natural writing style
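Don's voice-profile experiment can be scaffolded with a small helper that formats before/after editing pairs into a single analysis prompt. This is a hedged sketch of the idea Lou suggested, not a tested recipe: the pair data and the analysis wording are invented for illustration, and in practice you would feed 10-12 real pairs and iterate on the resulting style guide.

```python
# Sketch of the voice-profile experiment: format draft/edit pairs into
# one prompt asking a model to derive a reusable personal style guide.
def voice_profile_prompt(pairs: list[tuple[str, str]]) -> str:
    """Turn (draft, edited) pairs into an analysis prompt for a model."""
    blocks = []
    for i, (draft, edited) in enumerate(pairs, start=1):
        blocks.append(f"Pair {i}\nDRAFT: {draft}\nEDITED: {edited}")
    return (
        "Compare each DRAFT with its EDITED version and describe the editing "
        "patterns as a style guide I can reuse: tone, sentence length, word "
        "choices, and structures I tend to remove or add.\n\n"
        + "\n\n".join(blocks)
    )

# Illustrative pairs only; real use calls for 10-12 of your own edits.
pairs = [
    ("Leveraging synergies is key to success.", "Work together on what matters."),
    ("It is important to note that results vary.", "Results vary."),
]
prompt = voice_profile_prompt(pairs)
```

The resulting style guide then becomes a reusable system prompt, so first drafts arrive closer to publication-ready, which is the payoff Lou described in Don's coaching moment.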
Additional Resources
Links & Tools Shared in Chat
- KlingAI — video generation tool, mentioned alongside Heygen and Runway as part of a discussion on AI video realism (mentioned by Bally Binning)
- OpenWebUI — local AI interface that can run models like Manus locally; Donald used it with Manus and found it excellent (mentioned by Donald Kihenja)
- Groq — accelerated AI inference platform; discussed in context of Lou’s live speed comparison with local M4 Mac Mini inference (mentioned by Donald Kihenja)
- Sidekick browser — Arc/Dia alternative for non-Mac users (mentioned by Donald Kihenja)
- Don Back’s dance video example — YouTube link to an AI-generated Fred Astaire-style deepfake video illustrating realistic AI video quality — https://www.youtube.com/watch?v=ELJhKli-dmk (shared by Don Back)
Ideas from Chat
- Julia McCoy and Julian Goldie — YouTube creators cited by Donald as examples of creators now using AI avatar versions of themselves with disclosure; relevant for members considering AI video as a content format (mentioned by Donald Kihenja)
- Donald Kihenja: “The surprising thing about AI is the new and unexpected need to catch up on catching up” — a precise observation about the compounding nature of AI learning debt; each week of not keeping up makes the next catch-up harder, not just incrementally but recursively
- Donald on “the infinite prompt” giving amazing results — reinforcing that the framework introduced in June 5 session is delivering real-world results for members actively using it
- Don Back on Dia vs. Arc: he replaced Arc with Dia but hasn’t used Dia much yet; both Lou and Don keep both browsers installed, a useful data point on the transition curve for members on the waitlist
Derived Artifacts
- content-pipeline (Content Pipeline — Don Back’s LinkedIn content machine)
- investor-lens (Investor Lens — role constraints and non-obvious analysis)
- voice-profile-builder (Voice Profile Builder — building voice profiles from draft/edit deltas)