Original Insight
“The exact same system up-leveled itself 10 times just by changing the language model from Llama 3.2 to the GPT OSS.” — Lou
Expanded Synthesis
There is a seductive trap in the world of AI tools: believing that the infrastructure, the interface, the prompt library, or the clever workflow is the source of the results. Lou’s August 7 session dismantled that assumption in real time.
Lou had built a sophisticated RAG (Retrieval-Augmented Generation) legal analysis system. He had optimized the database architecture, crafted layered system prompts, tuned the chunking strategy, and deployed a multi-pass retrieval process. The system was working — but it wasn’t impressive. The client said it felt “a little light.” Lou himself admitted he wasn’t sure it was worth $40,000. He was preparing to implement knowledge graphs, contextual retrieval layers, and long-context workarounds.
Then he swapped the underlying language model. Same architecture. Same prompts. Same database. Different model. The output went from a cursory list of observations to a thorough legal battle plan — complete with federal case citations, witness credibility analysis, a courtroom strategy framework, and anticipatory objections to the AI’s own recommendations.
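To make concrete how small the change really was, here is a minimal sketch of that kind of swap, assuming an OpenAI-compatible chat endpoint. The endpoint URL, model identifiers, and the stubbed retrieval function are illustrative placeholders, not Lou's actual implementation; the point is that only the `model` argument varies.

```python
# Minimal sketch of a model swap in a fixed RAG pipeline.
# Assumptions (not from the session): a local OpenAI-compatible server,
# placeholder model names, and a stubbed-out retrieval layer.
import requests

BASE_URL = "http://localhost:8000/v1"  # assumption: any OpenAI-compatible gateway
SYSTEM_PROMPT = "You are a legal analyst. Cite cases and assess witness credibility."

def retrieve_chunks(question: str) -> list[str]:
    """Stub for the unchanged retrieval layer (database, chunking, multi-pass)."""
    return ["<chunk 1: deposition excerpt>", "<chunk 2: filing excerpt>"]

def run_pipeline(model: str, question: str) -> str:
    context = "\n\n".join(retrieve_chunks(question))
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        json={
            "model": model,  # the ONLY variable under test
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
        },
        timeout=120,
    )
    return resp.json()["choices"][0]["message"]["content"]

# Same architecture, same prompts, same database; different reasoning engine.
baseline = run_pipeline("llama-3.2", "Assess the strength of the plaintiff's case.")
upgraded = run_pipeline("gpt-oss-120b", "Assess the strength of the plaintiff's case.")
```

Everything above the `model` parameter is held constant, which is exactly what makes the quality jump attributable to the model rather than the infrastructure.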
The lesson is subtle but profound: the model is the reasoning engine, and reasoning quality does not scale linearly. More parameters do not mean incrementally better output; past a certain threshold, capability jumps qualitatively. The jump from a solid 70B open-source model to a frontier-class model is not 10% better; it can be categorically different. The same questions get interpreted at a different level of intent, context, and strategic depth.
For high-performers and coaches, this maps directly to human leverage: the people, frameworks, and thinking partners you layer into your work are the multiplier, not the tasks themselves. You can have the most efficient calendar system in the world, but if the strategic thinking underneath it is shallow, the system produces efficiently mediocre results. The infrastructure serves the intelligence.
The coaching blind spot this surfaces: we spend enormous energy optimizing visible systems — our processes, our tools, our workflows — while neglecting to upgrade the underlying intelligence operating those systems. This applies to AI model selection, yes. But it also applies to who we hire, whose counsel we seek, which thinking frameworks we use, and what quality of information we feed our own decision-making.
A secondary insight from this session: using one AI to stress-test another (Claude testing Gemini, Gemini testing Claude) produces more impartial quality control than using either one in isolation. This mirrors a coaching truth: clients who have multiple trusted advisors — who aren’t all aligned to the same worldview — tend to make fewer blind-spot-driven decisions.
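A minimal sketch of that cross-review pattern follows, again assuming an OpenAI-compatible endpoint. The model names, helper function, and rubric wording are hypothetical, not a prescribed setup; the structure is what matters: one model drafts, a different model audits.

```python
# Minimal sketch of cross-model quality control: one model drafts,
# another model critiques. Endpoint, helper, and model names are
# illustrative assumptions, not the session's actual configuration.
import requests

BASE_URL = "http://localhost:8000/v1"  # assumption: any OpenAI-compatible gateway

def chat(model: str, prompt: str) -> str:
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=120,
    )
    return resp.json()["choices"][0]["message"]["content"]

def cross_review(author: str, reviewer: str, task: str) -> dict:
    """One model drafts; a different model audits the draft for gaps."""
    draft = chat(author, task)
    critique = chat(
        reviewer,
        "You are auditing another model's work. List factual errors, "
        f"missing citations, and unsupported claims.\n\nTask: {task}\n\nDraft:\n{draft}",
    )
    return {"draft": draft, "critique": critique}

# Run the audit in both directions so neither model grades its own worldview.
a_vs_b = cross_review("claude-model", "gemini-model", "Summarize the key risks in this contract: ...")
b_vs_a = cross_review("gemini-model", "claude-model", "Summarize the key risks in this contract: ...")
```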
For PowerUp clients navigating AI adoption, this matters because the temptation is always to optimize what’s already in front of them: better prompts, better organization, better workflows. The higher-leverage question is: are you working with the right underlying intelligence — in your AI tools, your team, your advisory circle, your mental models?
Practical Application for PowerUp Clients
The Multiplier Audit
Have clients work through this three-layer inventory:
Layer 1: Your AI Infrastructure
- Which models are you actually using day-to-day? (not just which apps)
- When did you last benchmark your current setup against a frontier alternative? (A minimal harness is sketched after this list.)
- Are you running a Ferrari engine or a sedan engine inside your workflow?
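One lightweight way to run that benchmark, sketched below under the same assumptions as earlier (an OpenAI-compatible endpoint, placeholder model names): feed your real day-to-day prompts to both models and judge the outputs blind, so brand loyalty doesn't color the comparison.

```python
# Hypothetical blind A/B harness for the Layer 1 audit. The endpoint,
# model names, and example prompts are placeholders; substitute prompts
# from your actual workflow, not toy questions.
import random
import requests

BASE_URL = "http://localhost:8000/v1"

def chat(model: str, prompt: str) -> str:
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=120,
    )
    return resp.json()["choices"][0]["message"]["content"]

PROMPTS = [
    "Draft a risk summary for this vendor agreement: ...",
    "Plan next quarter's client onboarding sequence: ...",
]

def blind_compare(current: str, frontier: str) -> None:
    for prompt in PROMPTS:
        pair = [(current, chat(current, prompt)), (frontier, chat(frontier, prompt))]
        random.shuffle(pair)  # hide which output came from which model
        for label, (_, output) in zip("AB", pair):
            print(f"--- Output {label} ---\n{output}\n")
        # Decide which you prefer BEFORE looking at the key:
        print("Key:", {label: model for label, (model, _) in zip("AB", pair)})

blind_compare("llama-3.2", "gpt-oss-120b")
```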
Layer 2: Your Human Intelligence Network
- Who are the three most influential thinkers you regularly learn from?
- Are any of them actively challenging your assumptions — or just validating them?
- Is your advisory circle diverse enough to surface your blind spots?
Layer 3: Your Own Mental Models
- What are the top three frameworks you use to make decisions?
- How old are those frameworks? Were they formed before the current pace of change took hold?
- Where are you over-relying on process and under-investing in thinking quality?
Journal Prompt: “Where in my work am I optimizing the system, when the real constraint is the quality of intelligence running the system?”
Coaching Question Set:
- If you could upgrade one thing in your business right now, not your tools but the quality of reasoning behind your decisions, what would it be?
- What would you be able to do if you had access to 10x better strategic thinking on your biggest current challenge?
- Where are you solving a model problem with an interface fix?
Additional Resources
- The Inevitable by Kevin Kelly — on understanding technology as leverage
- Thinking in Systems by Donella Meadows — on what actually drives system outcomes
- Insight - Build Tiny Tools That Remove Real Friction — on identifying where leverage actually lives
- Insight - Codify Your Judgment Into Skills, Not Just Prompts — on the intelligence layer behind the automation
Evolution Across Sessions
The Aug 21 session established “trust before automation”: the idea that relationship quality determines what you can build on top of it. This insight extends that: even after trust is established and automation is built, the quality of the intelligence running the automation determines outcomes. The Aug 14 session deepens this further by exploring what happens when the retrieval layer (the RAG architecture) hits structural limits, and how much pre-processing intelligence you invest before retrieval determines what can be found at all.
Next Actions
- For me (Lou): When introducing new AI tools to PowerUp clients, lead with a model comparison demo — not just an interface walkthrough. Show the reasoning quality difference, not just the feature difference.
- For clients: Build a quarterly “intelligence audit” habit — reviewing whether the thinking tools and advisors underlying their decisions are still at the frontier for their domain.