Original Insight

“Don’t use only one model, because it has its blind spots. For example, ChatGPT was really good for teams, while Gemini came up with mathematical formulas that you can use in sales conversations… Those formulas wouldn’t have come to my mind or to my attention. But because Gemini is probably primed for coding, of course it will see things as code, as mathematical formulas.” — Kasimir

“Every LLM seems to be optimized for a different way of thinking, or it has its own paradigms, and it’s a great way to bring that together — it’s like your board of directors, but on an LLM level.” — Lou

“Use different AIs with the same prompt, and see what they come up with, and then maybe use a third one to give you the synthesis — what does it mean, which one is better, kind of approach.” — Kasimir

Expanded Synthesis

Every AI model has a cognitive fingerprint — a distinct set of strengths, priors, and reasoning styles shaped by its training architecture, data curation, and optimization objectives. ChatGPT skews toward team dynamics and relational framing. Gemini tends toward systematic, quantitative, and technical outputs. Claude tends toward nuanced reasoning, ethical dimensions, and synthesis. These are not limitations to work around; they are design features to exploit.

Kasimir Hedstrom demonstrated this principle by accident. He had two Ideal Client Profile (ICP) documents he was unsure how to integrate, so he ran the same diagnostic question through multiple models. ChatGPT told him to merge them for a unified team-oriented profile. When he added context that he was a solopreneur planning to run AI agents rather than human teams, the model’s response flipped entirely. Gemini, meanwhile, surfaced mathematical formulas for qualifying sales conversations — something that would never have emerged from ChatGPT’s relational framing. The synthesis from a third model then helped him navigate between these contrasting outputs toward the approach that fit his actual situation.

This is the multi-model synthesis pattern: run the same input through two or three models independently, collect their divergent outputs, then bring in one more model (a third if you ran two, a fourth if you ran three) to synthesize the results. Because the synthesizing model has no stake in any of the individual outputs, it evaluates them more objectively. The result is something closer to a genuine panel review than to a single expert’s opinion.
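
A minimal sketch of the parallel pattern in Python, assuming the current OpenAI, Anthropic, and Google Generative AI SDKs. The model IDs, the example prompt, and the helper names are placeholders for illustration, not anything specified in the session:

```python
import os

from openai import OpenAI            # pip install openai
import anthropic                     # pip install anthropic
import google.generativeai as genai  # pip install google-generativeai

# Placeholder for step 1 of the protocol below: a clear, context-rich prompt.
PROMPT = (
    "I am a solo consultant planning to run AI agents rather than hire a team. "
    "I have two Ideal Client Profile documents: <paste both here>. "
    "Should I merge them into one profile, and if so, how?"
)

def ask_chatgpt(prompt: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model ID
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str) -> str:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
    msg = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model ID
        max_tokens=2000,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

def ask_gemini(prompt: str) -> str:
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-pro")  # placeholder model ID
    return model.generate_content(prompt).text

# Collect independent answers: each model sees only the original prompt.
answers = {
    "ChatGPT": ask_chatgpt(PROMPT),
    "Claude": ask_claude(PROMPT),
    "Gemini": ask_gemini(PROMPT),
}

# Hand everything to a synthesizer with no stake in the individual outputs
# (a fresh API call carries no memory of the round above).
synthesis = ask_claude(
    "I received these responses to the same question from different advisors.\n\n"
    + "\n\n".join(f"--- {name} ---\n{text}" for name, text in answers.items())
    + "\n\nEvaluate the strengths and blind spots of each, and synthesize "
    "the most useful composite answer."
)
print(synthesis)
```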

Lou extended the concept by describing two variations of the pattern. The first is the parallel approach Kasimir described: models work independently, then a synthesis model evaluates all outputs. The second is sequential cross-critique: each model sees all the other models’ answers, critiques them, and then a final synthesis round follows. Each approach has different strengths: parallel is faster and produces more divergent thinking, while sequential cross-critique produces more refined, mutually tested conclusions.
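
A sketch of the sequential variant, reusing the ask_* helpers and the answers dict from the sketch above; the prompt wording is again an invented placeholder:

```python
from typing import Callable

def critique_round(answers: dict[str, str],
                   ask_fns: dict[str, Callable[[str], str]]) -> dict[str, str]:
    """One cross-critique pass: each model reads its peers' answers,
    critiques them, and revises its own answer."""
    revised = {}
    for name, ask in ask_fns.items():
        peers = "\n\n".join(
            f"--- {other} ---\n{text}"
            for other, text in answers.items() if other != name
        )
        revised[name] = ask(
            f"Two peers answered the same question as follows:\n\n{peers}\n\n"
            f"Your own answer was:\n{answers[name]}\n\n"
            "Critique their reasoning, then produce a revised answer."
        )
    return revised

ask_fns = {"ChatGPT": ask_chatgpt, "Claude": ask_claude, "Gemini": ask_gemini}
revised = critique_round(answers, ask_fns)
# Run further rounds for deeper mutual testing, then synthesize the
# `revised` answers exactly as in the parallel version.
```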

For high-performers using AI in their coaching or consulting practice, this has immediate practical implications. Most practitioners default to a single model — often the one they know best or use most — and develop a sophisticated prompting relationship with it. This is valuable, but it creates model-specific blind spots. If your chosen model is optimized for relational thinking, it will systematically under-generate quantitative frameworks. If it skews toward big-picture synthesis, it may underweight implementation detail. The multi-model approach corrects for this structurally.

There is also a confidence dimension here that matters for coaching clients in particular. High-performers often struggle with decisions that have no clear external benchmark — strategic pivots, positioning choices, pricing decisions, client filtering. Running these through multiple models does not eliminate the judgment call, but it surfaces a range of perspectives that would otherwise require consulting multiple human experts. The cost drops to near zero; the quality of the deliberation rises significantly.

Kasimir also raised a critical quality-control mechanism: require citations. If a model cannot supply a URL or source for a factual claim, treat that claim as unverified. This is especially important in multi-model synthesis, where hallucinated facts from one model can propagate through the chain and end up in the final output dressed as consensus. Lou reinforced this with a validation loop concept: use Perplexity as a hallucination-checker, asking it to confirm that specific claims are grounded in retrievable reality. This separates the creative synthesis function from the fact-verification function — a useful division of cognitive labor.
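
One way to wire in that verification leg, assuming Perplexity’s OpenAI-compatible API (the base URL reflects their public docs; the model ID is a placeholder to check against current documentation, and the OpenAI client and os import are reused from the first sketch):

```python
def verify_claims(claims: list[str]) -> str:
    """Route factual claims through a search-grounded model for checking."""
    pplx = OpenAI(
        api_key=os.environ["PERPLEXITY_API_KEY"],
        base_url="https://api.perplexity.ai",  # OpenAI-compatible endpoint
    )
    resp = pplx.chat.completions.create(
        model="sonar-pro",  # placeholder model ID; check Perplexity's docs
        messages=[{
            "role": "user",
            "content": (
                "For each claim below, say whether it is supported by "
                "retrievable sources, and cite a URL for each verdict:\n\n"
                + "\n".join(f"- {c}" for c in claims)
            ),
        }],
    )
    return resp.choices[0].message.content
```

Keeping verification in a separate, search-grounded call means no model in the synthesis chain gets to grade its own homework.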

Lou also noted a crucial technical caveat: calling an AI from code via API bypasses the system prompt that shapes the user-facing product. A custom GPT or Claude project has embedded instructions that shape its behavior. When you call OpenAI directly through the API, those guardrails don’t exist. This matters for anyone building multi-model workflows in N8N or similar tools: you need to include a system prompt in your API calls that replicates the environment you’ve been designing in.
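
In practice that means restating the instructions explicitly in every call, since none of the custom-GPT or project configuration travels with a raw API request. A minimal sketch, reusing the OpenAI client and PROMPT from the first example; the system prompt text is an invented stand-in:

```python
SYSTEM_PROMPT = (
    # Invented stand-in: paste the actual instructions from the custom GPT
    # or Claude project you have been designing in.
    "You are a positioning advisor for solo consultants serving "
    "high-performers. Ground every recommendation in the client's ICP."
)

resp = OpenAI().chat.completions.create(
    model="gpt-4o",  # placeholder model ID
    messages=[
        # Without this system message, the API call runs "bare" and none
        # of the product-level guardrails apply.
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": PROMPT},
    ],
)
print(resp.choices[0].message.content)
```

The same applies inside an N8N workflow: the system message has to be part of the request body each node constructs.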

Practical Application for PowerUp Clients

The Panel of Advisors Protocol:

Choose a significant decision or creative challenge — a positioning statement, a pricing model, a response to a difficult client situation, a coaching methodology question.

  1. Write a clear, context-rich prompt that includes all relevant background.
  2. Run it through Claude, ChatGPT, and Gemini simultaneously.
  3. Collect the three outputs without reading any of them first.
  4. Feed all three outputs to a fourth model (or back to one of the three after clearing context) with the instruction: “I received these three responses to the same question. Evaluate the strengths and blind spots of each, and synthesize the most useful composite answer.”
  5. Evaluate the synthesis against your own judgment. The panel informs; you decide.

Coaching Questions:

  • Which AI model have you been defaulting to, and what are its likely blind spots given its design strengths?
  • What decision have you been sitting on that would benefit from three or four distinct analytical perspectives?
  • Where in your client work do you make implicit assumptions that an outside framework — even a mathematical or quantitative one — might surface more clearly?

Validation Checklist:

Before acting on any AI output that contains factual claims or strategic recommendations:

  • Ask the model: “Which specific sources support this recommendation?”
  • If no URL is provided, verify independently or qualify the claim in your communication.
  • For high-stakes decisions, route factual claims through Perplexity for ground-truth validation.

Additional Resources

  • Superforecasting: The Art and Science of Prediction by Philip Tetlock and Dan Gardner — the evidence base for aggregating diverse perspectives in prediction
  • N8N multi-agent workflow documentation for automating the cross-model synthesis loop
  • Insight - Trust Before Automation in High-Value Relationships — related consideration: at what point does multi-model synthesis warrant human review before action?

Evolution Across Sessions

This builds on the group’s ongoing exploration of AI as a board of advisors — an idea Lou introduced in earlier sessions through the N8N multi-LLM workflow demonstration. Kasimir’s practical demonstration this session made it tangible and replicable. The session also introduced the Brooke Castillo model as a framework for coaching ICP work — worth tracking as a cross-pollination of established coaching tools with AI-driven client profiling.

Next Actions

  • For me (Lou): Run the next PowerUp positioning draft through Claude, ChatGPT, and Gemini in parallel before finalizing. Note where outputs diverge and why.
  • For clients: In the next session, have each client name the AI they use most and identify one type of thinking it likely underperforms on. Assign a multi-model experiment for one decision they’re sitting on.