PRD — Capability-Architect

§1 Problem Statement

What fails without this: Asking an AI to “build me an agent / writing team” produces a single mega-prompt that’s hard to audit, hard to optimize, and not reusable. There’s no disciplined process that decomposes a workflow into the right execution shape (inline / fork / spawn / parallel), routes model + effort per step, defines what each child returns, and packages the result as an inheritable skill bundle.

Transcript evidence:

“This is different from ‘make me an agent.’ It is a disciplined design process for capability creation.” — Lou

§2 Trigger Surface

Should fire on (include indirect cases):

  • “Build me a writing team / onboarding pipeline / research workflow as a reusable skill.”
  • “Turn this client onboarding process into a reusable AI workflow.”
  • “Compile this workflow into a skill bundle I can inherit in my projects.”
  • “I have a multi-step process I keep doing by hand — make it a capability.”
  • “Design an agent that forks these stages and routes models per step.” (indirect — describes the compile output without naming the tool)

Should NOT fire on (near-misses):

  • “Write me a prompt for X.” (single-shot command — no decomposition needed → direct command, not a compiled capability)
  • “Which model should I use for this article?” (routing question → inference-router, not the full compiler)
  • “Run my writing team on this draft.” (executing an existing capability, not building one)
  • “Audit this plan.” (evaluation lens, not capability construction)

§3 User Journey (Happy Path)

  1. User describes a workflow, problem, or existing pipeline (“build me a writing team”).
  2. Intake — architect states the job in one sentence; if it can’t, it asks clarifying questions.
  3. Dependency map (DAG) — identifies which artifacts depend on which (draft←brief, revision←review, polish←revised draft).
  4. Classification — labels each step code / inference / hybrid.
  5. Bundling — groups steps whose intermediate outputs are only used internally.
  6. Execution — decides inline / fork / spawn / parallel per step (or bundle).
  7. Routing — delegates to inference-router to assign model + effort + rationale per component.
  8. Contract — defines what each child returns to the parent (JSON envelope + artifact path).
  9. Generation — writes the actual skill bundle + local assets.
  10. Evaluation — defines trigger fire / no-fire cases.
  11. Install — makes the bundle inheritable by folders (publish as a plugin per Insight - Plugins Are How You Share Skills — Version-Controlled Capabilities From a Marketplace Repo).
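
The dependency map in step 3 and the parallel decision in step 6 can be sketched with Python's standard-library graphlib. This is a minimal illustration, not the architect's actual implementation. The artifact names follow the examples in the list above; the `review` depends on `draft` edge, and the function names `build_order` and `parallel_batches`, are assumptions added for the sketch.

```python
from graphlib import TopologicalSorter

# Each key depends on the artifacts in its value set (draft <- brief, etc.).
# ASSUMPTION: "review" depending on "draft" is not stated in the PRD.
DEPENDS_ON = {
    "draft": {"brief"},
    "review": {"draft"},
    "revision": {"review"},
    "polish": {"revision"},
}

def build_order(depends_on):
    """Return one valid build order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(depends_on).static_order())

def parallel_batches(depends_on):
    """Group artifacts into batches whose members do not depend on each other."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything currently unblocked
        batches.append(ready)
        sorter.done(*ready)
    return batches

print(build_order(DEPENDS_ON))
# ['brief', 'draft', 'review', 'revision', 'polish']
```

A batch with more than one member is a candidate for parallel execution; a batch of one has to wait for its predecessor either way.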

§4 Step Classification

Step | Type | Justification
Intake (state the job) | inference | Requires interpreting an open-ended user description into a crisp job statement; unbounded input.
Dependency map (DAG) | hybrid | Identifying dependencies is judgment; representing the DAG is code.
Classify steps code/inference/hybrid | inference | Judgment about whether a step needs a model; context-dependent.
Bundling | hybrid | Deciding which outputs stay internal is judgment; grouping is mechanical.
Execution shape (inline/fork/spawn/parallel) | inference | Per-step context-inheritance judgment (the fork-vs-spawn question).
Routing (model + effort) | inference | Delegated to inference-router; classification + assignment is judgment over a rubric.
Contract definition | hybrid | Choosing what each child returns is judgment; the schema is code.
Generation (write bundle) | hybrid | Templated skill scaffolding (code) filled with judgment-derived content (inference).
Evaluation cases | inference | Requires anticipating fire/no-fire phrasings; unbounded.
Install / publish plugin | code | Deterministic packaging + marketplace manifest update.

Rule: Every “inference” classification requires a written justification. If you cannot state why code cannot handle a step, reclassify it as code.
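
The rule above is mechanical enough to enforce in code. The following is a minimal sketch, assuming steps are held as simple records; the `Step` dataclass, its field names, and `enforce_justification` are hypothetical and not part of any existing tool.

```python
from dataclasses import dataclass

VALID_KINDS = {"code", "inference", "hybrid"}

@dataclass
class Step:
    name: str
    kind: str                # one of VALID_KINDS
    justification: str = ""  # why code cannot handle the step

def enforce_justification(steps):
    """Reclassify as code any inference step that lacks a written justification."""
    checked = []
    for step in steps:
        if step.kind not in VALID_KINDS:
            raise ValueError(f"{step.name}: unknown kind {step.kind!r}")
        if step.kind == "inference" and not step.justification.strip():
            # No stated reason a model is needed, so fall back to code.
            checked.append(Step(step.name, "code", "reclassified: no justification"))
        else:
            checked.append(step)
    return checked

steps = [
    Step("Intake", "inference", "open-ended user description; unbounded input"),
    Step("Rename output file", "inference"),  # no justification given
]
result = enforce_justification(steps)
print([(s.name, s.kind) for s in result])
```

The second step in the example comes back as code, because nothing was written down to explain why it needs a model.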

§5 Inference Call Contracts

Call | Input schema | Output schema | Why not code

§6 References Needed

§7 Known Gotchas

  • Forks must be able to run with the context they’re given. Lou’s warning: “you do have to be careful to make sure that these forked processes can run independently with the context you have to date.” If a forked step needs context it didn’t inherit, it starves — the architect must verify each fork’s context sufficiency at design time.
  • Don’t default everything to the strongest model at high effort. The router must assign the cheapest model and effort level that still meets the quality bar; otherwise the compiled capability silently overpays on every run.
  • Skill-creator inflates front-matter descriptions to the 1024-char max. The generation step must keep descriptions short (a couple of trigger keywords plus a one-to-two-line description), or the installed capability taxes context on every query.
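
The first gotcha, fork context sufficiency, lends itself to a design-time check. The sketch below assumes each step declares the context keys it needs and that the parent's available context at fork time can be listed; the dictionary layout and the name `starved_forks` are hypothetical.

```python
def starved_forks(steps, parent_context):
    """Return {step name: missing keys} for forks that need context they did not inherit."""
    missing = {}
    for step in steps:
        if step["shape"] != "fork":
            continue  # only forks inherit the parent's context to date
        gap = set(step["needs"]) - set(parent_context)
        if gap:
            missing[step["name"]] = sorted(gap)
    return missing

steps = [
    {"name": "draft", "shape": "fork", "needs": ["brief", "style_guide"]},
    {"name": "review", "shape": "spawn", "needs": ["draft"]},
]
# The parent has only produced the brief when the draft fork is launched.
print(starved_forks(steps, parent_context={"brief"}))
# {'draft': ['style_guide']}
```

An empty result means every fork can run with what it inherits; anything else names the step that would starve and what it lacks.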

§8 Eval Cases

Trigger Evals

User input | Expected | Rationale
“Turn my client onboarding into a reusable AI workflow.” | fire | Multi-step workflow → compiled capability.
“Build me a writing team I can reuse across projects.” | fire | The session’s worked example.
“Write me a prompt to summarize this.” | no-fire | Single-shot → command, no compilation.
“Which model should write the draft?” | no-fire | Routing-only → inference-router.
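
The trigger cases above can be kept as data and replayed against whatever decides firing. This is a rough sketch only: `run_trigger_evals` is a hypothetical harness, and `keyword_stub` is a deliberately naive stand-in for the real trigger decision, which this PRD does not specify.

```python
# (user input, expected to fire?) pairs taken from the trigger eval table.
TRIGGER_CASES = [
    ("Turn my client onboarding into a reusable AI workflow.", True),
    ("Build me a writing team I can reuse across projects.", True),
    ("Write me a prompt to summarize this.", False),
    ("Which model should write the draft?", False),
]

def run_trigger_evals(should_fire, cases=TRIGGER_CASES):
    """Return the cases where should_fire(text) disagrees with the expectation."""
    failures = []
    for text, expected in cases:
        got = bool(should_fire(text))
        if got != expected:
            failures.append((text, expected, got))
    return failures

def keyword_stub(text):
    # ASSUMPTION: placeholder matcher for demonstration, not the real trigger logic.
    lowered = text.lower()
    return "reusable" in lowered or "reuse" in lowered

failures = run_trigger_evals(keyword_stub)
print(failures)
```

Swapping `keyword_stub` for a call into the installed capability's actual trigger path would turn the same table into a regression check.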

Output Evals

Scenario | Input | Expected output shape | Pass criterion
Happy path | “Build a forked writing team” | A skill bundle with orchestrator + named stages (scan/architect/draft/review/polish), each with execution shape + model/effort + return contract, plus eval cases and install step | Each stage has a fork/spawn decision, a model+effort+rationale, and a defined return envelope
Edge case | Workflow with a step that needs no model | That step classified as code, not inference | The DAG marks it code and no model is assigned
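
Both pass criteria above can be checked against a generated bundle. The sketch assumes the bundle's stages are available as a dictionary; the key names (`kind`, `shape`, `model`, `effort`, `rationale`, `returns`), the `model-a` placeholder, and `stage_problems` are hypothetical, chosen to mirror the wording of the table.

```python
SHAPES = {"inline", "fork", "spawn", "parallel"}

def stage_problems(stages):
    """Return a list of violations; an empty list means the bundle passes."""
    problems = []
    for name, stage in stages.items():
        if stage.get("shape") not in SHAPES:
            problems.append(f"{name}: no valid execution shape")
        if not stage.get("returns"):
            problems.append(f"{name}: no return envelope defined")
        if stage.get("kind") == "code":
            # Edge case: a code step must not be routed to a model.
            if stage.get("model") is not None:
                problems.append(f"{name}: code step must not have a model")
        else:
            for key in ("model", "effort", "rationale"):
                if not stage.get(key):
                    problems.append(f"{name}: missing {key}")
    return problems

envelope = {"status": "ok", "artifact_path": "out/draft.md"}  # illustrative only
good = {
    "draft": {"kind": "inference", "shape": "fork", "model": "model-a",
              "effort": "medium", "rationale": "long-form writing", "returns": envelope},
    "word_count": {"kind": "code", "shape": "inline", "model": None, "returns": envelope},
}
bad = {"word_count": {**good["word_count"], "model": "model-a"}}

print(stage_problems(good))  # []
print(stage_problems(bad))   # ['word_count: code step must not have a model']
```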

§9 Composition

Assumes loaded: inference-router (routing stage delegates to it, reading a shared model-effort-routing.md).

Potential conflicts:

Routing position:

§10 Success Criteria

  • [Concrete, verifiable criterion]

§11 Out of Scope


Source

  • 2026-06-11_Mastermind (Lou — the ambient-intelligence walkthrough; capability-architect as the compiler that turns a workflow into an inheritable skill bundle)