PRD — Capability-Architect

§1 Problem Statement

What fails without this: Asking an AI to “build me an agent / writing team” produces a single mega-prompt that’s hard to audit, hard to optimize, and not reusable. There’s no disciplined process that decomposes a workflow into the right execution shape (inline / fork / spawn / parallel), routes model + effort per step, defines what each child returns, and packages the result as an inheritable skill bundle.

Transcript evidence:

“This is different from ‘make me an agent.’ It is a disciplined design process for capability creation.” — Lou

§2 Trigger Surface

Should fire on (include indirect cases):

“Build me a writing team / onboarding pipeline / research workflow as a reusable skill.”
“Turn this client onboarding process into a reusable AI workflow.”
“Compile this workflow into a skill bundle I can inherit in my projects.”
“I have a multi-step process I keep doing by hand — make it a capability.”
“Design an agent that forks these stages and routes models per step.” (indirect — describes the compile output without naming the tool)

Should NOT fire on (near-misses):

“Write me a prompt for X.” (single-shot command — no decomposition needed → direct command, not a compiled capability)
“Which model should I use for this article?” (routing question → inference-router, not the full compiler)
“Run my writing team on this draft.” (executing an existing capability, not building one)
“Audit this plan.” (evaluation lens, not capability construction)

§3 User Journey (Happy Path)

User describes a workflow, problem, or existing pipeline (“build me a writing team”).
Intake — architect states the job in one sentence; if it can’t, it asks clarifying questions.
Dependency map (DAG) — identifies which artifacts depend on which (draft←brief, revision←review, polish←revised draft).
Classification — labels each step code / inference / hybrid.
Bundling — groups steps whose intermediate outputs are only used internally.
Execution — decides inline / fork / spawn / parallel per step (or bundle).
Routing — delegates to inference-router to assign model + effort + rationale per component.
Contract — defines what each child returns to the parent (JSON envelope + artifact path).
Generation — writes the actual skill bundle + local assets.
Evaluation — defines trigger fire / no-fire cases.
Install — makes the bundle inheritable by folders (publish as a plugin per Insight - Plugins Are How You Share Skills — Version-Controlled Capabilities From a Marketplace Repo).
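
The dependency-map step above can be sketched with Python's standard-library `graphlib`. This is a minimal illustration, not the architect's actual implementation: the `DEPENDS_ON` mapping and the `execution_waves` helper are hypothetical names, and the review-depends-on-draft edge is an assumption added to connect the chain given in the journey (draft←brief, revision←review, polish←revised draft).

```python
from graphlib import TopologicalSorter

# Artifact -> set of artifacts it depends on (hypothetical example data).
DEPENDS_ON = {
    "brief": set(),
    "draft": {"brief"},
    "review": {"draft"},      # assumed edge, not stated in the PRD
    "revision": {"review"},
    "polish": {"revision"},
}


def execution_waves(depends_on):
    """Group artifacts into waves whose members have no unmet dependencies.

    Artifacts that land in the same wave are candidates for parallel execution.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the map is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(DEPENDS_ON)
```

For this strictly linear chain every wave holds a single artifact, so nothing can run in parallel; a wave with two or more entries would be where the Execution step considers the parallel shape.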

§4 Step Classification

Step	Type	Justification
Intake (state the job)	inference	Requires interpreting an open-ended user description into a crisp job statement; unbounded input.
Dependency map (DAG)	hybrid	Identifying dependencies is judgment; representing the DAG is code.
Classify steps code/inference/hybrid	inference	Judgment about whether a step needs a model; context-dependent.
Bundling	hybrid	Deciding which outputs stay internal is judgment; grouping is mechanical.
Execution shape (inline/fork/spawn/parallel)	inference	Per-step context-inheritance judgment (the fork-vs-spawn question).
Routing (model + effort)	inference	Delegated to `inference-router`; classification + assignment is judgment over a rubric.
Contract definition	hybrid	Choosing what each child returns is judgment; the schema is code.
Generation (write bundle)	hybrid	Templated skill scaffolding (code) filled with judgment-derived content (inference).
Evaluation cases	inference	Requires anticipating fire/no-fire phrasings; unbounded.
Install / publish plugin	code	Deterministic packaging + marketplace manifest update.

Rule: Every “inference” classification requires a written justification. If you cannot state why code cannot handle a step, reclassify it as code.
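
One way the rule above could be enforced mechanically is sketched below. The `Step` record, the `StepType` enum, and `enforce_justification` are hypothetical names introduced for illustration. The check only tests that a justification is present; judging whether the justification is convincing remains a human or model decision.

```python
from dataclasses import dataclass, replace
from enum import Enum


class StepType(Enum):
    CODE = "code"
    INFERENCE = "inference"
    HYBRID = "hybrid"


@dataclass(frozen=True)
class Step:
    name: str
    type: StepType
    justification: str = ""


def enforce_justification(step):
    """Reclassify an inference step as code when it has no written justification."""
    if step.type is StepType.INFERENCE and not step.justification.strip():
        return replace(step, type=StepType.CODE)
    return step


intake = Step("Intake", StepType.INFERENCE, "Open-ended user description; unbounded input.")
unjustified = Step("Install", StepType.INFERENCE)

checked_intake = enforce_justification(intake)        # stays inference
checked_install = enforce_justification(unjustified)  # becomes code
```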

§5 Inference Call Contracts

Call	Input schema	Output schema	Why not code

§6 References Needed

§7 Known Gotchas

Forks must be able to run with the context they’re given. Lou’s warning: “you do have to be careful to make sure that these forked processes can run independently with the context you have to date.” If a forked step needs context it didn’t inherit, it starves — the architect must verify each fork’s context sufficiency at design time.
Don’t default everything to the strongest model at high effort. The router must assign the cheapest model and effort level that still meets the quality bar; otherwise the compiled capability silently overpays on every run.
Skill-creator inflates front-matter descriptions to the 1024-char max. The generation step must keep descriptions short (a couple of trigger keywords plus a one-to-two-line description), or the installed capability taxes context on every query.
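
The fork-sufficiency gotcha above lends itself to a design-time check. The sketch below is one possible shape for it, under stated assumptions: each step declares the context keys it `needs` and `produces`, steps are listed in execution order, and whatever a child produces is returned to the parent and so becomes available to later forks. The field names and the `starved_forks` helper are hypothetical.

```python
def starved_forks(steps, initial_context):
    """Return {step name: missing context keys} for forks that would starve.

    Assumes `steps` is in execution order and that each step's outputs are
    returned to the parent, so later forks inherit them.
    """
    available = set(initial_context)
    problems = {}
    for step in steps:
        missing = set(step["needs"]) - available
        if step["shape"] == "fork" and missing:
            problems[step["name"]] = sorted(missing)
        available |= set(step["produces"])
    return problems


# Hypothetical writing-team plan; "style_guide" is never provided or produced.
plan = [
    {"name": "draft", "shape": "inline", "needs": ["brief"], "produces": ["draft"]},
    {"name": "review", "shape": "fork", "needs": ["draft", "style_guide"], "produces": ["review"]},
    {"name": "polish", "shape": "fork", "needs": ["draft", "review"], "produces": ["final"]},
]

problems = starved_forks(plan, initial_context=["brief"])
```

Here `problems` flags the `review` fork as missing `style_guide`, which is the kind of gap the architect would need to close before generating the bundle.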

§8 Eval Cases

Trigger Evals

User input	Expected	Rationale
“Turn my client onboarding into a reusable AI workflow.”	fire	Multi-step workflow → compiled capability.
“Build me a writing team I can reuse across projects.”	fire	The session’s worked example.
“Write me a prompt to summarize this.”	no-fire	Single-shot → command, no compilation.
“Which model should write the draft?”	no-fire	Routing-only → inference-router.

Output Evals

Scenario	Input	Expected output shape	Pass criterion
Happy path	“Build a forked writing team”	A skill bundle with orchestrator + named stages (scan/architect/draft/review/polish), each with execution shape + model/effort + return contract, plus eval cases and install step	Each stage has a fork/spawn decision, a model+effort+rationale, and a defined return envelope
Edge case	Workflow with a step that needs no model	That step classified as code, not inference	The DAG marks it code and no model is assigned
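
The two pass criteria above could be checked per stage with something like the following sketch. The stage record layout, the `returns` envelope fields (`status`, `artifact_path`), the model name `example-model`, and the `stage_failures` helper are all hypothetical; the PRD only specifies that the envelope is JSON plus an artifact path. Requiring an execution shape and return envelope for code steps as well is an assumption.

```python
VALID_SHAPES = {"inline", "fork", "spawn", "parallel"}


def stage_failures(stage):
    """List the pass-criterion violations for one generated stage record."""
    failures = []
    if stage.get("shape") not in VALID_SHAPES:
        failures.append("missing execution shape")
    if stage.get("type") == "code":
        # Edge case: a code step must not be routed to a model.
        if stage.get("model"):
            failures.append("code step must not have a model")
    else:
        for key in ("model", "effort", "rationale"):
            if not stage.get(key):
                failures.append(f"missing {key}")
    if not stage.get("returns"):
        failures.append("missing return envelope")
    return failures


envelope = {"status": "ok", "artifact_path": "out/draft.md"}  # hypothetical fields

draft = {
    "name": "draft", "type": "inference", "shape": "fork",
    "model": "example-model", "effort": "medium",
    "rationale": "Long-form generation from the brief.",
    "returns": envelope,
}
word_count = {
    "name": "word_count", "type": "code", "shape": "inline",
    "model": "example-model",  # deliberately wrong: code steps get no model
    "returns": envelope,
}

draft_failures = stage_failures(draft)
word_count_failures = stage_failures(word_count)
```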

§9 Composition

Assumes loaded: inference-router (routing stage delegates to it, reading a shared model-effort-routing.md).

Potential conflicts:

Routing position:

§10 Success Criteria

[Concrete, verifiable criterion]

§11 Out of Scope

Source

2026-06-11_Mastermind (Lou — the ambient-intelligence walkthrough; capability-architect as the compiler that turns a workflow into an inheritable skill bundle)
