Auto
Intent-classified router — the front door to OrchestKit. Takes a plain-English goal, classifies it into one intent category, and routes to the right specialist skill (/ork:fix-issue, /ork:cover, /ork:brainstorm, /ork:implement, /ork:review-pr, /ork:verify, a /goal optimization loop, or the skill-evolution gate). Use when you describe a goal not a method, when the right skill is unclear, or when you want the agent to pick the approach. Triggers on: auto, do this, figure out, just make, get it to, I want, help me, not sure which.
/ork:auto — Intent Router
The front door to OrchestKit. You describe a goal in plain English; the router classifies it and hands off to the right specialist. One entry point, many execution paths.
Why this exists: OrchestKit has 112 skills, but usage telemetry shows users fire only the handful they can name by memory (10 distinct skills across thousands of sessions). The dominant cause of "dead" skills is no front door, not low quality. This router turns "you must know the exact /ork:<name>" into "describe what you want."
Core principle: routing is a deterministic workflow, not an autonomous agent (Anthropic, Building Effective Agents). Classify → confirm → hand off. The router never does the work itself — it picks who does.
When to use vs. go direct
| Use /ork:auto when… | Go direct when… |
|---|---|
| You describe a goal, not a method | You already know the skill (/ork:cover) |
| The right skill isn't obvious | The request maps unambiguously to one |
| You want the agent to choose | You're chaining a known workflow |
Intent categories → OrchestKit skill
| intent | signal words | routes to |
|---|---|---|
| fix | fix, debug, broken, failing, error, crash, regression | /ork:fix-issue |
| diagnose | why, why isn't, why does, why can't, investigate | /ork:fix-issue (investigation-first) |
| optimize | faster, reduce, latency, bundle, minimize, below N ms | a /goal optimization loop (see Gaps) |
| cover | coverage, untested, get to N% | /ork:cover --target N |
| design | design, architect, how should we, explore, idea | /ork:brainstorm |
| build | build, implement, create, add feature, from ticket | /ork:implement |
| review | review, PR, MR, pull request, #N | /ork:review-pr |
| verify | verify, check, make sure, passes, green | /ork:verify |
| improve-skill | improve the skill, optimize the prompt, SKILL.md | the skill-evolution / holdout gate (see Gaps) |
| (fallback) | no confident category | clarify with ONE question |
Full per-category parameter extraction + edge cases: references/routing-rules.md.
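To make the table concrete, here is a minimal sketch that expresses it as data and lists every category whose signal words appear in a goal. It is an illustration only: the router is described as reasoning about the goal, not keyword matching, and the names `SIGNALS` and `candidate_intents` are hypothetical. Templated signals ("below N ms", "get to N%", "#N") are left out.

```python
import re

# The intent table above as data (templated signals omitted in this sketch).
SIGNALS = {
    "fix": ["fix", "debug", "broken", "failing", "error", "crash", "regression"],
    "diagnose": ["why", "investigate"],
    "optimize": ["faster", "reduce", "latency", "bundle", "minimize"],
    "cover": ["coverage", "untested"],
    "design": ["design", "architect", "how should we", "explore", "idea"],
    "build": ["build", "implement", "create", "add feature", "from ticket"],
    "review": ["review", "pr", "mr", "pull request"],
    "verify": ["verify", "check", "make sure", "passes", "green"],
    "improve-skill": ["improve the skill", "optimize the prompt", "skill.md"],
}


def candidate_intents(goal: str) -> list[str]:
    """Return every category with at least one whole-word signal in the goal."""
    text = goal.lower()
    return [
        intent
        for intent, phrases in SIGNALS.items()
        if any(re.search(rf"\b{re.escape(p)}\b", text) for p in phrases)
    ]


print(candidate_intents("Fix the slow query"))         # one candidate
print(candidate_intents("why isn't the build green"))  # several candidates
```

A goal that yields several candidates is exactly where the disambiguation rules apply, and a goal that yields none is a fallback.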
The flow
CLASSIFY  ->  CONFIRM  ->  HAND OFF
   |             |            |
 reason       show the     invoke the
 out loud     route        target skill;
 (CoT)        + nod        follow ITS phases
1. Classify (reason out loud first)
State your reasoning before committing to a route — this triggers chain-of-thought and is the single biggest accuracy lever (Anthropic, Writing Effective Tools for Agents). Example: "'get latency under 200ms' names a metric + a direction → optimize, not fix."
Apply the disambiguation rules (most specific wins; explicit verb beats inferred intent). The load-bearing one: explicit verb wins — "Fix the slow query" → fix, not optimize. For the full ordered ruleset (all 7, including the truly-ambiguous fallback), references/routing-rules.md is canonical.
2. Confirm (low ceremony)
Show the chosen route in one line and get a nod before handing off:
Goal: "{original goal}"
Intent: {category}
Route: {/ork:skill or loop} {extracted args}
[run] · [adjust] · [cancel]
For low-risk single-pass routes (verify, review), an inline "routing you to /ork:verify, ok?" is enough. Never hand off without a nod.
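The confirmation step can be sketched as a small template plus a gate on the user's reply. The function names and the set of replies treated as a nod are assumptions made for illustration; only the card layout comes from the text above.

```python
# Replies counted as a nod. This set is an assumption, not part of the skill.
AFFIRMATIVE = {"run", "ok", "yes"}


def confirmation_card(goal: str, intent: str, route: str, args: str = "") -> str:
    """Render the confirmation card shown before any handoff."""
    target = f"{route} {args}".strip()
    return "\n".join(
        [
            f'Goal: "{goal}"',
            f"Intent: {intent}",
            f"Route: {target}",
            "[run] · [adjust] · [cancel]",
        ]
    )


def may_hand_off(reply: str) -> bool:
    """Allow the handoff only after an explicit nod."""
    return reply.strip().lower() in AFFIRMATIVE


print(confirmation_card("get coverage to 90%", "cover", "/ork:cover", "--target 90"))
```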
3. Hand off
Invoke the target skill with the extracted parameters and follow that skill's own phases and guardrails — do not override them. The router's job ends at the handoff; the specialist owns execution and its own report.
Fallback + honest gaps
- Fallback category. If no category clears a confident threshold, ask exactly ONE clarifying question rather than guessing. A rising fallback rate is the leading indicator that the taxonomy needs work — surface it, don't bury it.
- optimize has no dedicated skill (yet). OrchestKit's metric-driven optimization runs as a /goal loop using the loop recipe library (/ork:prd-to-goal → references/recipe-library.md). Route optimize there and say so plainly; don't pretend a /ork:experiment skill exists.
- improve-skill routes to the evolution gate. Self-optimizing a SKILL.md goes through the champion/challenger holdout-promotion gate (/ork:assess evals + evolution-engine), not a one-shot edit. It requires a benchmark + holdout set first.
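One way to picture the fallback rule is a gate over per-category confidence scores. The source only says "a confident threshold"; the numeric scores, the 0.7 threshold, and the 0.1 tie margin below are all assumptions chosen for the sketch.

```python
def decide(scores: dict[str, float], threshold: float = 0.7, margin: float = 0.1):
    """Return ("route", intent), or ("ask", candidates) for ONE clarifying question.

    threshold and margin are hypothetical values, not taken from the skill.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Nothing clears the threshold: ask rather than guess.
    if not ranked or ranked[0][1] < threshold:
        return ("ask", [intent for intent, _ in ranked[:2]])
    # Two categories are nearly tied: ask, naming both candidates.
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < margin:
        return ("ask", [ranked[0][0], ranked[1][0]])
    return ("route", ranked[0][0])


print(decide({"optimize": 0.9, "fix": 0.3}))
print(decide({"fix": 0.5, "optimize": 0.45}))
```

Counting how often the result is "ask" gives the fallback rate that the text treats as a leading indicator.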
Guardrails
- No recursion. /ork:auto must not route to itself, directly or via a spawned agent.
- No bypass. Routing does not skip the target skill's guardrails, readonly enforcement, or confirmation steps.
- Classification quality is the whole job. A misroute that fails silently is worse than a fallback question. When two categories are equally plausible, ask — don't gamble.
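The no-recursion guardrail could be enforced with a check like the following sketch. It assumes the caller can supply the chain of skills currently executing; `guard_handoff`, `active_chain`, and `RecursiveRouteError` are hypothetical names, not part of OrchestKit.

```python
ROUTER = "/ork:auto"


class RecursiveRouteError(RuntimeError):
    """Raised when an invocation would re-enter the router."""


def guard_handoff(target: str, active_chain: list[str]) -> None:
    """Reject invoking the router while the router is already in the chain.

    active_chain is an assumed list of the skills currently executing,
    outermost first. It covers both the direct case and a spawned agent.
    """
    skill = target.split(maxsplit=1)[0] if target.strip() else ""
    if skill == ROUTER and ROUTER in active_chain:
        raise RecursiveRouteError(f"{ROUTER} must not route to itself")


guard_handoff("/ork:verify", ["/ork:auto"])  # allowed: a normal handoff
guard_handoff("/ork:auto", [])               # allowed: top-level user call
```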
Validation
Routing accuracy is gateable, not vibes. routing-benchmark.json holds 50 labeled goal → category pairs (easy + genuinely ambiguous). Validate after any change to the category table or disambiguation rules:
# isolated classification check via the bare-eval harness
/ork:bare-eval # grade router output against routing-benchmark.json
Target ≥95% category accuracy; track the fallback rate as a degradation alarm as the skill library grows.
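The two numbers named above can be computed as in this sketch. The benchmark schema is not shown in this document, so the `goal` and `expected` keys are assumptions, as is the use of the label "fallback" for a clarifying question; `classify` stands in for whatever produces the router's category.

```python
from typing import Callable


def score_router(benchmark: list[dict], classify: Callable[[str], str]) -> dict:
    """Compute category accuracy and fallback rate over labeled cases.

    Each case is assumed to look like {"goal": ..., "expected": ...}.
    """
    if not benchmark:
        raise ValueError("benchmark is empty")
    predictions = [classify(case["goal"]) for case in benchmark]
    correct = sum(p == case["expected"] for p, case in zip(predictions, benchmark))
    fallbacks = sum(p == "fallback" for p in predictions)
    accuracy = correct / len(benchmark)
    return {
        "accuracy": accuracy,
        "fallback_rate": fallbacks / len(benchmark),
        "passes": accuracy >= 0.95,  # the target stated above
    }


# Tiny in-memory stand-in for the benchmark file.
cases = [
    {"goal": "a", "expected": "fix"},
    {"goal": "b", "expected": "cover"},
    {"goal": "c", "expected": "fallback"},
    {"goal": "d", "expected": "optimize"},
]
stub = {"a": "fix", "b": "cover", "c": "fallback", "d": "fix"}.get
print(score_router(cases, stub))
```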
References
- references/routing-rules.md: per-category parameter extraction, edge cases, disambiguation
- routing-benchmark.json: 50 labeled goal→category pairs for accuracy validation
Related skills
- /ork:help: static categorized directory (browse, don't route)
- /ork:prd-to-goal: decompose a spec into a /goal line (the optimize route's engine)
- /ork:fix-issue · /ork:cover · /ork:brainstorm · /ork:implement · /ork:review-pr · /ork:verify: the route targets
- /ork:assess: champion/challenger holdout gate (the improve-skill route)
References (1)
Routing Rules
Per-category parameter extraction + edge cases. Read during the Classify step to configure the target skill correctly.
fix → /ork:fix-issue
- Extract: bug description (full goal), target files if named, ticket/issue #N, quoted error message.
- Invoke: /ork:fix-issue {description or #N}
- Edges: "fix the tests" is fix (repair broken tests), not cover (add new tests). "fix performance" is ambiguous → ask: debug a specific issue, or optimize a metric?
diagnose → /ork:fix-issue (investigation-first)
- A "why…" question is a gentler entry than a fix command. Frame the plan as observe → hypothesize → propose, then offer to apply the fix.
- A "why…" question is ALWAYS diagnose, even when it names a failure ("why isn't the build green", "why does the API return 500", "why can't users log in"). The question form is what makes it diagnose; without one, a statement of breakage or a repair imperative is fix ("there's a regression in checkout", "resolve the 500 errors on /api/users").
- Invoke: /ork:fix-issue {question} with an investigation framing. ("I'll investigate first, then propose a fix, ok?")
optimize → /goal loop (no dedicated skill)
- Extract: metric (latency/throughput/bundle/memory), direction (minimize for size/time/cost; maximize for score/rate), goal value + unit, target files.
- Invoke: compose a /goal loop via /ork:prd-to-goal → references/recipe-library.md. Be explicit that this is a /goal-driven loop, not a /ork:experiment skill (which doesn't exist).
- Edges: "make it faster" with no metric → ask what to measure (response time? build time? bundle?). Multiple metrics → pick the emphasized one, note the rest as constraints.
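The extraction step for optimize can be sketched as below. The metric names and the minimize/maximize split follow the bullets above (throughput is treated as a rate); the regular expression, the `OptimizeSpec` type, and `parse_optimize` are hypothetical and handle only simple phrasings such as "under 200ms".

```python
import re
from dataclasses import dataclass
from typing import Optional

# Metric -> direction, following the minimize/maximize rule in the bullets above.
DIRECTION = {
    "latency": "minimize",
    "bundle": "minimize",
    "memory": "minimize",
    "throughput": "maximize",
}
# A comparison word, then a number and an optional unit (assumed phrasing).
GOAL_VALUE = re.compile(r"\b(?:under|below|above|over)\s+(\d+(?:\.\d+)?)\s*([a-z%/]+)?")


@dataclass
class OptimizeSpec:
    metric: Optional[str]
    direction: Optional[str]
    value: Optional[float]
    unit: Optional[str]


def parse_optimize(goal: str) -> OptimizeSpec:
    """Pull metric, direction, goal value, and unit out of a goal string."""
    text = goal.lower()
    metric = next((m for m in DIRECTION if re.search(rf"\b{m}\b", text)), None)
    match = GOAL_VALUE.search(text)
    return OptimizeSpec(
        metric=metric,
        direction=DIRECTION.get(metric),
        value=float(match.group(1)) if match else None,
        unit=match.group(2) if match else None,
    )


print(parse_optimize("get latency under 200ms"))
print(parse_optimize("make it faster"))  # no metric: ask what to measure
```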
cover → /ork:cover
- Extract: target % ("90%" → 90, "above 85" → 85), scope ("the auth module" → src/auth/).
- Invoke: /ork:cover --target {N}
- Edges: "write more tests" with no target → ask the target %. "test the new feature" is build/verify (functional tests), not cover (coverage %). A surface called out as "untested" is cover even with no % target ("the payments service is untested, fix that"): the "fix" there repairs a coverage gap, not a bug; ask the target % at invoke time.
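The target extraction above ("90%" → 90, "above 85" → 85) can be sketched with one regular expression. The pattern and the helper names are assumptions; returning None stands for "ask the target %".

```python
import re
from typing import Optional

# "90%" or "above 85" style targets (assumed phrasings).
TARGET = re.compile(r"\b(\d{1,3})\s*%|\b(?:above|over|at least)\s+(\d{1,3})\b")


def parse_cover_target(goal: str) -> Optional[int]:
    """Return the coverage target as an integer percent, or None if absent."""
    match = TARGET.search(goal.lower())
    if not match:
        return None
    value = int(match.group(1) or match.group(2))
    return value if 0 <= value <= 100 else None


def cover_command(goal: str) -> Optional[str]:
    """Build the invocation, or None when the target % still has to be asked."""
    target = parse_cover_target(goal)
    return None if target is None else f"/ork:cover --target {target}"


print(cover_command("get coverage to 90%"))
print(cover_command("write more tests"))
```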
design → /ork:brainstorm
- Extract: topic (full goal). Deep mode if the goal says "thorough/comprehensive/deep dive" or spans multiple systems.
- Invoke: /ork:brainstorm {topic}
- Edges: "how should we…" is design, not build. "Design AND build…" → start design, offer build after.
build → /ork:implement
- Extract: feature description, ticket ID, mode (greenfield/brownfield/refactor/bugfix).
- Invoke: /ork:implement {description}
- Edges: "implement the design from the brainstorm" → check for recent brainstorm state first.
review → /ork:review-pr
- Extract: PR/MR number (#123 → 123), scope filter if named.
- Invoke: /ork:review-pr {number or branch}
- Edges: "review my code" with no PR → ask which PR/branch. "review the design" is design, not review.
verify → /ork:verify
- Extract: checks (tests/lint/typecheck/all), scope.
- Invoke: /ork:verify
- Edges: "make sure it works" → all checks. "check tests pass" → tests-focused.
improve-skill → skill-evolution / holdout gate
- Extract: which SKILL.md, quality metric (else task-completion against test cases).
- Invoke: the champion/challenger holdout-promotion gate (/ork:assess evals + evolution-engine). Requires a benchmark + holdout set to exist first; if missing, help the user define 5–10 cases before looping.
- Edges: "optimize my prompt" (not a skill file) → route to optimize with the prompt as the target. When the improvement target IS a skill or agent (a SKILL.md, a named skill, an agent prompt), it is improve-skill regardless of the verb: "optimize the prompt for the security-auditor skill" and "make the brainstorm SKILL.md produce better ideas" are both improve-skill, not optimize/design.
Disambiguation (when multiple categories match)
- Explicit verb wins. "Fix the slow query" → fix.
- Metric + direction → optimize.
- Percentage in a test context → cover.
- Question form → design ("how should") or diagnose ("why"), and this beats symptom words: "why isn't the build green" is diagnose, not fix.
- PR/MR/#N → review.
- Ticket reference → build.
- Truly ambiguous → ask ONE question naming the two candidate routes. Do not guess when two DIFFERENT skills are plausible and the goal names no artifact, metric, or number. Canonical fallbacks: "fix performance on the dashboard" (fix vs optimize), "review my code" with no PR/branch named (review vs verify), "test the new feature" (cover vs verify vs build), "the queries are slow and maybe vulnerable to injection" (optimize vs review).
Audit Skills
Audits all OrchestKit skills for quality, completeness, and compliance with authoring standards. Use when checking skill health, before releases, or after bulk skill edits to surface SKILL.md files that are too long, have missing frontmatter, lack rules/references, or are unregistered in manifests.
Bare Eval
Run isolated eval and grading calls using CC 2.1.81 --bare mode. Constructs claude -p --bare invocations for skill evaluation, trigger testing, and LLM grading without plugin/hook interference. Use when running eval pipelines, grading skill outputs, benchmarking prompt quality, or testing trigger accuracy in isolation.