OrchestKit v8.63.2 — 113 skills, 37 agents, 212 hooks · Claude Code 2.1.183+

Auto

Intent-classified router — the front door to OrchestKit. Takes a plain-English goal, classifies it into one intent category, and routes to the right specialist skill (/ork:fix-issue, /ork:cover, /ork:brainstorm, /ork:implement, /ork:review-pr, /ork:verify, a /goal optimization loop, or the skill-evolution gate). Use when you describe a goal not a method, when the right skill is unclear, or when you want the agent to pick the approach. Triggers on: auto, do this, figure out, just make, get it to, I want, help me, not sure which.

Command medium
Invoke
/ork:auto

/ork:auto — Intent Router

The front door to OrchestKit. You describe a goal in plain English; the router classifies it and hands off to the right specialist. One entry point, many execution paths.

Why this exists: OrchestKit has 112 skills, but usage telemetry shows users fire only the handful they can name from memory (10 distinct skills across thousands of sessions). The dominant cause of "dead" skills is the missing front door, not low quality. This router turns "you must know the exact /ork:<name>" into "describe what you want."

Core principle: routing is a deterministic workflow, not an autonomous agent (Anthropic, Building Effective Agents). Classify → confirm → hand off. The router never does the work itself — it picks who does.

When to use vs. go direct

| Use /ork:auto when… | Go direct when… |
| --- | --- |
| You describe a goal, not a method | You already know the skill (/ork:cover) |
| The right skill isn't obvious | The request maps unambiguously to one |
| You want the agent to choose | You're chaining a known workflow |

Intent categories → OrchestKit skill

| intent | signal words | routes to |
| --- | --- | --- |
| fix | fix, debug, broken, failing, error, crash, regression | /ork:fix-issue |
| diagnose | why, why isn't, why does, why can't, investigate | /ork:fix-issue (investigation-first) |
| optimize | faster, reduce, latency, bundle, minimize, below N ms | a /goal optimization loop (see Gaps) |
| cover | coverage, untested, get to N% | /ork:cover --target N |
| design | design, architect, how should we, explore, idea | /ork:brainstorm |
| build | build, implement, create, add feature, from ticket | /ork:implement |
| review | review, PR, MR, pull request, #N | /ork:review-pr |
| verify | verify, check, make sure, passes, green | /ork:verify |
| improve-skill | improve the skill, optimize the prompt, SKILL.md | the skill-evolution / holdout gate (see Gaps) |
| (fallback) | no confident category | clarify with ONE question |

Full per-category parameter extraction + edge cases: references/routing-rules.md.
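
As a rough illustration of the table above, the signal words can be written down as data and matched against a goal. This is a hypothetical sketch, not OrchestKit code: `INTENTS` and `candidate_intents` are invented names, the `N` placeholders are assumed to mean digits, and the real router classifies by reasoning rather than by keyword matching.

```python
import re

# Hypothetical data form of the intent table: intent -> (signal pattern, route).
# "N" placeholders from the table are assumed to stand for digits (\d+).
# "why" alone covers the longer "why isn't / why does / why can't" phrases.
INTENTS = {
    "fix": (r"fix|debug|broken|failing|error|crash|regression", "/ork:fix-issue"),
    "diagnose": (r"why|investigate", "/ork:fix-issue (investigation-first)"),
    "optimize": (r"faster|reduce|latency|bundle|minimize|below \d+ ?ms", "/goal optimization loop"),
    "cover": (r"coverage|untested|get to \d+ ?%", "/ork:cover"),
    "design": (r"design|architect|how should we|explore|idea", "/ork:brainstorm"),
    "build": (r"build|implement|create|add feature|from ticket", "/ork:implement"),
    "review": (r"review|pr|mr|pull request|#\d+", "/ork:review-pr"),
    "verify": (r"verify|check|make sure|passes|green", "/ork:verify"),
    "improve-skill": (r"improve the skill|optimize the prompt|skill\.md", "skill-evolution / holdout gate"),
}


def candidate_intents(goal: str) -> list[str]:
    """Return every intent whose signal words appear as whole tokens in the goal."""
    text = goal.lower()
    return [
        intent
        for intent, (pattern, _route) in INTENTS.items()
        # Lookarounds instead of \b so that "#42" and "90%" still match.
        if re.search(rf"(?<!\w)(?:{pattern})(?!\w)", text)
    ]


print(candidate_intents("why isn't the build green"))
```

A goal such as "why isn't the build green" matches signal words from three categories (diagnose, build, verify), which is why keyword hits alone are not enough and the disambiguation rules matter.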

The flow

  CLASSIFY  ->  CONFIRM  ->  HAND OFF
     |            |             |
  reason       show the     invoke the
  out loud     route        target skill;
  (CoT)        + nod        follow ITS phases

1. Classify (reason out loud first)

State your reasoning before committing to a route — this triggers chain-of-thought and is the single biggest accuracy lever (Anthropic, Writing Effective Tools for Agents). Example: "'get latency under 200ms' names a metric + a direction → optimize, not fix."

Apply the disambiguation rules (most specific wins; explicit verb beats inferred intent). The load-bearing one: explicit verb wins — "Fix the slow query" → fix, not optimize. For the full ordered ruleset (all 7, including the truly-ambiguous fallback), references/routing-rules.md is canonical.

2. Confirm (low ceremony)

Show the chosen route in a compact block and get a nod before handing off:

Goal:   "{original goal}"
Intent: {category}
Route:  {/ork:skill or loop} {extracted args}
        [run] · [adjust] · [cancel]

For low-risk single-pass routes (verify, review), an inline "routing you to /ork:verify — ok?" is enough. Never hand off without a nod.
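
The confirmation step can be sketched as a small template fill plus a reply check. The function names and the reply vocabulary (`run`, `ok`, `yes` counting as a nod) are assumptions made for illustration; the source only fixes the block layout and the rule that nothing is handed off without a nod.

```python
def render_confirmation(goal: str, intent: str, route: str, args: str = "") -> str:
    """Fill the confirmation template shown above."""
    route_line = f"{route} {args}".strip()
    return "\n".join(
        [
            f'Goal:   "{goal}"',
            f"Intent: {intent}",
            f"Route:  {route_line}",
            "        [run] · [adjust] · [cancel]",
        ]
    )


def next_step(reply: str) -> str:
    """Map the user's reply to the router's next action (assumed semantics)."""
    choice = reply.strip().lower()
    if choice in {"run", "ok", "yes"}:
        return "hand_off"
    if choice == "adjust":
        return "reclassify"
    if choice == "cancel":
        return "stop"
    return "ask_again"  # no explicit nod yet, so never hand off


print(render_confirmation("get coverage to 90%", "cover", "/ork:cover", "--target 90"))
```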

3. Hand off

Invoke the target skill with the extracted parameters and follow that skill's own phases and guardrails — do not override them. The router's job ends at the handoff; the specialist owns execution and its own report.
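
A minimal sketch of the handoff, assuming the invocation shapes listed in references/routing-rules.md. `INVOCATIONS` and `build_invocation` are hypothetical names, and the placeholder keys (`description`, `target`, and so on) are illustrative. The optimize and improve-skill routes are left out on purpose because the source describes them as a loop and a gate, not as a single slash command.

```python
# Hypothetical templates mirroring the "Invoke:" lines in the routing rules.
INVOCATIONS = {
    "fix": "/ork:fix-issue {description}",
    "diagnose": "/ork:fix-issue {question}",
    "cover": "/ork:cover --target {target}",
    "design": "/ork:brainstorm {topic}",
    "build": "/ork:implement {description}",
    "review": "/ork:review-pr {number}",
    "verify": "/ork:verify",
}


def build_invocation(intent: str, **params: object) -> str:
    """Compose the command line for the target skill; the skill itself does the work."""
    template = INVOCATIONS.get(intent)
    if template is None:
        raise ValueError(f"no slash-command template for intent {intent!r}")
    # A missing parameter raises KeyError, which is a cue to ask the user for it.
    return template.format(**params)


print(build_invocation("cover", target=90))
```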

Fallback + honest gaps

  • Fallback category. If no category clears a confident threshold, ask exactly ONE clarifying question rather than guessing. A rising fallback rate is the leading indicator that the taxonomy needs work — surface it, don't bury it.
  • optimize has no dedicated skill (yet). OrchestKit's metric-driven optimization runs as a /goal loop using the loop recipe library (/ork:prd-to-goal → references/recipe-library.md). Route optimize there and say so plainly; don't pretend a /ork:experiment skill exists.
  • improve-skill routes to the evolution gate. Self-optimizing a SKILL.md goes through the champion/challenger holdout-promotion gate (/ork:assess evals + evolution-engine), not a one-shot edit. It requires a benchmark + holdout set first.
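
The fallback rule and its health metric can be sketched as follows. The source names no numeric threshold, so `CONFIDENCE_THRESHOLD` and `TIE_MARGIN` are placeholder values, and the per-category confidence scores are assumed to come from whatever classifier sits in front of this function.

```python
CONFIDENCE_THRESHOLD = 0.7  # assumed value; the source gives no number
TIE_MARGIN = 0.1            # assumed: how close two scores must be to count as a tie
FALLBACK = "fallback"


def pick_category(scores: dict[str, float]) -> str:
    """Return the winning category, or FALLBACK when the router should ask instead."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked or ranked[0][1] < CONFIDENCE_THRESHOLD:
        return FALLBACK  # nothing clears the threshold
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < TIE_MARGIN:
        return FALLBACK  # two categories are about equally plausible
    return ranked[0][0]


def fallback_rate(decisions: list[str]) -> float:
    """Share of routing decisions that ended in a clarifying question."""
    if not decisions:
        return 0.0
    return decisions.count(FALLBACK) / len(decisions)


print(pick_category({"fix": 0.9, "optimize": 0.3}))
```

Tracking `fallback_rate` over time is one way to surface the "taxonomy needs work" signal described above.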

Guardrails

  • No recursion. /ork:auto must not route to itself, directly or via a spawned agent.
  • No bypass. Routing does not skip the target skill's guardrails, readonly enforcement, or confirmation steps.
  • Classification quality is the whole job. A misroute that fails silently is worse than a fallback question. When two categories are equally plausible, ask — don't gamble.

Validation

Routing accuracy is gateable, not vibes. routing-benchmark.json holds 50 labeled goal → category pairs (easy + genuinely ambiguous). Validate after any change to the category table or disambiguation rules:

# isolated classification check via the bare-eval harness
/ork:bare-eval   # grade router output against routing-benchmark.json

Target ≥95% category accuracy; track the fallback rate as a degradation alarm as the skill library grows.
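
The accuracy gate itself is simple arithmetic, sketched here. The benchmark's schema is not shown in this document, so the `goal` and `expected` keys are assumptions, as are the function names; `classify` stands for any callable that maps a goal string to a category.

```python
from typing import Callable


def routing_accuracy(cases: list[dict], classify: Callable[[str], str]) -> float:
    """Fraction of benchmark cases where the predicted category matches the label."""
    if not cases:
        raise ValueError("benchmark is empty")
    correct = sum(1 for case in cases if classify(case["goal"]) == case["expected"])
    return correct / len(cases)


def passes_gate(cases: list[dict], classify: Callable[[str], str], target: float = 0.95) -> bool:
    """True when category accuracy meets the target (0.95 per the text above)."""
    return routing_accuracy(cases, classify) >= target
```

With 50 cases, a 95% target means at least 48 must be routed correctly (47/50 is 94%).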

References

  • references/routing-rules.md — per-category parameter extraction, edge cases, disambiguation
  • routing-benchmark.json — 50 labeled goal→category pairs for accuracy validation
  • /ork:help — static categorized directory (browse, don't route)
  • /ork:prd-to-goal — decompose a spec into a /goal line (the optimize route's engine)
  • /ork:fix-issue · /ork:cover · /ork:brainstorm · /ork:implement · /ork:review-pr · /ork:verify — the route targets
  • /ork:assess — champion/challenger holdout gate (the improve-skill route)

References (1)

Routing Rules


Per-category parameter extraction + edge cases. Read during the Classify step to configure the target skill correctly.

fix → /ork:fix-issue

  • Extract: bug description (full goal), target files if named, ticket/issue #N, quoted error message.
  • Invoke: /ork:fix-issue {description or #N}
  • Edges: "fix the tests" is fix (repair broken tests), not cover (add new tests). "fix performance" is ambiguous → ask: debug a specific issue, or optimize a metric?

diagnose → /ork:fix-issue (investigation-first)

  • A "why…" question is a gentler entry than a fix command. Frame the plan as observe → hypothesize → propose, then offer to apply the fix.
  • A "why…" question is ALWAYS diagnose, even when it names a failure ("why isn't the build green", "why does the API return 500", "why can't users log in"). The question form is what makes it diagnose — without one, a statement of breakage or a repair imperative is fix ("there's a regression in checkout", "resolve the 500 errors on /api/users").
  • Invoke: /ork:fix-issue {question} with an investigation framing. ("I'll investigate first, then propose a fix, ok?")

optimize → /goal loop (no dedicated skill)

  • Extract: metric (latency/throughput/bundle/memory), direction (minimize for size/time/cost; maximize for score/rate), goal value + unit, target files.
  • Invoke: compose a /goal loop via /ork:prd-to-goal → references/recipe-library.md. Be explicit that this is a /goal-driven loop, not a /ork:experiment skill (which doesn't exist).
  • Edges: "make it faster" with no metric → ask what to measure (response time? build time? bundle?). Multiple metrics → pick the emphasized one, note the rest as constraints.
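
The optimize extraction can be sketched with a small parser. Everything here is illustrative: `OptimizeParams`, `extract_optimize_params`, and the unit list are assumptions, and taking the first-mentioned metric is only a stand-in for "the emphasized one". The direction mapping follows the rule stated above (minimize size/time/cost, maximize score/rate).

```python
import re
from dataclasses import dataclass
from typing import Optional

# Direction per metric, following the rule above: minimize size/time, maximize rate.
DIRECTION = {
    "latency": "minimize",
    "bundle": "minimize",
    "memory": "minimize",
    "throughput": "maximize",
}


@dataclass
class OptimizeParams:
    metric: Optional[str] = None
    direction: Optional[str] = None
    value: Optional[float] = None
    unit: Optional[str] = None


def extract_optimize_params(goal: str) -> OptimizeParams:
    text = goal.lower()
    found = [name for name in DIRECTION if name in text]
    # First-mentioned metric is a crude stand-in for "the emphasized one".
    metric = min(found, key=text.index) if found else None
    # Assumed unit list; extend as needed.
    match = re.search(r"(\d+(?:\.\d+)?)\s*(ms|s|kb|mb|gb)\b", text)
    return OptimizeParams(
        metric=metric,
        direction=DIRECTION.get(metric),
        value=float(match.group(1)) if match else None,
        unit=match.group(2) if match else None,
    )


print(extract_optimize_params("get latency under 200ms"))
```

When `metric` comes back as `None` (as for "make it faster"), the sketch has nothing to measure, which corresponds to the "ask what to measure" edge case.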

cover → /ork:cover

  • Extract: target % ("90%" → 90, "above 85" → 85), scope ("the auth module" → src/auth/).
  • Invoke: /ork:cover --target {N}
  • Edges: "write more tests" with no target → ask the target %. "test the new feature" is build/verify (functional tests), not cover (coverage %). A surface called out as "untested" is cover even with no % target ("the payments service is untested, fix that") — the "fix" there repairs a coverage gap, not a bug; ask the target % at invoke time.
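
The target extraction ("90%" → 90, "above 85" → 85) can be sketched with two regular expressions. `extract_coverage_target` is a hypothetical helper, and the accepted phrasings ("above", "over", "at least") are assumptions beyond the two examples given.

```python
import re
from typing import Optional


def extract_coverage_target(goal: str) -> Optional[int]:
    """Return the coverage target as an integer percent, or None if the goal names none."""
    match = re.search(r"(\d{1,3})\s*%", goal) or re.search(
        r"\b(?:above|over|at least)\s+(\d{1,3})\b", goal, re.IGNORECASE
    )
    if match is None:
        return None  # caller should ask for the target %
    target = int(match.group(1))
    return target if 0 < target <= 100 else None


print(extract_coverage_target("get coverage to 90%"))
```

A `None` result maps onto the edge cases above: "write more tests" and "the payments service is untested, fix that" both name no number, so the router asks for one.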

design → /ork:brainstorm

  • Extract: topic (full goal). Deep mode if the goal says "thorough/comprehensive/deep dive" or spans multiple systems.
  • Invoke: /ork:brainstorm {topic}
  • Edges: "how should we…" is design, not build. "Design AND build…" → start design, offer build after.

build → /ork:implement

  • Extract: feature description, ticket ID, mode (greenfield/brownfield/refactor/bugfix).
  • Invoke: /ork:implement {description}
  • Edges: "implement the design from the brainstorm" → check for recent brainstorm state first.

review → /ork:review-pr

  • Extract: PR/MR number (#123 → 123), scope filter if named.
  • Invoke: /ork:review-pr {number or branch}
  • Edges: "review my code" with no PR → ask which PR/branch. "review the design" is design, not review.
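
A short sketch of the PR-number extraction (#123 → 123). The helper name and the accepted spellings ("PR 45", "MR 45", "pull request 45") are assumptions; branch names are not handled here.

```python
import re
from typing import Optional


def extract_pr_number(goal: str) -> Optional[int]:
    """Return the PR/MR number named in the goal, or None if there is none."""
    match = re.search(r"(?:#|\b(?:PR|MR|pull request)\s+)(\d+)\b", goal, re.IGNORECASE)
    return int(match.group(1)) if match else None


print(extract_pr_number("review #123"))
```

A `None` result corresponds to the "review my code" edge case, where the router asks which PR or branch is meant.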

verify → /ork:verify

  • Extract: checks (tests/lint/typecheck/all), scope.
  • Invoke: /ork:verify
  • Edges: "make sure it works" → all checks. "check tests pass" → tests-focused.

improve-skill → skill-evolution / holdout gate

  • Extract: which SKILL.md, quality metric (else task-completion against test cases).
  • Invoke: the champion/challenger holdout-promotion gate (/ork:assess evals + evolution-engine). Requires a benchmark + holdout set to exist first — if missing, help the user define 5–10 cases before looping.
  • Edges: "optimize my prompt" (not a skill file) → route to optimize with the prompt as the target. When the improvement target IS a skill or agent — a SKILL.md, a named skill, an agent prompt — it is improve-skill regardless of the verb: "optimize the prompt for the security-auditor skill" and "make the brainstorm SKILL.md produce better ideas" are both improve-skill, not optimize/design.

Disambiguation (when multiple categories match)

  1. Explicit verb wins. "Fix the slow query" → fix.
  2. Metric + direction → optimize.
  3. Percentage in a test context → cover.
  4. Question form → design ("how should") or diagnose ("why") — and this beats symptom words: "why isn't the build green" is diagnose, not fix.
  5. PR/MR/#N → review.
  6. Ticket reference → build.
  7. Truly ambiguous → ask ONE question naming the two candidate routes. Do not guess when two DIFFERENT skills are plausible and the goal names no artifact, metric, or number. Canonical fallbacks: "fix performance on the dashboard" (fix vs optimize), "review my code" with no PR/branch named (review vs verify), "test the new feature" (cover vs verify vs build), "the queries are slow and maybe vulnerable to injection" (optimize vs review).
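
Rule 7 can be sketched as a single question that names every candidate route. The labels for fix and optimize come from the "fix performance" edge case earlier in this file; the other labels, and the function name, are assumptions made for illustration.

```python
# Short human-readable label per candidate route (partly assumed wording).
LABELS = {
    "fix": "debug a specific issue",
    "optimize": "optimize a metric",
    "review": "review a PR or branch",
    "verify": "run the checks",
    "cover": "raise test coverage",
    "build": "implement the feature",
}


def clarifying_question(candidates: list[str]) -> str:
    """Build exactly one question that names each plausible route."""
    if len(candidates) < 2:
        raise ValueError("a clarifying question needs at least two candidate routes")
    options = [f"{LABELS[name]} ({name})" for name in candidates]
    return "Do you want to " + ", or ".join(options) + "?"


print(clarifying_question(["fix", "optimize"]))
```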