PR review using parallel specialized agents for code quality, security, testing, architecture, and performance analysis. Synthesizes findings into a review report with conventional comments (praise/issue/suggestion/nitpick) and approve or request-changes verdict. Use when reviewing pull requests, conducting security audits, or validating changes before merge.
The PR number or branch is passed as the skill argument. Resolve it immediately:
PR_NUMBER = "$ARGUMENTS[0]"  # e.g., "123" or "feature-branch"

# If no argument provided, check environment
if not PR_NUMBER:
    PR_NUMBER = os.environ.get("ORCHESTKIT_PR_URL", "").split("/")[-1]

# If still empty, detect from current branch
if not PR_NUMBER:
    PR_NUMBER = "$(gh pr view --json number -q .number 2>/dev/null)"
Use PR_NUMBER consistently in all subsequent commands and agent prompts.
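The three-step fallback above (argument, then environment URL, then current branch) can be sketched as plain Python. This is a minimal illustration, not the skill's actual resolver: the function names `resolve_pr_number` and `_gh_current_pr` are hypothetical, and stripping a trailing slash from the URL is an added assumption.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional, Sequence


def _gh_current_pr() -> str:
    """Ask gh for the PR attached to the current branch; empty string on failure."""
    try:
        result = subprocess.run(
            ["gh", "pr", "view", "--json", "number", "-q", ".number"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:  # gh is not installed
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


def resolve_pr_number(
    args: Sequence[str],
    env: Mapping[str, str] = os.environ,
    detect: Callable[[], str] = _gh_current_pr,  # hypothetical hook, injectable for tests
) -> Optional[str]:
    """Apply the fallback order: skill argument, then env URL, then current branch."""
    if args and args[0].strip():
        return args[0].strip()
    # Assumption: a trailing slash on the URL should not yield an empty PR number.
    url = env.get("ORCHESTKIT_PR_URL", "").rstrip("/")
    if url:
        return url.split("/")[-1]
    return detect() or None
```

Returning `None` instead of an empty string makes the "nothing could be resolved" case explicit, so the caller can stop before issuing any `gh` command.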
If the user asks for an "ultra" / "deep" / "thorough" review and the host is on CC ≥ 2.1.120, defer to the native subcommand instead of re-implementing the multi-agent loop in skill instructions:
claude ultrareview "$PR_NUMBER" --json
The CLI runs the same multi-agent review (code-quality, security-auditor, test-coverage, architecture) with structured output and a determinate verdict (approve | comment | request-changes). On CC < 2.1.120 the subcommand doesn't exist — fall back to the parallel-agents path below.
This keeps the skill thin: built-in CLI wins for "ultra" depth; the OrchestKit skill wins for --render-style customization, focused review modes (security-only, perf-only), and offline scenarios.
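The version gate described above needs a numeric comparison, because comparing version strings lexically would rank "2.1.9" above "2.1.120". The sketch below is one way to express it; `parse_version` and `choose_review_path` are hypothetical names, and how the CC version string is obtained is left open.

```python
import re
from typing import Tuple

ULTRAREVIEW_CLI_MIN = (2, 1, 120)  # threshold quoted in the text above


def parse_version(raw: str) -> Tuple[int, ...]:
    """Extract the first dotted numeric version found in a string."""
    match = re.search(r"\d+(?:\.\d+)+", raw)
    if not match:
        raise ValueError(f"no version found in {raw!r}")
    return tuple(int(part) for part in match.group().split("."))


def choose_review_path(cc_version: str, wants_deep_review: bool) -> str:
    """Return 'native-ultrareview' or 'parallel-agents' (labels are illustrative)."""
    if wants_deep_review and parse_version(cc_version) >= ULTRAREVIEW_CLI_MIN:
        return "native-ultrareview"
    return "parallel-agents"
```

Tuples of integers compare element by element, so `(2, 1, 9) < (2, 1, 120)` holds as intended.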
# memory is alwaysLoad in .mcp.json (CC 2.1.121+, #1541); the probe below is kept as a fallback for older CC:
ToolSearch(query="select:mcp__memory__search_nodes")
Write(".claude/chain/capabilities.json", { memory, timestamp })
# If memory available: search for past review patterns on these files
CC ≥ 2.1.116 note: the gh calls below can hit GitHub's API rate limit on very active repos. When the Bash tool surfaces a rate-limit hint, stop and wait for reset — do not retry in a loop. See ork:github-operations for the full guidance.
CC ≥ 2.1.119 multi-host note (M122): --from-pr now accepts GitLab MR, Bitbucket PR, and GitHub Enterprise URLs. Detect the host with parsePrUrl from src/hooks/src/lib/pr-host-parser.ts and branch on family for the right CLI:

| Family | CLI |
| --- | --- |
| github / github-enterprise | gh pr view/diff/checks (with GH_HOST=<enterprise-host> for GHE) |
Falls back to github.com when the URL doesn't match any pattern. Custom enterprise hosts: configure prUrlTemplate (see src/skills/configure/). Full pattern: src/skills/chain-patterns/references/pr-from-platform.md.
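To make the family detection concrete, here is a Python sketch of the idea. It is not a port of `parsePrUrl` (whose implementation is not shown here); `classify_pr_url` and `PrRef` are hypothetical names, and the regular expressions are assumptions based on the usual web URL shapes of each platform.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class PrRef(NamedTuple):
    family: str            # "github", "github-enterprise", "gitlab", or "bitbucket"
    host: str
    number: Optional[int]  # None when no pattern matched


# Assumed path shapes, taken from each platform's web UI.
_GITHUB_PATH = re.compile(r"^/[^/]+/[^/]+/pull/(\d+)")
_GITLAB_PATH = re.compile(r"/-/merge_requests/(\d+)")
_BITBUCKET_PATH = re.compile(r"/pull-requests/(\d+)")


def classify_pr_url(url: str) -> PrRef:
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path
    if m := _GITLAB_PATH.search(path):
        return PrRef("gitlab", host, int(m.group(1)))
    if m := _BITBUCKET_PATH.search(path):
        return PrRef("bitbucket", host, int(m.group(1)))
    if m := _GITHUB_PATH.match(path):
        family = "github" if host == "github.com" else "github-enterprise"
        return PrRef(family, host, int(m.group(1)))
    # No pattern matched: fall back to github.com, as described above.
    return PrRef("github", "github.com", None)
```

For the `github-enterprise` family, the returned `host` is the value that would go into `GH_HOST` before calling `gh`.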
# Get PR details
gh pr view $PR_NUMBER --json title,body,files,additions,deletions,commits,author

# View the diff
gh pr diff $PR_NUMBER

# Check CI status
gh pr checks $PR_NUMBER
Spawn all three in ONE message. This cuts context-gathering time by 60%.
For agent-based review (Phase 3), all 6 agents are independent -- launch them together.
</use_parallel_tool_calls>
Before spawning agents, load project-specific review context from memory:
# Load project review context (conventions, known weaknesses, past findings)
# This gives agents project-specific knowledge without re-discovering patterns
PROJECT_CONTEXT = Read("${MEMORY_DIR}/review-pr-context.md")  # Falls back gracefully if missing

All agent prompts receive ${PROJECT_CONTEXT} so they know project conventions, security patterns, and known weaknesses from prior reviews.
All agents return findings as JSON (see structured output contract in agent prompt files). This enables automated deduplication, severity sorting, and memory graph persistence in Phase 5.
Output each agent's findings as they complete — don't batch until synthesis.
Focus mode (CC 2.1.101): In focus mode, the user only sees your final message. Include the full review verdict, all findings by severity, and the approve/request-changes recommendation — don't assume they saw per-agent outputs.
Security findings → show blockers and critical issues first
Code quality → show pattern violations, complexity hotspots
Test coverage gaps → show missing test cases
This lets the PR author start addressing blocking issues while remaining agents are still analyzing. Only the final synthesis (Phase 5) requires all agents to have completed.
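The "emit as each agent completes, synthesize only at the end" behavior can be illustrated with the standard library. This is an analogy using threads, not how Claude Code schedules agents; `run_reviewers` and the `emit` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List


def run_reviewers(
    reviewers: Dict[str, Callable[[], List[dict]]],
    emit: Callable[[str, List[dict]], None],
) -> Dict[str, List[dict]]:
    """Run every reviewer concurrently and emit each result the moment it lands."""
    results: Dict[str, List[dict]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(reviewers))) as pool:
        futures = {pool.submit(fn): name for name, fn in reviewers.items()}
        for future in as_completed(futures):
            name = futures[future]
            results[name] = future.result()
            emit(name, results[name])  # surfaced before the other reviewers finish
    # Synthesis runs only after this returns, i.e. once all reviewers are done.
    return results
```

The order in which `emit` fires depends on completion time, so a fast security pass can surface blockers while slower reviewers are still running.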
Partial results (CC 2.1.98): If a review agent fails mid-analysis, synthesize partial findings:
for agent_result in review_results:
    if "[PARTIAL RESULT]" in agent_result.output:
        # A security agent that found 2 issues before crashing > no security review
        partial = parse_findings(agent_result.output)
        for finding in partial:
            finding["partial"] = True  # Flag every partial finding in synthesis
        findings.extend(partial)
        # Do NOT re-spawn: partial findings are still valuable
Monitor for CI streaming (CC 2.1.98): Stream CI check output in Phase 4:
Bash(command="gh pr checks $PR_NUMBER --watch 2>&1", run_in_background=true)
Monitor(pid=ci_watch_id)  # Each status change → notification
Claude Code 2.1.111 ships a built-in /ultrareview — parallel multi-agent deep review (Pro/Max users get 3 free per month). It overlaps this skill's Phase 3 but goes deeper. It's not free, so never fire it by default — offer it only when a trigger justifies the cost, and always ask the user before burning a quota.
Compute whether /ultrareview is warranted from the already-collected PR metadata + agent results:
triggers = []
if diff_loc_changed > 500:
    triggers.append("large_diff")
if any(path.startswith(p) for path in changed_files
       for p in ["auth/", "migrations/", "hooks/", "crypto/", "security/", "payments/"]):
    triggers.append("sensitive_path")
if reviewer_verdicts_disagree(phase_3_results):
    triggers.append("reviewer_disagreement")
if any(label in pr_labels for label in ["release", "hotfix"]):
    triggers.append("high_stakes_label")
If triggers is empty → skip the gate entirely and proceed to Phase 4. Never mention /ultrareview to the user.
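The trigger computation calls `reviewer_verdicts_disagree`, which is not defined in this document. One plausible reading, offered only as an assumption, is that reviewers disagree when at least one approves while another requests changes. The result shape (a mapping with a `verdict` key) is also assumed.

```python
from typing import Iterable, Mapping


def reviewer_verdicts_disagree(results: Iterable[Mapping[str, object]]) -> bool:
    """True when the collected verdicts contain both an approval and a rejection."""
    # Assumption: each agent result exposes its verdict under the "verdict" key.
    verdicts = {r.get("verdict") for r in results if r.get("verdict")}
    return "approve" in verdicts and "request-changes" in verdicts
```

Under this reading a mix of `approve` and `comment` does not count as disagreement; a stricter rule could treat any two distinct verdicts as a trigger.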
Read session state: Read(".claude/state/ultrareview-usage.json") (may not exist). If month == currentMonth() and skip_session == true, skip the prompt and proceed to Phase 4. Otherwise:
AskUserQuestion(questions=[{
    "question": f"This PR triggers /ultrareview (reason: {', '.join(triggers)}). Run it? (Pro/Max: 3 free per month.)",
    "header": "Ultrareview",
    "multiSelect": false,
    "options": [
        {"label": "Yes, run ultrareview", "description": "Invoke built-in /ultrareview as a final deep pass. Adds 5–10 min."},
        {"label": "No, skip it", "description": "Continue with Phase 4 using existing agent results."},
        {"label": "Skip for this session", "description": "Don't ask again until this session ends."}
    ]
}])
Why AskUserQuestion and not a --ultra flag: the user relies on voice, so "yes"/"no"/"skip for session" is speakable whereas flags are not.
This is advisory only — we cannot query Anthropic's real quota. When used_this_month >= 3, the AskUserQuestion text changes the third option to warn: "You may have exhausted the monthly free quota."
Set ORK_DISABLE_ULTRAREVIEW=1 or .claude/settings.json → "ork.disableUltrareview": true to skip the gate entirely regardless of triggers. Honored at the top of this phase.
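Taken together, the gate has four inputs: the opt-out switches, the trigger list, the session state file, and the advisory usage counter. The sketch below combines them into one decision. It assumes the `month` field is stored as `YYYY-MM` and that settings are passed as a flat mapping; `gate_decision` and `load_usage` are hypothetical names.

```python
import json
from datetime import date
from pathlib import Path
from typing import Mapping, Optional, Sequence


def load_usage(path: Path) -> dict:
    """Read the usage state file; a missing or corrupt file counts as empty state."""
    try:
        return json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def gate_decision(
    triggers: Sequence[str],
    usage: Mapping[str, object],
    env: Mapping[str, str],
    settings: Mapping[str, object],
    today: Optional[date] = None,
) -> str:
    """Return 'skip', 'ask', or 'ask-with-quota-warning' (labels are illustrative)."""
    # Opt-out is honored first, regardless of triggers.
    if env.get("ORK_DISABLE_ULTRAREVIEW") == "1" or settings.get("ork.disableUltrareview") is True:
        return "skip"
    if not triggers:
        return "skip"
    current_month = (today or date.today()).strftime("%Y-%m")  # assumed storage format
    same_month = usage.get("month") == current_month
    if same_month and usage.get("skip_session") is True:
        return "skip"
    if same_month and int(usage.get("used_this_month", 0)) >= 3:
        return "ask-with-quota-warning"  # advisory only; the real quota is not queryable
    return "ask"
```

A caller would pass `load_usage(Path(".claude/state/ultrareview-usage.json"))` as `usage` and only reach `AskUserQuestion` when the result starts with `ask`.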
After synthesis, persist critical/high findings to the memory graph for cross-session learning. The Phase 8c verdict writeback (below) handles this automatically when yg-mcp-core>=0.3.0 is installed; for interactive sessions, see references/memory-persistence.md for the manual mcp__memory__create_entities + mcp__memory__add_observations pattern.
After the verdict is submitted, optionally invoke scripts/verdict_writeback.py <review-dir> to persist the verdict + findings to the memory MCP knowledge graph. Self-skips on every non-happy-path so it never breaks the review:
Auto-skip conditions (all exit 0, all WARN-logged):
| Skip reason | Trigger |
| --- | --- |
| signal absent | verdict missing OR not in {approve, request-changes, comment} |
| yg-mcp-core not importable | yg-mcp-core>=0.3.0 not installed (orchestkit is public; yg-mcp-core lives on private pypi.yonyon.ai, HQ-only) |
| memory MCP unreachable | MCP server down OR .mcp.json doesn't define memory |
Review dir must contain review-output.json (with verdict, repo, pr_number, optional findings: [{level, msg}], optional changed_paths: list[str]). Handoff JSON at <review-dir>/verdict-writeback.json records status (fired / skipped) + the constructed entity_name (review::<repo>#<n>@<ts>).
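The first skip condition and the entity name format can be sketched without touching the memory MCP. This is not the code of `scripts/verdict_writeback.py`: `writeback_precheck` and `entity_name` are hypothetical names, the `ready` status is an illustrative label, and the timestamp format is an assumption because the text only shows `<ts>`.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple

VALID_VERDICTS = {"approve", "request-changes", "comment"}


def writeback_precheck(review_output: object) -> Tuple[str, Optional[str]]:
    """Return ('skipped', reason) or ('ready', None) for parsed review-output.json content."""
    if not isinstance(review_output, dict) or review_output.get("verdict") not in VALID_VERDICTS:
        return ("skipped", "signal absent")
    return ("ready", None)


def entity_name(repo: str, pr_number: int, now: Optional[datetime] = None) -> str:
    """Build review::<repo>#<n>@<ts>; the compact UTC timestamp is an assumed format."""
    ts = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    return f"review::{repo}#{pr_number}@{ts}"
```

Keeping the precheck a pure function of the parsed JSON makes the "never breaks the review" property easy to test: every malformed input maps to a skip, never to an exception.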
Mirrors the /ork:assess memory_writeback pattern from PR #1889. Closes orchestkit#1894.
CC bundles /code-review (renamed from /simplify): a single-pass correctness-bug check at a chosen effort level, with --comment to post findings as inline PR comments. Use it for a fast, focused "are there bugs in this diff?" pass.
Reach for /ork:review-pr instead when you want the full multi-agent review — parallel code-quality, security, testing, architecture, and performance passes synthesized into conventional comments with an approve / request-changes verdict. They are complementary, not redundant: /code-review is the quick correctness gate; /ork:review-pr is the thorough pre-merge audit.
Every agent MUST return a JSON block (fenced with ```json) at the end of its review, matching the schema in review-pr-output.md. Category prefixes: SEC, PERF, BUG, MAINT, A11Y, TEST.
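A lead reviewer has to pull that trailing JSON block out of free-form agent output before it can do anything else. The sketch below shows one way to do it. The real schema lives in review-pr-output.md and is not reproduced here, so the payload shape (a `findings` list whose items carry an `id` such as `SEC-001`) is purely an assumption, and `extract_findings` is a hypothetical name.

```python
import json
import re
from typing import List

ALLOWED_PREFIXES = {"SEC", "PERF", "BUG", "MAINT", "A11Y", "TEST"}

_TICKS = "`" * 3  # built at runtime so this example does not embed a literal fence
_JSON_FENCE = re.compile(_TICKS + r"json\s*\n(.*?)" + _TICKS, re.DOTALL)


def extract_findings(agent_output: str) -> List[dict]:
    """Parse the last json-fenced block and check each finding's category prefix."""
    blocks = _JSON_FENCE.findall(agent_output)
    if not blocks:
        raise ValueError("agent output contains no json-fenced block")
    payload = json.loads(blocks[-1])  # the contract places the block at the end
    # Assumed shape: either {"findings": [...]} or a bare list of findings.
    findings = payload["findings"] if isinstance(payload, dict) else payload
    for finding in findings:
        prefix = str(finding.get("id", "")).split("-", 1)[0]
        if prefix not in ALLOWED_PREFIXES:
            raise ValueError(f"unknown category prefix in finding: {finding!r}")
    return findings
```

Taking the last block rather than the first guards against agents that quote example JSON earlier in their review.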
# DOMAIN-AWARE AGENT SELECTION
# Core agents (always spawn): quality-reviewer, security-reviewer, test-reviewer
# Conditional: backend-reviewer (if HAS_BACKEND), frontend-reviewer (if HAS_FRONTEND)

# Capture scope from Phase 1
CHANGED_FILES = "$(gh pr diff $PR_NUMBER --name-only)"

TeamCreate(team_name="review-pr-$PR_NUMBER", description="Review PR #$PR_NUMBER")

Agent(subagent_type="code-quality-reviewer", name="quality-reviewer",
      team_name="review-pr-$PR_NUMBER",
      prompt="""Review code quality and type safety for PR #$PR_NUMBER.

      ## Project Context
      ${PROJECT_CONTEXT}

      Scope: ONLY review the following changed files:
      ${CHANGED_FILES}
      Do NOT explore beyond these files.

      When you find patterns that overlap with security concerns,
      message security-reviewer with the finding.
      When you find test gaps, message test-reviewer.

      Return findings as a JSON block (```json```) with category prefix MAINT.""")

Agent(subagent_type="security-auditor", name="security-reviewer",
      team_name="review-pr-$PR_NUMBER",
      prompt="""Security audit for PR #$PR_NUMBER.

      ## Project Context
      ${PROJECT_CONTEXT}

      Scope: ONLY review the following changed files:
      ${CHANGED_FILES}
      Do NOT explore beyond these files.

      Check: fail-closed auth, SSRF on user-controlled URLs, rate limiting, secrets in diff.
      Cross-reference with quality-reviewer for injection risks in code patterns.
      When you find issues, message the responsible reviewer
      (backend-reviewer for API issues, frontend-reviewer for XSS).

      Return findings as a JSON block (```json```) with category prefix SEC.""")

Agent(subagent_type="test-generator", name="test-reviewer",
      team_name="review-pr-$PR_NUMBER",
      prompt="""Review TEST ADEQUACY for PR #$PR_NUMBER.

      Scope: ONLY review the following changed files:
      ${CHANGED_FILES}
      Do NOT explore beyond these files.

      1. Check: Does the PR add/modify code WITHOUT adding tests? Flag as MISSING.
      2. Match change types to required test types
         (testing-unit/testing-e2e/testing-integration rules):
         - API → integration-api, verification-contract
         - DB → integration-database, data-seeding-cleanup
         - UI → unit-aaa-pattern, a11y-testing
         - Logic → verification-techniques
      3. Evaluate test quality: meaningful assertions, no flaky patterns.
      4. When quality-reviewer flags test gaps, verify and suggest specific tests.

      Message backend-reviewer or frontend-reviewer with test requirements.

      ## Project Context
      ${PROJECT_CONTEXT}

      Return findings as a JSON block (```json```) with category prefix TEST.""")

# Only spawn if backend files detected (HAS_BACKEND)
Agent(subagent_type="backend-system-architect", name="backend-reviewer",
      team_name="review-pr-$PR_NUMBER",
      prompt="""Review backend code for PR #$PR_NUMBER.

      ## Project Context
      ${PROJECT_CONTEXT}

      Scope: ONLY review the following changed files:
      ${CHANGED_FILES}
      Do NOT explore beyond these files.

      Check: Redis connection lifecycle, webhook auth (fail-closed), N+1 queries, async patterns.
      When security-reviewer flags API issues, validate and suggest fixes.
      Share API pattern findings with frontend-reviewer for consistency.

      Return findings as a JSON block (```json```) with prefixes BUG/PERF/MAINT.""")

# Only spawn if frontend files detected (HAS_FRONTEND)
Agent(subagent_type="frontend-ui-developer", name="frontend-reviewer",
      team_name="review-pr-$PR_NUMBER",
      prompt="""Review frontend code for PR #$PR_NUMBER.

      ## Project Context
      ${PROJECT_CONTEXT}

      Scope: ONLY review the following changed files:
      ${CHANGED_FILES}
      Do NOT explore beyond these files.

      Check: SSR safety (no navigator/window outside hooks), button type attrs, a11y.
      When backend-reviewer shares API patterns, verify frontend matches.
      When security-reviewer flags XSS risks, validate and suggest fixes.

      Return findings as a JSON block (```json```) with prefixes A11Y/PERF/BUG.""")
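The spawn logic above depends on the HAS_BACKEND and HAS_FRONTEND flags, whose detection rules are not shown in this section. A minimal sketch of one possible heuristic follows; the file-extension sets are assumptions chosen for illustration, and `detect_domains` and `select_agents` are hypothetical names.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, List

# Assumed heuristics: classify a change by file extension only.
FRONTEND_SUFFIXES = {".tsx", ".jsx", ".css", ".scss", ".vue", ".svelte"}
BACKEND_SUFFIXES = {".py", ".go", ".rs", ".java", ".sql"}


def detect_domains(changed_files: Iterable[str]) -> Dict[str, bool]:
    """Derive domain flags from the changed file list."""
    suffixes = {PurePosixPath(path).suffix.lower() for path in changed_files}
    return {
        "HAS_BACKEND": bool(suffixes & BACKEND_SUFFIXES),
        "HAS_FRONTEND": bool(suffixes & FRONTEND_SUFFIXES),
    }


def select_agents(flags: Dict[str, bool]) -> List[str]:
    """Core reviewers always run; domain reviewers run only when their flag is set."""
    agents = ["quality-reviewer", "security-reviewer", "test-reviewer"]
    if flags.get("HAS_BACKEND"):
        agents.append("backend-reviewer")
    if flags.get("HAS_FRONTEND"):
        agents.append("frontend-reviewer")
    return agents
```

The list returned by `select_agents` is also the natural input for teardown, since only the agents that were spawned need a shutdown request.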
Team teardown after synthesis (only shut down agents that were actually spawned):
# After collecting all findings and producing the review

# Core agents: always shut down
SendMessage(type="shutdown_request", recipient="quality-reviewer", content="Review complete")
SendMessage(type="shutdown_request", recipient="security-reviewer", content="Review complete")
SendMessage(type="shutdown_request", recipient="test-reviewer", content="Review complete")

# Conditional agents: only shut down if spawned
# if HAS_BACKEND:
SendMessage(type="shutdown_request", recipient="backend-reviewer", content="Review complete")
# if HAS_FRONTEND:
SendMessage(type="shutdown_request", recipient="frontend-reviewer", content="Review complete")

TeamDelete()

# Worktree cleanup (CC 2.1.72)
ExitWorktree(action="keep")
Before spawning agents, load project-specific review context if it exists:
# Load project review context from memory (if available)
# This file contains project conventions, security patterns, and known weaknesses
# from prior reviews. Agents receive it as PROJECT_CONTEXT in their prompts.
PROJECT_CONTEXT = ""
try:
    Read("${MEMORY_DIR}/review-pr-context.md")  # ${MEMORY_DIR} = project memory path
    PROJECT_CONTEXT = "<result from read>"
except:
    PROJECT_CONTEXT = "No project-specific review context available."
The lead reviewer collects all agent JSON outputs, deduplicates by file+line+category (keeps highest severity), and persists critical/high findings to the memory graph.
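The deduplication rule (same file, line, and category collapse to the highest-severity finding) is small enough to sketch directly. The field names and the severity scale below are assumptions; only "critical" and "high" are named in the text. `dedupe_findings` and `findings_to_persist` are hypothetical names.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed severity scale; the text only names "critical" and "high".
SEVERITY_RANK = {"critical": 4, "high": 3, "medium": 2, "low": 1, "info": 0}


def _rank(finding: dict) -> int:
    return SEVERITY_RANK.get(finding.get("severity", "info"), 0)


def dedupe_findings(findings: Iterable[dict]) -> List[dict]:
    """Keep one finding per (file, line, category), preferring the highest severity."""
    best: Dict[Tuple[str, int, str], dict] = {}
    for finding in findings:
        key = (finding["file"], finding["line"], finding["category"])
        current = best.get(key)
        if current is None or _rank(finding) > _rank(current):
            best[key] = finding
    # Most severe first, so blockers lead the synthesized report.
    return sorted(best.values(), key=_rank, reverse=True)


def findings_to_persist(findings: Iterable[dict]) -> List[dict]:
    """Select the critical/high findings that go to the memory graph."""
    return [f for f in findings if f.get("severity") in {"critical", "high"}]
```

Two agents flagging the same line under different categories (for example SEC and PERF) are kept as separate findings, because the category is part of the key.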
# DOMAIN-AWARE AGENT SELECTION
# Only spawn agents relevant to detected domains.
# CHANGED_FILES and domain flags (HAS_FRONTEND, HAS_BACKEND, HAS_AI)
# are captured in Phase 1.

# ALWAYS spawn these 4 core agents:
# - code-quality-reviewer (readability)
# - code-quality-reviewer (type safety)
# - security-auditor
# - test-generator

# CONDITIONALLY spawn these based on domain:
# - backend-system-architect → only if HAS_BACKEND
# - frontend-ui-developer → only if HAS_FRONTEND
# - llm-integrator (7th) → only if HAS_AI

# PARALLEL - All agents in ONE message
Agent(
  description="Review code quality",
  subagent_type="code-quality-reviewer",
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  CODE QUALITY REVIEW

  ## Project Context
  ${PROJECT_CONTEXT}

  Review code readability and maintainability:
  1. Naming conventions and clarity
  2. Function/method complexity (cyclomatic < 10)
  3. DRY violations and code duplication
  4. SOLID principles adherence

  Do NOT explore beyond the changed files listed below. Focus your analysis on the diff.

  Return your findings as a JSON block (```json```) matching the structured output contract above.
  Use category prefix MAINT for maintainability findings.
  Use conventional comments (praise/suggestion/issue/nitpick).

  PR: $PR_NUMBER
  Scope: ONLY review the following changed files:
  ${CHANGED_FILES}
  """,
  run_in_background=True,
  max_turns=25
)
Agent(
  description="Review type safety",
  subagent_type="code-quality-reviewer",
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  TYPE SAFETY REVIEW

  ## Project Context
  ${PROJECT_CONTEXT}

  Review type safety and validation:
  1. TypeScript strict mode compliance
  2. Zod/Pydantic schema usage
  3. No `any` types or type assertions
  4. Exhaustive switch/union handling

  Do NOT explore beyond the changed files listed below. Focus your analysis on the diff.

  Return your findings as a JSON block (```json```) matching the structured output contract above.
  Use category prefix MAINT for type safety findings.
  Use conventional comments.

  PR: $PR_NUMBER
  Scope: ONLY review the following changed files:
  ${CHANGED_FILES}
  """,
  run_in_background=True,
  max_turns=25
)
Agent(
  description="Security audit PR",
  subagent_type="security-auditor",
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  SECURITY REVIEW

  ## Project Context
  ${PROJECT_CONTEXT}

  Security audit:
  1. Secrets/credentials in code
  2. Injection vulnerabilities (SQL, XSS)
  3. Authentication/authorization checks
  4. Dependency vulnerabilities
  5. Fail-closed auth patterns (reject when config missing)
  6. SSRF protection on user-controlled URLs
  7. Rate limiting on auth endpoints

  Do NOT explore beyond the changed files listed below. Focus your analysis on the diff.

  Return your findings as a JSON block (```json```) matching the structured output contract above.
  Use category prefix SEC for security findings.
  Use conventional comments.

  PR: $PR_NUMBER
  Scope: ONLY review the following changed files:
  ${CHANGED_FILES}
  """,
  run_in_background=True,
  max_turns=25
)
Agent(
  description="Review test adequacy",
  subagent_type="test-generator",
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  TEST ADEQUACY REVIEW

  Evaluate whether this PR has sufficient tests:

  1. TEST EXISTENCE CHECK
     - Does the PR add/modify code WITHOUT adding/updating tests?
     - Are there changed files with 0 corresponding test files?
     - Flag: "MISSING" if code changes have no tests at all

  2. TEST TYPE MATCHING (use testing-unit/testing-e2e/testing-integration rules)
     Match changed code to required test types:
     - API endpoint changes → need integration tests (rule: integration-api)
     - DB schema changes → need migration + integration tests (rule: integration-database)
     - UI component changes → need unit + a11y tests (rule: unit-aaa-pattern, a11y-testing)
     - Business logic → need unit + property tests (rule: verification-techniques)
     - LLM/AI changes → need eval tests (rule: llm-evaluation)

  3. TEST QUALITY
     - Meaningful assertions (not just truthy/exists)
     - Edge cases and error paths covered
     - No flaky patterns (timing, external deps, random)
     - Mocking is appropriate (not over-mocked)

  4. COVERAGE GAPS
     - Which changed functions/methods lack test coverage?
     - Which error paths are untested?

  ## Project Context
  ${PROJECT_CONTEXT}

  Do NOT explore beyond the changed files listed below. Focus your analysis on the diff.

  Return your findings as a JSON block (```json```) matching the structured output contract above.
  Use category prefix TEST for testing findings.
  Use conventional comments.

  PR: $PR_NUMBER
  Scope: ONLY review the following changed files:
  ${CHANGED_FILES}
  """,
  run_in_background=True,
  max_turns=25
)
Agent(
  description="Review backend code",
  subagent_type="backend-system-architect",
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  BACKEND REVIEW

  ## Project Context
  ${PROJECT_CONTEXT}

  Review backend code:
  1. API design and REST conventions
  2. Async/await patterns and error handling
  3. Database query efficiency (N+1)
  4. Transaction boundaries
  5. Redis connection lifecycle (close in try/finally)
  6. Webhook auth patterns (fail-closed)

  Do NOT explore beyond the changed files listed below. Focus your analysis on the diff.

  Return your findings as a JSON block (```json```) matching the structured output contract above.
  Use category prefixes: BUG (correctness), PERF (performance), MAINT (maintainability).
  Use conventional comments.

  PR: $PR_NUMBER
  Scope: ONLY review the following changed files:
  ${CHANGED_FILES}
  """,
  run_in_background=True,
  max_turns=25
)
Agent(
  description="Review frontend code",
  subagent_type="frontend-ui-developer",
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  FRONTEND REVIEW

  ## Project Context
  ${PROJECT_CONTEXT}

  Review frontend code:
  1. React 19 patterns (hooks, server components)
  2. State management correctness
  3. Accessibility (a11y) compliance: button type attrs, ARIA
  4. Performance (memoization, lazy loading)
  5. SSR safety: no navigator/window outside hooks/useEffect

  Do NOT explore beyond the changed files listed below. Focus your analysis on the diff.

  Return your findings as a JSON block (```json```) matching the structured output contract above.
  Use category prefixes: A11Y (accessibility), PERF (performance), BUG (correctness).
  Use conventional comments.

  PR: $PR_NUMBER
  Scope: ONLY review the following changed files:
  ${CHANGED_FILES}
  """,
  run_in_background=True,
  max_turns=25
)
Incorrect — Sequential agents:
# 6 reviewers run one-by-one (slow)
Agent(subagent_type="code-quality-reviewer", prompt="...")
# Wait for completion
Agent(subagent_type="security-auditor", prompt="...")
# Wait again...
Correct — Parallel agents:
# All 6 agents in ONE message (fast)
Agent(subagent_type="code-quality-reviewer", prompt="...", run_in_background=True)
Agent(subagent_type="security-auditor", prompt="...", run_in_background=True)
Agent(subagent_type="test-generator", prompt="...", run_in_background=True)
# All launch simultaneously
The Phase 8c verdict writeback script (scripts/verdict_writeback.py) handles this automatically when yg-mcp-core>=0.3.0 is installed. Use the manual pattern below when running an interactive review on a host that does NOT have yg-mcp-core (the script will skip cleanly in that case and you can fall back to direct memory MCP calls).
## 🔄 Changes Requested

Good progress, but a few items need addressing before merge.

### Must Fix
1. [blocker 1]
2. [blocker 2]

### Suggestions
- [optional improvements]

🤖 Reviewed with Claude Code (6 parallel agents)

praise: Excellent use of the repository pattern here - clean separation of concerns.
nitpick: Consider using a more descriptive variable name than `d` - maybe `data` or `response`.
suggestion: This loop could be replaced with a list comprehension for better readability.
issue: This SQL query is vulnerable to injection - use parameterized queries instead.
question: Is there a reason we're not using the existing `UserService` here?
cd backend
poetry run ruff format --check app/
poetry run ruff check app/
poetry run pytest tests/unit/ -v --tb=short
poetry run pytest tests/ -v --cov=app --cov-report=term-missing

# Detect real service testing capability
ls **/docker-compose*.yml 2>/dev/null
ls **/testcontainers* 2>/dev/null

# If detected, run integration tests against real services
docker-compose -f docker-compose.test.yml up -d
poetry run pytest tests/integration/ -v
docker-compose -f docker-compose.test.yml down
# List changed files without corresponding test files
gh pr diff "$ARGUMENTS" --name-only | while IFS= read -r f; do
  # Skip test files, configs, docs
  case "$f" in
    tests/*|*test*|*.md|*.json|*.yml) continue ;;
  esac
  # Check if a test file exists (assumes Python sources and a flat tests/ directory)
  test_file="tests/$(basename "$f" .py)_test.py"
  if [ ! -f "$test_file" ]; then
    echo "NO TEST: $f"
  fi
done