kin: auto-commit after pipeline

Gros Frumos 2026-03-17 14:03:53 +02:00
parent 04cbbc563b
commit b6f40a6ace
9 changed files with 1690 additions and 16 deletions


@@ -0,0 +1,109 @@
You are a Department Head for the Kin multi-agent orchestrator.
Your job: receive a subtask from the Project Manager, plan the work for your department, and produce a structured sub-pipeline for your workers to execute.
## Input
You receive:
- PROJECT: id, name, tech stack
- TASK: id, title, brief
- DEPARTMENT: your department name and available workers
- HANDOFF FROM PREVIOUS DEPARTMENT: artifacts and context from prior work (if any)
- PREVIOUS STEP OUTPUT: may contain a handoff summary from a preceding department
## Your responsibilities
1. Analyze the task in context of your department's domain
2. Plan the work as a short pipeline (1-4 steps) using ONLY workers from your department
3. Define a clear, detailed brief for each worker — include what to build, where, and any constraints
4. Specify what artifacts your department will produce (files changed, endpoints, schemas)
5. Write handoff notes for the next department with enough detail for them to continue
## Department-specific guidance
### Backend department (backend_head)
- Plan API design before implementation: architect → backend_dev → tester → reviewer
- Specify endpoint contracts (method, path, request/response schemas) in worker briefs
- Include database schema changes in artifacts
- Ensure tester verifies API contracts, not just happy paths
### Frontend department (frontend_head)
- Reference backend API contracts from incoming handoff
- Plan component hierarchy: frontend_dev → tester → reviewer
- Include component file paths and prop interfaces in artifacts
- Verify UI matches acceptance criteria
### QA department (qa_head)
- Focus on end-to-end verification across departments
- Reference artifacts from all preceding departments
- Plan: tester (functional tests) → reviewer (code quality)
### Security department (security_head)
- Audit scope: OWASP top 10, auth, secrets, input validation
- Plan: security (audit) → reviewer (remediation verification)
- Include vulnerability severity in artifacts
### Infrastructure department (infra_head)
- Plan: sysadmin (investigate/configure) → debugger (if issues found) → reviewer
- Include service configs, ports, versions in artifacts
### Research department (research_head)
- Plan: tech_researcher (gather data) → architect (analysis/recommendations)
- Include API docs, limitations, integration notes in artifacts
### Marketing department (marketing_head)
- Plan: tech_researcher (market research) → spec (positioning/strategy)
- Include competitor analysis, target audience in artifacts
## Rules
- ONLY use workers listed under your department's worker list
- Keep the sub-pipeline SHORT: 1-4 steps maximum
- Always end with `tester` or `reviewer` if they are in your worker list
- Do NOT include other department heads (*_head roles) in sub_pipeline — only workers
- If previous department handoff is provided, acknowledge what was already done and build on it
- Do NOT duplicate work already completed by a previous department
- Write briefs that are self-contained — each worker should understand their task without external context
## Output format
Return ONLY valid JSON (no markdown, no explanation):
```json
{
"status": "done",
"sub_pipeline": [
{
"role": "backend_dev",
"model": "sonnet",
"brief": "Implement the feature as described in the task spec. Expose POST /api/feature endpoint."
},
{
"role": "tester",
"model": "sonnet",
"brief": "Write and run tests for the backend changes. Verify POST /api/feature works correctly."
}
],
"artifacts": {
"files_changed": ["core/models.py", "web/api.py"],
"endpoints_added": ["POST /api/feature"],
"schemas": [],
"notes": "Added feature with full test coverage. All tests pass."
},
"handoff_notes": "Backend implementation complete. Tests passing. Frontend needs to call POST /api/feature with {field: value} body."
}
```
Valid values for `status`: `"done"`, `"blocked"`.
If status is "blocked", include `"blocked_reason": "..."`.
## Blocked Protocol
If you cannot plan the work (task is ambiguous, unclear requirements, outside your department's scope, or missing critical information from previous steps), return:
```json
{"status": "blocked", "blocked_reason": "<clear explanation>", "blocked_at": "<ISO-8601 datetime>"}
```
Use current datetime for `blocked_at`. Do NOT guess — return blocked immediately.
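The output contract above can be checked mechanically before the sub-pipeline is executed. A minimal validator sketch (field names follow the format shown; `max_steps=4` mirrors the 1-4 step rule; the function itself is illustrative, not part of the codebase):

```python
import json

def validate_head_output(raw: str, workers: list[str], max_steps: int = 4) -> list[str]:
    """Return a list of contract violations (empty list means valid)."""
    errors = []
    try:
        out = json.loads(raw)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(out, dict):
        return ["output is not a JSON object"]
    status = out.get("status")
    if status not in ("done", "blocked"):
        errors.append(f"invalid status: {status!r}")
    if status == "blocked" and not out.get("blocked_reason"):
        errors.append("blocked status requires blocked_reason")
    if status == "done":
        steps = out.get("sub_pipeline")
        if not isinstance(steps, list) or not 1 <= len(steps) <= max_steps:
            errors.append(f"sub_pipeline must have 1-{max_steps} steps")
        else:
            for s in steps:
                role = s.get("role", "")
                # Department heads may never appear in a sub_pipeline.
                if role.endswith("_head"):
                    errors.append(f"department head {role!r} not allowed in sub_pipeline")
                elif role not in workers:
                    errors.append(f"unknown worker: {role!r}")
    return errors
```

Running this against the example output above with `workers=["backend_dev", "tester"]` yields an empty list.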


@@ -32,6 +32,31 @@ You receive:
- If a task is blocked or unclear, say so — don't guess.
- If `acceptance_criteria` is provided, include it in the brief for the last pipeline step (tester or reviewer) so they can verify the result against it. Do NOT use acceptance_criteria to describe current task state.
## Department routing
For **complex tasks** that span multiple domains, use department heads instead of direct specialists. Department heads (model=opus) plan their own internal sub-pipelines and coordinate their workers.
**Use department heads when:**
- Task requires 3+ specialists across different areas
- Work is clearly cross-domain (backend + frontend + QA, or security + QA, etc.)
- You want intelligent coordination within each domain
**Use direct specialists when:**
- Simple bug fix, hotfix, or single-domain task
- Research or audit tasks
- Pipeline would be 1-2 steps
**Available department heads:**
- `backend_head` — coordinates backend work (architect, backend_dev, tester, reviewer)
- `frontend_head` — coordinates frontend work (frontend_dev, tester, reviewer)
- `qa_head` — coordinates QA (tester, reviewer)
- `security_head` — coordinates security (security, reviewer)
- `infra_head` — coordinates infrastructure (sysadmin, debugger, reviewer)
- `research_head` — coordinates research (tech_researcher, architect)
- `marketing_head` — coordinates marketing (tech_researcher, spec)
Department heads always run with model=opus. Each department head receives the brief for its domain and automatically orchestrates its workers, with structured handoffs between departments.
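The routing rules above reduce to a small heuristic. A sketch, assuming the PM can estimate the set of domains a task touches and the pipeline length (the function name and signature are illustrative, not from the codebase):

```python
def choose_route(domains: set[str], estimated_steps: int) -> str:
    """Decide between department heads and direct specialists."""
    # Simple fixes, 1-2 step pipelines, and single-domain work go to specialists.
    if estimated_steps <= 2 or len(domains) <= 1:
        return "specialists"
    # Cross-domain work needing 3+ specialists gets department-head coordination.
    return "department_heads"
```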
## Project type routing
**If project_type == "operations":**


@@ -27,6 +27,14 @@ _EXTRA_PATH_DIRS = [
"/usr/local/sbin",
]
# Default timeouts per model (seconds). Override globally with KIN_AGENT_TIMEOUT
# or per role via timeout_seconds in specialists.yaml.
_MODEL_TIMEOUTS = {
"opus": 1800, # 30 min
"sonnet": 1200, # 20 min
"haiku": 600, # 10 min
}
def _build_claude_env() -> dict:
"""Return an env dict with an extended PATH that includes common CLI tool locations.
@@ -182,10 +190,22 @@ def run_agent(
if project_path.is_dir():
working_dir = str(project_path)
# Determine timeout: role-specific (specialists.yaml) > model-based > default
role_timeout = None
try:
from core.context_builder import _load_specialists
specs = _load_specialists().get("specialists", {})
role_spec = specs.get(role, {})
if role_spec.get("timeout_seconds"):
role_timeout = int(role_spec["timeout_seconds"])
except Exception:
pass
# Run claude subprocess
start = time.monotonic()
result = _run_claude(prompt, model=model, working_dir=working_dir,
- allow_write=allow_write, noninteractive=noninteractive)
+ allow_write=allow_write, noninteractive=noninteractive,
+ timeout=role_timeout)
duration = int(time.monotonic() - start)
# Parse output — ensure output_text is always a string for DB storage
@@ -247,7 +267,11 @@ def _run_claude(
is_noninteractive = noninteractive or os.environ.get("KIN_NONINTERACTIVE") == "1"
if timeout is None:
- timeout = int(os.environ.get("KIN_AGENT_TIMEOUT") or 600)
+ env_timeout = os.environ.get("KIN_AGENT_TIMEOUT")
+ if env_timeout:
+ timeout = int(env_timeout)
+ else:
+ timeout = _MODEL_TIMEOUTS.get(model, _MODEL_TIMEOUTS["sonnet"])
env = _build_claude_env()
try:
@@ -961,6 +985,187 @@ def _run_learning_extraction(
return {"added": added, "skipped": skipped}
# ---------------------------------------------------------------------------
# Department head detection
# ---------------------------------------------------------------------------
# Cache of roles with execution_type=department_head from specialists.yaml
_DEPT_HEAD_ROLES: set[str] | None = None
def _is_department_head(role: str) -> bool:
"""Check if a role is a department head.
Uses execution_type from specialists.yaml as primary check,
falls back to role.endswith('_head') convention.
"""
global _DEPT_HEAD_ROLES
if _DEPT_HEAD_ROLES is None:
try:
from core.context_builder import _load_specialists
specs = _load_specialists()
all_specs = specs.get("specialists", {})
_DEPT_HEAD_ROLES = {
name for name, spec in all_specs.items()
if spec.get("execution_type") == "department_head"
}
except Exception:
_DEPT_HEAD_ROLES = set()
return role in _DEPT_HEAD_ROLES or role.endswith("_head")
# ---------------------------------------------------------------------------
# Department head sub-pipeline execution
# ---------------------------------------------------------------------------
def _execute_department_head_step(
conn: sqlite3.Connection,
task_id: str,
project_id: str,
parent_pipeline_id: int | None,
step: dict,
dept_head_result: dict,
allow_write: bool = False,
noninteractive: bool = False,
next_department: str | None = None,
) -> dict:
"""Execute sub-pipeline planned by a department head.
Parses the dept head's JSON output, validates the sub_pipeline,
creates a child pipeline in DB, runs it, and saves a handoff record.
Returns dict with success, output, cost_usd, tokens_used, duration_seconds.
"""
raw = dept_head_result.get("raw_output") or dept_head_result.get("output") or ""
if isinstance(raw, (dict, list)):
raw = json.dumps(raw, ensure_ascii=False)
parsed = _try_parse_json(raw)
if not isinstance(parsed, dict):
return {
"success": False,
"output": "Department head returned non-JSON output",
"cost_usd": 0, "tokens_used": 0, "duration_seconds": 0,
}
# Blocked status from dept head
if parsed.get("status") == "blocked":
reason = parsed.get("blocked_reason", "Department head reported blocked")
return {
"success": False,
"output": json.dumps(parsed, ensure_ascii=False),
"blocked": True,
"blocked_reason": reason,
"cost_usd": 0, "tokens_used": 0, "duration_seconds": 0,
}
sub_pipeline = parsed.get("sub_pipeline", [])
if not isinstance(sub_pipeline, list) or not sub_pipeline:
return {
"success": False,
"output": "Department head returned empty or invalid sub_pipeline",
"cost_usd": 0, "tokens_used": 0, "duration_seconds": 0,
}
# Recursion guard: no department head roles allowed in sub_pipeline
for sub_step in sub_pipeline:
if isinstance(sub_step, dict) and _is_department_head(str(sub_step.get("role", ""))):
return {
"success": False,
"output": f"Recursion blocked: sub_pipeline contains _head role '{sub_step['role']}'",
"cost_usd": 0, "tokens_used": 0, "duration_seconds": 0,
}
role = step["role"]
dept_name = role.replace("_head", "")
# Create child pipeline in DB
child_pipeline = models.create_pipeline(
conn, task_id, project_id,
route_type="dept_sub",
steps=sub_pipeline,
parent_pipeline_id=parent_pipeline_id,
department=dept_name,
)
# Build initial context for workers: dept head's plan + artifacts
dept_plan_context = json.dumps({
"department_head_plan": {
"department": dept_name,
"artifacts": parsed.get("artifacts", {}),
"handoff_notes": parsed.get("handoff_notes", ""),
},
}, ensure_ascii=False)
# Run the sub-pipeline (noninteractive=True — Opus already reviewed the plan)
sub_result = run_pipeline(
conn, task_id, sub_pipeline,
dry_run=False,
allow_write=allow_write,
noninteractive=True,
initial_previous_output=dept_plan_context,
)
# Extract decisions from sub-pipeline results for handoff
decisions_made = []
sub_results = sub_result.get("results", [])
for sr in sub_results:
output = sr.get("output") or sr.get("raw_output") or ""
if isinstance(output, str):
try:
output = json.loads(output)
except (json.JSONDecodeError, ValueError):
pass
if isinstance(output, dict):
# Reviewer/tester may include decisions or findings
for key in ("decisions", "findings", "recommendations"):
val = output.get(key)
if isinstance(val, list):
decisions_made.extend(val)
elif isinstance(val, str) and val:
decisions_made.append(val)
# Determine last worker role for auto_complete tracking
last_sub_role = sub_pipeline[-1].get("role", "") if sub_pipeline else ""
# Save handoff for inter-department context
handoff_status = "done" if sub_result.get("success") else "partial"
try:
models.create_handoff(
conn,
pipeline_id=parent_pipeline_id or child_pipeline["id"],
task_id=task_id,
from_department=dept_name,
to_department=next_department,
artifacts=parsed.get("artifacts", {}),
decisions_made=decisions_made,
blockers=[],
status=handoff_status,
)
except Exception:
pass # Handoff save errors must never block pipeline
# Build summary output for the next pipeline step
summary = {
"from_department": dept_name,
"handoff_notes": parsed.get("handoff_notes", ""),
"artifacts": parsed.get("artifacts", {}),
"sub_pipeline_summary": {
"steps_completed": sub_result.get("steps_completed", 0),
"success": sub_result.get("success", False),
},
}
return {
"success": sub_result.get("success", False),
"output": json.dumps(summary, ensure_ascii=False),
"cost_usd": sub_result.get("total_cost_usd", 0),
"tokens_used": sub_result.get("total_tokens", 0),
"duration_seconds": sub_result.get("total_duration_seconds", 0),
"last_sub_role": last_sub_role,
}
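`_try_parse_json` is called above but is not part of this diff. A minimal stand-in, assuming it tolerates JSON wrapped in prose (the brace-span fallback here is an illustrative guess, not the project's actual implementation):

```python
import json

def try_parse_json(raw: str):
    """Strict parse first; fall back to the outermost {...} span; None on failure."""
    try:
        return json.loads(raw)
    except (json.JSONDecodeError, ValueError):
        pass
    # Models often wrap their JSON in prose; try the outermost brace span.
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(raw[start:end + 1])
        except (json.JSONDecodeError, ValueError):
            pass
    return None
```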
# ---------------------------------------------------------------------------
# Pipeline executor
# ---------------------------------------------------------------------------
@@ -972,6 +1177,7 @@ def run_pipeline(
dry_run: bool = False,
allow_write: bool = False,
noninteractive: bool = False,
initial_previous_output: str | None = None,
) -> dict:
"""Execute a multi-step pipeline of agents.
@@ -980,6 +1186,9 @@
{"role": "tester", "depends_on": "debugger", "brief": "..."},
]
initial_previous_output: context injected as previous_output for the first step
(used by dept head sub-pipelines to pass artifacts/plan to workers).
Returns {success, steps_completed, total_cost, total_tokens, total_duration, results}
"""
# Auth check — skip for dry_run (dry_run never calls claude CLI)
@@ -1020,7 +1229,8 @@
total_cost = 0.0
total_tokens = 0
total_duration = 0
- previous_output = None
+ previous_output = initial_previous_output
+ _last_sub_role = None # Track last worker role from dept sub-pipelines (for auto_complete)
for i, step in enumerate(steps):
role = step["role"]
@@ -1283,6 +1493,62 @@
except Exception:
pass # Never block pipeline on decomposer save errors
# Department head: execute sub-pipeline planned by the dept head
if _is_department_head(role) and result["success"] and not dry_run:
# Determine next department for handoff routing
_next_dept = None
if i + 1 < len(steps):
_next_role = steps[i + 1].get("role", "")
if _is_department_head(_next_role):
_next_dept = _next_role.replace("_head", "")
dept_result = _execute_department_head_step(
conn, task_id, project_id,
parent_pipeline_id=pipeline["id"] if pipeline else None,
step=step,
dept_head_result=result,
allow_write=allow_write,
noninteractive=noninteractive,
next_department=_next_dept,
)
# Accumulate sub-pipeline costs
total_cost += dept_result.get("cost_usd") or 0
total_tokens += dept_result.get("tokens_used") or 0
total_duration += dept_result.get("duration_seconds") or 0
if not dept_result.get("success"):
# Sub-pipeline failed — handle as blocked
results.append({"role": role, "_dept_sub": True, **dept_result})
if pipeline:
models.update_pipeline(
conn, pipeline["id"],
status="failed",
total_cost_usd=total_cost,
total_tokens=total_tokens,
total_duration_seconds=total_duration,
)
error_msg = f"Department {role} sub-pipeline failed"
models.update_task(conn, task_id, status="blocked", blocked_reason=error_msg)
return {
"success": False,
"error": error_msg,
"steps_completed": i,
"results": results,
"total_cost_usd": total_cost,
"total_tokens": total_tokens,
"total_duration_seconds": total_duration,
"pipeline_id": pipeline["id"] if pipeline else None,
}
# Track last worker role from sub-pipeline for auto_complete eligibility
if dept_result.get("last_sub_role"):
_last_sub_role = dept_result["last_sub_role"]
# Override previous_output with dept handoff summary (not raw dept head JSON)
previous_output = dept_result.get("output")
if isinstance(previous_output, (dict, list)):
previous_output = json.dumps(previous_output, ensure_ascii=False)
continue
# Project-level auto-test: run `make test` after backend_dev/frontend_dev steps.
# Enabled per project via auto_test_enabled flag (opt-in).
# On failure, loop fixer up to KIN_AUTO_TEST_MAX_ATTEMPTS times, then block.
@@ -1433,7 +1699,9 @@
changed_files = _get_changed_files(str(p_path))
last_role = steps[-1].get("role", "") if steps else ""
- auto_eligible = last_role in {"tester", "reviewer"}
+ # For dept pipelines: if last step is a _head, check the last worker in its sub-pipeline
+ effective_last_role = _last_sub_role if (_is_department_head(last_role) and _last_sub_role) else last_role
+ auto_eligible = effective_last_role in {"tester", "reviewer"}
# Guard: re-fetch current status — user may have manually changed it while pipeline ran
current_task = models.get_task(conn, task_id)


@@ -151,6 +151,126 @@ specialists:
output_schema:
tasks: "array of { title, brief, priority, category, acceptance_criteria }"
# Department heads — Opus-level coordinators that plan work within their department
# and spawn internal sub-pipelines of Sonnet workers.
backend_head:
name: "Backend Department Head"
model: opus
execution_type: department_head
department: backend
tools: [Read, Grep, Glob]
description: "Plans backend work, coordinates architect/backend_dev/tester within backend department"
permissions: read_only
context_rules:
decisions: all
modules: all
frontend_head:
name: "Frontend Department Head"
model: opus
execution_type: department_head
department: frontend
tools: [Read, Grep, Glob]
description: "Plans frontend work, coordinates frontend_dev/tester within frontend department"
permissions: read_only
context_rules:
decisions: all
modules: all
qa_head:
name: "QA Department Head"
model: opus
execution_type: department_head
department: qa
tools: [Read, Grep, Glob]
description: "Plans QA work, coordinates tester/reviewer within QA department"
permissions: read_only
context_rules:
decisions: all
security_head:
name: "Security Department Head"
model: opus
execution_type: department_head
department: security
tools: [Read, Grep, Glob]
description: "Plans security work, coordinates security engineer within security department"
permissions: read_only
context_rules:
decisions_category: security
infra_head:
name: "Infrastructure Department Head"
model: opus
execution_type: department_head
department: infra
tools: [Read, Grep, Glob]
description: "Plans infrastructure work, coordinates sysadmin/debugger within infra department"
permissions: read_only
context_rules:
decisions: all
research_head:
name: "Research Department Head"
model: opus
execution_type: department_head
department: research
tools: [Read, Grep, Glob]
description: "Plans research work, coordinates tech_researcher/architect within research department"
permissions: read_only
context_rules:
decisions: all
marketing_head:
name: "Marketing Department Head"
model: opus
execution_type: department_head
department: marketing
tools: [Read, Grep, Glob]
description: "Plans marketing work, coordinates tech_researcher/spec within marketing department"
permissions: read_only
context_rules:
decisions: all
modules: all
# Departments — PM uses these when routing complex cross-domain tasks to department heads
departments:
backend:
head: backend_head
workers: [architect, backend_dev, tester, reviewer]
description: "Backend development: API, database, business logic"
frontend:
head: frontend_head
workers: [frontend_dev, tester, reviewer]
description: "Frontend development: Vue, CSS, components, composables"
qa:
head: qa_head
workers: [tester, reviewer]
description: "Quality assurance: testing and code review"
security:
head: security_head
workers: [security, reviewer]
description: "Security: OWASP audit, vulnerability analysis, remediation"
infra:
head: infra_head
workers: [sysadmin, debugger, reviewer]
description: "Infrastructure: DevOps, deployment, server management"
research:
head: research_head
workers: [tech_researcher, architect]
description: "Technical research and architecture planning"
marketing:
head: marketing_head
workers: [tech_researcher, spec]
description: "Marketing: market research, positioning, content strategy, SEO"
# Route templates — PM uses these to build pipelines
routes:
debug:
@@ -188,3 +308,27 @@ routes:
spec_driven:
steps: [constitution, spec, architect, task_decomposer]
description: "Constitution → spec → implementation plan → decompose into tasks"
dept_feature:
steps: [backend_head, frontend_head, qa_head]
description: "Full-stack feature: backend dept → frontend dept → QA dept"
dept_fullstack:
steps: [backend_head, frontend_head]
description: "Full-stack feature without dedicated QA pass"
dept_security_audit:
steps: [security_head, qa_head]
description: "Security audit followed by QA verification"
dept_backend:
steps: [backend_head]
description: "Backend-only task routed through department head"
dept_frontend:
steps: [frontend_head]
description: "Frontend-only task routed through department head"
dept_marketing:
steps: [marketing_head]
description: "Marketing task routed through department head"
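The `departments` mapping above pairs each head with its worker list, and the executor's recursion guard assumes no worker is itself a head. A small consistency check one could run after loading the YAML (a sketch over a dict of the same shape; not the project's actual validator):

```python
def validate_departments(departments: dict) -> list[str]:
    """Check that every head follows *_head and that no worker is itself a head."""
    errors = []
    for name, dept in departments.items():
        head = dept.get("head", "")
        if not head.endswith("_head"):
            errors.append(f"{name}: head {head!r} does not follow *_head convention")
        for worker in dept.get("workers", []):
            # Mirrors the executor's recursion guard on sub-pipelines.
            if worker.endswith("_head"):
                errors.append(f"{name}: worker {worker!r} is a department head")
    return errors
```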