Reference

Runtime API

OpenAPI surface for runtime inspection and control.

The runtime API is served at /api/runtime/* on port 8090. It covers task submission, inspection, and the full trace view that powers the dashboard. The OpenAPI contract at /openapi/control-plane.json is authoritative — this page summarises the routes you'll reach for most.

Authentication

Operator session. The dashboard uses the koda_operator_session cookie. API calls from the same origin inherit it automatically.
Runtime token. RUNTIME_LOCAL_UI_TOKEN guards the dashboard-to-runtime path when the two sit on different origins.
Control-plane API token. CONTROL_PLANE_API_TOKEN (optional) lets CLI tooling authenticate without an operator session.

Health

GET /api/runtime/ready — readiness probe. Returns 200 with a health payload when the runtime is accepting tasks, 503 otherwise. Use it as the readiness check in systemd or Kubernetes.

bash

curl https://koda.example.com/api/runtime/ready
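A deploy script often needs to block until the probe turns green. The sketch below polls the endpoint until it answers 200. The KODA_URL variable, the function names, and the retry defaults are illustrative assumptions, not part of the runtime.

```shell
# Sketch only: KODA_URL and the retry defaults are illustrative assumptions.
KODA_URL="${KODA_URL:-https://koda.example.com}"

# Print the HTTP status code returned by the readiness probe.
# curl prints 000 when the host cannot be reached.
probe_ready() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    "$KODA_URL/api/runtime/ready"
}

# wait_ready [MAX_ATTEMPTS] [DELAY_SECONDS]
# Succeeds as soon as the probe answers 200; fails after MAX_ATTEMPTS tries.
wait_ready() {
  local max_attempts="${1:-30}" delay="${2:-2}" attempt=1 code
  while [ "$attempt" -le "$max_attempts" ]; do
    code="$(probe_ready || true)"
    if [ "$code" = "200" ]; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: wait_ready 30 2 && echo "runtime is accepting tasks"
```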

Agents

GET /api/runtime/agents — list all runnable agents with their current state (active, paused, draft).
GET /api/runtime/agents/:agent_id — full agent detail: model, Skills attached, memory scope, publication metadata.

Tasks

POST /api/runtime/tasks — submit a new task. Body: agent_id, query, optional metadata, and optional context overrides.
GET /api/runtime/tasks/:task_id — task status and final output.
GET /api/runtime/tasks/:task_id/trace — the full step trace (provider calls, tool calls, memory hits, retrieval results, audit).

Submitting a task

bash

curl -X POST https://koda.example.com/api/runtime/tasks \
  -H "Cookie: koda_operator_session=..." \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "atlas",
    "query": "Audit the auth service for CSP violations.",
    "metadata": { "source": "operator" }
  }'

Response shape

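Task endpoints return a JSON record with a stable id, the agent_id, created_at and updated_at timestamps, a status (one of queued, running, completed, failed, retrying, or paused), metadata, and, once the task completes, the final output. While the task is still running, output is null.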

json

{
  "id": "tsk_2f7a…",
  "agent_id": "atlas",
  "status": "running",
  "created_at": "2026-04-24T11:05:21Z",
  "updated_at": "2026-04-24T11:05:22Z",
  "metadata": { "source": "operator" },
  "output": null
}
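Since a task can still be queued or running when you first read it, a client typically polls GET /api/runtime/tasks/:task_id until the status settles. The sketch below is one way to do that. It assumes completed and failed are the only terminal statuses, and that KODA_URL and KODA_SESSION are environment variables you set yourself (both names are hypothetical). The sed extraction only suits a flat record like the one above; a real JSON parser is sturdier.

```shell
# Sketch only: KODA_URL and KODA_SESSION are hypothetical variable names.
KODA_URL="${KODA_URL:-https://koda.example.com}"

# Fetch the task record for TASK_ID using the operator session cookie.
fetch_task() {
  curl -sS -H "Cookie: koda_operator_session=${KODA_SESSION:-}" \
    "$KODA_URL/api/runtime/tasks/$1"
}

# Read a task record on stdin and print its "status" value.
# Naive text match: fine for the flat record shown, not for nested data.
task_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Assumption: only completed and failed end a task's lifecycle.
is_terminal() {
  case "$1" in
    completed|failed) return 0 ;;
    *) return 1 ;;
  esac
}

# wait_for_task TASK_ID [MAX_POLLS] [DELAY_SECONDS]
# Prints the terminal status and succeeds, or fails after MAX_POLLS polls.
wait_for_task() {
  local task_id="$1" max_polls="${2:-60}" delay="${3:-2}" polls=0 status
  while [ "$polls" -lt "$max_polls" ]; do
    status="$(fetch_task "$task_id" | task_status)"
    if is_terminal "$status"; then
      printf '%s\n' "$status"
      return 0
    fi
    polls=$((polls + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: wait_for_task "$TASK_ID" 60 2
```

Keep the delay generous: every poll counts against the operator rate bucket.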

Trace

The trace endpoint returns every step the runtime took, in order. The dashboard's trace view renders exactly this payload.

Provider calls — the full prompt + streamed response, with provider-specific metadata (model, tokens, latency).
Tool calls — parsed <agent_cmd>, tool result, status, any approval-loop record.
Memory hits — which memories were recalled, with score and type.
Retrieval hits — ranked knowledge chunks with their lexical / dense / graph ranks.
Audit events — every security.* record emitted during the task.
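For a quick look at the audit side of a trace without opening the dashboard, you can scan the raw payload for security.* names. The sketch below makes no claim about the trace schema: it assumes only that audit event names appear somewhere in the payload as quoted strings beginning with security., and it counts each distinct one. KODA_URL and KODA_SESSION are hypothetical variable names; consult the OpenAPI contract for the real field layout.

```shell
# Sketch only: KODA_URL and KODA_SESSION are hypothetical variable names.
KODA_URL="${KODA_URL:-https://koda.example.com}"

# Print the raw trace payload for TASK_ID.
fetch_trace() {
  curl -sS -H "Cookie: koda_operator_session=${KODA_SESSION:-}" \
    "$KODA_URL/api/runtime/tasks/$1/trace"
}

# Read a trace payload on stdin and print "COUNT NAME" for every distinct
# quoted string that starts with "security.". This is a text scan, not a
# JSON parse, so it works regardless of where the names are nested.
audit_event_names() {
  grep -o '"security\.[A-Za-z0-9_.]*"' | tr -d '"' | sort | uniq -c | sed 's/^ *//'
}

# Usage: fetch_trace "$TASK_ID" | audit_event_names
```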

Status codes

200 — success.
202 — task accepted, running asynchronously.
400 — malformed request body.
401 — missing or invalid session / token.
403 — authenticated but not allowed.
404 — no such agent or task.
429 — rate limited. Retry with backoff.
503 — runtime is not ready (check /api/runtime/ready).
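In a script, it helps to translate these codes into a decision once rather than scattering comparisons around. The sketch below maps each documented code to a suggested client action. The action labels and the choice to treat both 429 and 503 as retryable are assumptions made for illustration.

```shell
# classify_status CODE: print a suggested client action for a status code.
# The action labels are illustrative, not defined by the API.
classify_status() {
  case "$1" in
    200) echo "ok" ;;
    202) echo "accepted" ;;
    400) echo "fix-request" ;;
    401) echo "reauthenticate" ;;
    403) echo "forbidden" ;;
    404) echo "not-found" ;;
    429) echo "retry-with-backoff" ;;
    503) echo "wait-for-ready" ;;
    *)   echo "unexpected" ;;
  esac
}

# should_retry CODE: succeed only for codes where trying again can help.
should_retry() {
  case "$(classify_status "$1")" in
    retry-with-backoff|wait-for-ready) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   code="$(curl -s -o /dev/null -w '%{http_code}' "$url")"
#   if should_retry "$code"; then echo "retry later"; fi
```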

Rate limits

The runtime shares the general operator rate bucket with the control plane (120 requests/minute by default, tuneable via CONTROL_PLANE_RATE_LIMIT). Task submission additionally respects per-agent concurrency limits configured on the agent itself.
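At the default of 120 requests per minute, a client that averages one request every 0.5 seconds stays inside the bucket. When a 429 does arrive, exponential backoff is a common way to retry. The sketch below is one possible shape; the function names, the doubling schedule, the 30-second cap, and the KODA_SESSION variable are assumptions, not documented behaviour.

```shell
# backoff_delay ATTEMPT [BASE_SECONDS] [CAP_SECONDS]
# Print BASE * 2^(ATTEMPT-1), capped at CAP. Attempts start at 1.
backoff_delay() {
  local attempt="$1" base="${2:-1}" cap="${3:-30}" delay
  delay=$(( base * (1 << (attempt - 1)) ))
  if [ "$delay" -gt "$cap" ]; then
    delay="$cap"
  fi
  printf '%s\n' "$delay"
}

# Print only the HTTP status code of a GET request to URL.
# KODA_SESSION is a hypothetical variable holding the session cookie value.
http_code() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Cookie: koda_operator_session=${KODA_SESSION:-}" "$1"
}

# get_with_backoff URL [MAX_ATTEMPTS] [BASE_SECONDS]
# Retry while the server answers 429. Prints the last status code seen;
# fails if every attempt was rate limited.
get_with_backoff() {
  local url="$1" max_attempts="${2:-5}" base="${3:-1}" attempt=1 code=""
  while [ "$attempt" -le "$max_attempts" ]; do
    code="$(http_code "$url" || true)"
    if [ "$code" != "429" ]; then
      printf '%s\n' "$code"
      return 0
    fi
    sleep "$(backoff_delay "$attempt" "$base")"
    attempt=$((attempt + 1))
  done
  printf '%s\n' "$code"
  return 1
}

# Usage: get_with_backoff "https://koda.example.com/api/runtime/agents" 5 1
```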

Prefer the OpenAPI spec

For request and response shapes, authentication requirements, and full error payloads, treat /openapi/control-plane.json as the source of truth; it is regenerated on every release.

Next steps

Control-plane API — configuration routes for providers, agents, and access.
Runtime concepts — the lifecycle and internal services that back these endpoints.