
Runtime API

OpenAPI surface for runtime inspection and control.

The runtime API is served at /api/runtime/* on port 8090. It covers task submission, inspection, and the full trace view that powers the dashboard. The OpenAPI contract at /openapi/control-plane.json is authoritative — this page summarises the routes you'll reach for most.

Authentication

  • Operator session. The dashboard uses the koda_operator_session cookie. API calls from the same origin inherit it automatically.
  • Runtime token. RUNTIME_LOCAL_UI_TOKEN guards the dashboard-to-runtime path when the two sit on different origins.
  • Control-plane API token. CONTROL_PLANE_API_TOKEN (optional) lets CLI tooling authenticate without an operator session.

Health

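GET /api/runtime/ready is the readiness probe. It returns 200 with a health payload when the runtime is accepting tasks, and 503 otherwise. Use it as the readiness check in Kubernetes or in a systemd health script. It reports whether the runtime can take work, not merely whether the process is alive, so wiring it to a liveness probe would restart a runtime that is only temporarily unable to accept tasks.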

bash
curl https://koda.example.com/api/runtime/ready
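
A deploy script often needs to block until the probe turns green. The following is a minimal sketch of that loop, not part of Koda itself: wait_until_ready is a hypothetical helper name, and the attempt count and delay are arbitrary defaults.

```shell
# Poll the readiness probe until it returns 200 or the attempts run out.
# Arguments: base URL, max attempts (default 30), delay in seconds (default 2).
# wait_until_ready is an illustrative helper, not a Koda command.
wait_until_ready() {
  local base_url="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  local code
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
    # A connection failure is treated like "not ready" instead of aborting.
    code=$(curl -s -o /dev/null -w '%{http_code}' "$base_url/api/runtime/ready") || code="000"
    if [ "$code" = "200" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example:
#   wait_until_ready https://koda.example.com 30 2 && echo "runtime ready"
```

The function returns 0 as soon as the probe answers 200 and 1 if it never does, so it composes with && and || in a deploy script.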

Agents

  • GET /api/runtime/agents — list all runnable agents with their current state (active, paused, draft).
  • GET /api/runtime/agents/:agent_id — full agent detail: model, Skills attached, memory scope, publication metadata.

Tasks

  • POST /api/runtime/tasks — submit a new task. Body: agent id, query, optional metadata and context overrides.
  • GET /api/runtime/tasks/:task_id — task status and final output.
  • GET /api/runtime/tasks/:task_id/trace — the full step trace (provider calls, tool calls, memory hits, retrieval results, audit).

Submitting a task

bash
curl -X POST https://koda.example.com/api/runtime/tasks \
  -H "Cookie: koda_operator_session=..." \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "atlas",
    "query": "Audit the auth service for CSP violations.",
    "metadata": { "source": "operator" }
  }'
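
When the query comes from a variable rather than a literal, the body has to be escaped before it is embedded in JSON. The sketch below shows one way to do that in plain shell. The helper names (json_escape, build_task_body, submit_task) and the KODA_URL and KODA_SESSION variables are hypothetical, and only the agent_id, query, and metadata fields shown in the example above are used.

```shell
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# This sketch does not handle newlines or other control characters.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the request body for POST /api/runtime/tasks.
# Arguments: agent id, query, metadata source (default "operator").
build_task_body() {
  printf '{"agent_id":"%s","query":"%s","metadata":{"source":"%s"}}' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "${3:-operator}")"
}

# Submit a task and print the response body.
# KODA_URL and KODA_SESSION are assumed variable names, not Koda settings.
submit_task() {
  curl -sS -X POST "$KODA_URL/api/runtime/tasks" \
    -H "Cookie: koda_operator_session=$KODA_SESSION" \
    -H "Content-Type: application/json" \
    -d "$(build_task_body "$1" "$2")"
}

# Example:
#   KODA_URL=https://koda.example.com KODA_SESSION=... \
#     submit_task atlas 'Audit the auth service for CSP violations.'
```

For anything beyond simple strings, a JSON-aware tool such as jq is the safer way to build the body.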

Response shape

Every task is returned as a JSON record with a stable id, a status (one of queued, running, completed, failed, retrying, or paused), metadata, and, once the task completes, the final output.

json
{
  "id": "tsk_2f7a…",
  "agent_id": "atlas",
  "status": "running",
  "created_at": "2026-04-24T11:05:21Z",
  "updated_at": "2026-04-24T11:05:22Z",
  "metadata": { "source": "operator" },
  "output": null
}
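
Because output stays null until the task finishes, a client typically polls GET /api/runtime/tasks/:task_id and watches the status field. The following sketch shows one way to do that. It assumes completed and failed are the only terminal states, which this page does not state explicitly. The helper names and the KODA_URL and KODA_SESSION variables are hypothetical.

```shell
# Pull the "status" value out of a task record.
# A sed one-liner keeps the sketch dependency-free; jq is the sturdier choice.
task_status() {
  printf '%s\n' "$1" |
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    head -n 1
}

# Assumption: completed and failed are the only terminal states.
is_terminal() {
  case "$1" in
    completed|failed) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll the task until it reaches a terminal state, then print the record.
# Arguments: task id, delay in seconds (default 2), max attempts (default 60).
wait_for_task() {
  local task_id="$1"
  local delay="${2:-2}"
  local attempts="${3:-60}"
  local i=1
  local record
  local status
  while [ "$i" -le "$attempts" ]; do
    record=$(curl -sS -H "Cookie: koda_operator_session=$KODA_SESSION" \
      "$KODA_URL/api/runtime/tasks/$task_id") || return 1
    status=$(task_status "$record")
    if is_terminal "$status"; then
      printf '%s\n' "$record"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

Whether paused should also stop the loop depends on how the caller wants to treat tasks waiting on an operator, so adjust is_terminal to match.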

Trace

The trace endpoint returns every step the runtime took, in order. The dashboard's trace view renders exactly this payload.

  • Provider calls — the full prompt + streamed response, with provider-specific metadata (model, tokens, latency).
  • Tool calls — parsed <agent_cmd>, tool result, status, any approval-loop record.
  • Memory hits — which memories were recalled, with score and type.
  • Retrieval hits — ranked knowledge chunks with their lexical / dense / graph ranks.
  • Audit events — every security.* record emitted during the task.

Status codes

  • 200 — success.
  • 202 — task accepted, running asynchronously.
  • 400 — malformed request body.
  • 401 — missing or invalid session / token.
  • 403 — authenticated but not allowed.
  • 404 — no such agent or task.
  • 429 — rate limited. Retry with backoff.
  • 503 — runtime is not ready (check /api/runtime/ready).
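
In a script, these codes usually collapse into a handful of reactions. The mapping below is one possible reading of the list above, not behaviour defined by Koda. next_action and its labels are hypothetical, and treating 503 as retryable assumes the runtime will become ready again.

```shell
# Map an HTTP status code from the runtime API to a coarse next step.
# The labels are illustrative; they are not values returned by the API.
next_action() {
  case "$1" in
    200|202) echo "ok" ;;           # success, or task accepted and running
    429|503) echo "retry" ;;        # rate limited, or runtime not ready yet
    401|403) echo "check-auth" ;;   # bad credentials or insufficient rights
    400|404) echo "fix-request" ;;  # malformed body, or unknown agent or task
    *)       echo "unknown" ;;
  esac
}

# Example: capture the status code of a request and branch on it.
#   code=$(curl -s -o /dev/null -w '%{http_code}' \
#     https://koda.example.com/api/runtime/agents)
#   [ "$(next_action "$code")" = "ok" ] || echo "request failed with $code"
```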

Rate limits

The runtime shares the general operator rate bucket with the control plane (120 requests/minute by default, tuneable via CONTROL_PLANE_RATE_LIMIT). Task submission additionally respects per-agent concurrency limits configured on the agent itself.
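
At the default of 120 requests/minute, a client that spaces its calls 0.5 seconds apart stays inside the bucket on average. When a 429 does arrive, exponential backoff is a common way to retry. The sketch below shows it under stated assumptions: backoff_delay and get_with_retry are hypothetical helpers, the 1-second base and 30-second cap are arbitrary, and the code does not look for a Retry-After header because this page does not say whether the runtime sends one.

```shell
# Exponential backoff: base * 2^(attempt-1) seconds, capped.
# Arguments: 1-based attempt, base seconds (default 1), cap seconds (default 30).
backoff_delay() {
  local attempt="$1"
  local base="${2:-1}"
  local cap="${3:-30}"
  local delay="$base"
  local i=1
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$cap" ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt "$cap" ]; then
    delay="$cap"
  fi
  echo "$delay"
}

# GET a URL, retrying on 429 with backoff. Prints the final status code.
# Arguments: URL, max attempts (default 5).
get_with_retry() {
  local url="$1"
  local max="${2:-5}"
  local attempt=1
  local code
  while [ "$attempt" -le "$max" ]; do
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url") || code="000"
    if [ "$code" != "429" ]; then
      echo "$code"
      return 0
    fi
    sleep "$(backoff_delay "$attempt")"
    attempt=$((attempt + 1))
  done
  echo "429"
  return 1
}
```

With the defaults, successive waits are 1, 2, 4, 8, and 16 seconds, and any later attempt waits 30.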

Prefer the OpenAPI spec
This page summarises the endpoints you'll reach for most. For request/response shapes, authentication requirements, and full error payloads, /openapi/control-plane.json is the source of truth — it's regenerated on every release.
