Runtime API
OpenAPI surface for runtime inspection and control.
The runtime API is served at /api/runtime/* on port 8090. It covers task submission, inspection, and the full trace view that powers the dashboard. The OpenAPI contract at /openapi/control-plane.json is authoritative — this page summarises the routes you'll reach for most.
Authentication
- Operator session. The dashboard uses the koda_operator_session cookie. API calls from the same origin inherit it automatically.
- Runtime token. RUNTIME_LOCAL_UI_TOKEN guards the dashboard-to-runtime path when the two sit on different origins.
- Control-plane API token. CONTROL_PLANE_API_TOKEN (optional) lets CLI tooling authenticate without an operator session.
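As a rough illustration of the modes above, the sketch below builds request headers for either an operator session or a token. Only the koda_operator_session cookie name comes from this page. The helper name build_auth_headers is hypothetical, and the header that carries a token is not documented here, so the sketch takes it as a parameter rather than guessing; check /openapi/control-plane.json for the real scheme.

```python
from typing import Dict, Optional, Tuple


def build_auth_headers(
    session_cookie: Optional[str] = None,
    token_header: Optional[Tuple[str, str]] = None,
) -> Dict[str, str]:
    """Return request headers for one of the authentication modes.

    session_cookie: value of the koda_operator_session cookie.
    token_header:   (header name, token value). The header name is an
                    assumption the caller must supply; this page does not
                    say how tokens are transmitted.
    """
    if session_cookie is None and token_header is None:
        raise ValueError("provide a session cookie or a token header")

    headers = {"Accept": "application/json"}
    if session_cookie is not None:
        # Same cookie the dashboard uses for same-origin calls.
        headers["Cookie"] = f"koda_operator_session={session_cookie}"
    if token_header is not None:
        name, value = token_header
        headers[name] = value
    return headers
```

Keeping the token header as a caller-supplied pair means the helper does not bake in a guess that the OpenAPI contract might contradict.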
Health
GET /api/runtime/ready is the readiness probe. It returns 200 with a health payload when the runtime is accepting tasks and 503 otherwise. Use it as the health check in systemd or Kubernetes.
    curl https://koda.example.com/api/runtime/ready

Agents
- GET /api/runtime/agents lists all runnable agents with their current state (active, paused, draft).
- GET /api/runtime/agents/:agent_id returns the full agent detail: model, Skills attached, memory scope, publication metadata.
Tasks
- POST /api/runtime/tasks submits a new task. Body: agent id, query, optional metadata and context overrides.
- GET /api/runtime/tasks/:task_id returns task status and final output.
- GET /api/runtime/tasks/:task_id/trace returns the full step trace (provider calls, tool calls, memory hits, retrieval results, audit).
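The routes above combine naturally into a submit-then-poll loop. The following is a minimal sketch using only the Python standard library. It assumes the POST response carries the task record with its id, and that completed and failed are the only terminal statuses; neither is stated on this page. The function names and the polling defaults are illustrative, not part of the API.

```python
import json
import time
import urllib.request
from typing import Callable, Dict

BASE_URL = "https://koda.example.com"  # example host used on this page
TERMINAL_STATUSES = {"completed", "failed"}  # assumption: other statuses can still change


def build_submit_request(agent_id: str, query: str,
                         headers: Dict[str, str]) -> urllib.request.Request:
    """Build (but do not send) the POST /api/runtime/tasks request."""
    body = json.dumps({"agent_id": agent_id, "query": query}).encode("utf-8")
    all_headers = {"Content-Type": "application/json", **headers}
    return urllib.request.Request(
        f"{BASE_URL}/api/runtime/tasks",
        data=body, headers=all_headers, method="POST",
    )


def fetch_task(task_id: str, headers: Dict[str, str]) -> dict:
    """GET /api/runtime/tasks/:task_id and decode the JSON record."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/runtime/tasks/{task_id}", headers=headers)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def wait_for_task(task_id: str, fetch: Callable[[str], dict],
                  interval: float = 2.0, max_polls: int = 150,
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll fetch(task_id) until the record reaches a terminal status."""
    for _ in range(max_polls):
        record = fetch(task_id)
        if record["status"] in TERMINAL_STATUSES:
            return record
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

In use, the request from build_submit_request would be sent with urllib.request.urlopen, and wait_for_task would be called with something like `lambda tid: fetch_task(tid, headers)`. Passing fetch and sleep in as arguments keeps the loop testable without a live runtime.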
Submitting a task
    curl -X POST https://koda.example.com/api/runtime/tasks \
      -H "Cookie: koda_operator_session=..." \
      -H "Content-Type: application/json" \
      -d '{
        "agent_id": "atlas",
        "query": "Audit the auth service for CSP violations.",
        "metadata": { "source": "operator" }
      }'

Response shape
Every task returns a JSON record with a stable id, a status (queued, running, completed, failed, retrying, paused), metadata, and — once complete — the final output.
    {
      "id": "tsk_2f7a…",
      "agent_id": "atlas",
      "status": "running",
      "created_at": "2026-04-24T11:05:21Z",
      "updated_at": "2026-04-24T11:05:22Z",
      "metadata": { "source": "operator" },
      "output": null
    }

Trace
The trace endpoint returns every step the runtime took, in order. The dashboard's trace view renders exactly this payload.
- Provider calls: the full prompt + streamed response, with provider-specific metadata (model, tokens, latency).
- Tool calls: parsed <agent_cmd>, tool result, status, any approval-loop record.
- Memory hits: which memories were recalled, with score and type.
- Retrieval hits: ranked knowledge chunks with their lexical / dense / graph ranks.
- Audit events: every security.* record emitted during the task.
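A quick way to get oriented in a long trace is to tally its steps by kind. The sketch below assumes the trace decodes to a list of step objects, each with a field naming its kind. This page does not document the payload schema, so the field name "kind" and the sample values are hypothetical; take the real names from /openapi/control-plane.json.

```python
from collections import Counter
from typing import Iterable, Mapping


def summarise_trace(steps: Iterable[Mapping], kind_field: str = "kind") -> Counter:
    """Count trace steps per kind.

    kind_field is a hypothetical field name; steps that lack it are
    counted under "unknown" instead of raising.
    """
    return Counter(step.get(kind_field, "unknown") for step in steps)


# Hand-written sample data, not real runtime output.
sample_steps = [
    {"kind": "provider_call"},
    {"kind": "tool_call"},
    {"kind": "provider_call"},
    {"note": "step without a kind field"},
]
summary = summarise_trace(sample_steps)
```

Here summary["provider_call"] is 2 and summary["unknown"] is 1.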
Status codes
- 200: success.
- 202: task accepted, running asynchronously.
- 400: malformed request body.
- 401: missing or invalid session / token.
- 403: authenticated but not allowed.
- 404: no such agent or task.
- 429: rate limited. Retry with backoff.
- 503: runtime is not ready (check /api/runtime/ready).
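The "retry with backoff" advice for 429 can be sketched as exponential backoff with jitter. Treating 503 as retryable too is an assumption of this sketch, as are the delay values and the helper name call_with_backoff. The send callable stands in for whatever HTTP client is in use and returns a (status, body) pair.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE = {429, 503}  # 429 per this page; retrying 503 is an assumption


def call_with_backoff(send: Callable[[], Tuple[int, object]],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      max_delay: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, object]:
    """Call send() until it returns a non-retryable status or attempts run out.

    Returns the last (status, body) pair in either case.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            # Delay doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter of up to 50% spreads out clients that retry together.
            sleep(delay + random.uniform(0, delay / 2))
    return status, body
```

Errors such as 400, 401, 403 and 404 are returned immediately, since repeating the same request would not change the outcome.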
Rate limits
The runtime shares the general operator rate bucket with the control plane (120 requests/minute by default, tuneable via CONTROL_PLANE_RATE_LIMIT). Task submission additionally respects per-agent concurrency limits configured on the agent itself.
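A client that issues many requests can pace itself to stay under the default budget instead of waiting for 429 responses. The sketch below is a simple sliding-window limiter. The class name and structure are illustrative; only the 120 requests/minute default comes from this page. Because the bucket is shared with other callers, client-side pacing lowers the chance of a 429 but cannot rule it out.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_calls calls within any period-second window."""

    def __init__(self, max_calls: int = 120, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._max_calls = max_calls
        self._period = period
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # timestamps of recent calls, oldest first

    def acquire(self) -> None:
        """Block until another call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self._period:
                self._stamps.popleft()
            if len(self._stamps) < self._max_calls:
                self._stamps.append(now)
                return
            # Window is full: wait until the oldest call expires.
            self._sleep(self._period - (now - self._stamps[0]))
```

Calling acquire() before each request is enough. If CONTROL_PLANE_RATE_LIMIT has been changed on the server, max_calls should be set to match.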
/openapi/control-plane.json is the source of truth; it is regenerated on every release.

Next steps
- Control-plane API — configuration routes for providers, agents, and access.
- Runtime concepts — the lifecycle and internal services that back these endpoints.