Platform capabilities

Squads, RunGraph, memory governance, proposals, evals, and channels.

Koda is no longer just an agent editor plus a queue. The current platform includes squad coordination, RunGraph evidence, governed memory, skill packages, proposal review, deterministic evals, channel identity, and operator quality surfaces. This page is the public map of those capabilities and their limits.

Agents and squads

Agents still run as individually configured workers, but Koda also has persistent Squad Rooms for multi-agent collaboration. Rooms keep the transcript visible, record route explanations, track reply obligations, and hold final synthesis until open work is resolved.

Explicit mentions, replies, coordinator routing, capability scores, and deterministic fallback decide who should answer.
Handoffs are transcript-visible handoff_event.v1 system events, not invisible prompt tricks.
The synthesis gate waits until every task, child run, reply obligation, and handoff has either finished or declared a terminal timeout or failure.
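
The routing and gating rules above can be sketched roughly as follows. This is an illustrative sketch, not Koda's implementation: the function names, the state names, and the assumption that the routing signals are tried in the order they are listed are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumed state names; the text only says work must finish or
# declare a terminal timeout or failure.
TERMINAL_STATES = {"done", "timeout", "failed"}


@dataclass
class WorkItem:
    kind: str   # "task", "child_run", "reply_obligation", or "handoff"
    state: str  # hypothetical values: "open", "done", "timeout", "failed"


def synthesis_allowed(items: List[WorkItem]) -> bool:
    """Return True only when no tracked work item is still open."""
    return all(item.state in TERMINAL_STATES for item in items)


def pick_responder(
    roster: List[str],
    mentioned: Optional[str] = None,
    reply_to: Optional[str] = None,
    coordinator_choice: Optional[str] = None,
    scores: Optional[Dict[str, float]] = None,
) -> Tuple[str, str]:
    """Return (agent, reason) so the route explanation can be recorded."""
    if not roster:
        raise ValueError("roster must contain at least one agent")
    # Assumption: signals are tried in the order the text lists them.
    for candidate, reason in (
        (mentioned, "explicit_mention"),
        (reply_to, "reply"),
        (coordinator_choice, "coordinator_routing"),
    ):
        if candidate in roster:
            return candidate, reason
    eligible = {a: s for a, s in (scores or {}).items() if a in roster}
    if eligible:
        # Iterate names in sorted order so score ties resolve deterministically.
        best = max(sorted(eligible), key=lambda name: eligible[name])
        return best, "capability_score"
    # Last resort: a stable choice that does not depend on input order.
    return sorted(roster)[0], "deterministic_fallback"
```

Returning the reason alongside the chosen agent mirrors the idea that rooms record route explanations rather than routing silently.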

RunGraph and replay

RunGraph turns runtime evidence into a graph that operators can inspect and replay. It links queue events, model calls, tool calls, policy gates, approvals, artifacts, child runs, evals, release gates, handoffs, and synthesis nodes without storing raw secrets or unsafe prompt text.

Historical traces can be partial when older rows did not record every node. Koda should show missing-data warnings instead of claiming a perfect replay.
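
One way to surface partial traces is to check that every node's parent was actually recorded. The sketch below assumes a hypothetical node shape with `id` and `parent_id` keys; Koda's real RunGraph schema may differ.

```python
from typing import Dict, List, Optional


def missing_data_warnings(nodes: List[Dict[str, Optional[str]]]) -> List[str]:
    """List warnings for nodes whose parent was never recorded.

    The node shape ({"id": ..., "parent_id": ...}) is an assumption
    made for illustration only.
    """
    recorded = {node["id"] for node in nodes}
    warnings = []
    for node in nodes:
        parent = node.get("parent_id")
        if parent is not None and parent not in recorded:
            warnings.append(
                f"node {node['id']} references unrecorded parent {parent}"
            )
    return warnings
```

A replay view could show these warnings next to the graph instead of presenting the trace as complete.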

Memory governance

Memory now passes safety and scope checks before persistence. The scanner blocks prompt injection, exfiltration instructions, secret path attempts, credential leakage, and invisible or control Unicode characters before memory, proposal, or knowledge-candidate text is stored.
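
As a minimal sketch of just one of these checks, the invisible and control character check could use Unicode general categories from Python's standard `unicodedata` module. The function names and the decision shape are hypothetical, and the other scanner checks (prompt injection, exfiltration, secrets) are not covered here.

```python
import unicodedata
from typing import Dict, List, Tuple

# Ordinary whitespace controls that normal text is allowed to contain.
ALLOWED_CONTROL = {"\n", "\r", "\t"}


def find_hidden_characters(text: str) -> List[Tuple[int, str]]:
    """Return (index, code point) for control (Cc) and format (Cf) characters."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ch not in ALLOWED_CONTROL
        and unicodedata.category(ch) in ("Cc", "Cf")
    ]


def scan_before_store(text: str) -> Dict[str, object]:
    """Hypothetical decision record: block when any hidden character is found."""
    hits = find_hidden_characters(text)
    return {"decision": "block" if hits else "allow", "hits": hits}
```

Blocking every `Cf` character is deliberately strict: it also rejects the zero-width joiner used in some emoji sequences, so a real scanner would likely need a narrower rule.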

Namespaces cover user, agent, squad, workspace, project, and org.
Recall records selected, dropped, stale, conflict, namespace, sensitivity, source, and trust metadata.
Context-governance and RunGraph payloads receive metadata-only memory blocks, not raw memory text.
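
A metadata-only block can be sketched as an allowlist projection over a memory record. The record shape and the field names below are assumptions drawn from the metadata listed above, not a documented schema.

```python
from typing import Dict

# Namespaces named in the text above.
NAMESPACES = ("user", "agent", "squad", "workspace", "project", "org")
# Hypothetical subset of recall metadata that is safe to forward.
METADATA_FIELDS = ("namespace", "sensitivity", "source", "trust")


def to_metadata_block(record: Dict[str, str]) -> Dict[str, object]:
    """Copy only allowlisted metadata fields, never the raw memory text."""
    if record.get("namespace") not in NAMESPACES:
        raise ValueError(f"unknown namespace: {record.get('namespace')!r}")
    return {field: record.get(field) for field in METADATA_FIELDS}
```

Projecting through an allowlist, rather than deleting a `text` key, means any new field added to the record stays out of downstream payloads until someone opts it in.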

Skills and packages

Runtime skills are still markdown methodologies, but Koda also supports local-first skill packages with scanner decisions, package locks, strict per-agent allowlists, registry summaries, eval evidence, trust status, and rollback records.

A skill package cannot bypass the tool dispatcher. Package tool allowlists and direct-call denial keep plugin behavior inside the same policy path as native tools.
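
The idea can be illustrated with a small dispatcher that checks a package's allowlist before any tool runs. The class and exception names are hypothetical, and real policy checks would involve more than a set lookup.

```python
from typing import Callable, Dict, FrozenSet


class ToolDenied(Exception):
    """Raised when a package requests a tool it is not allowed to use."""


class Dispatcher:
    """Single entry point for tool calls (illustrative sketch only)."""

    def __init__(self, tools: Dict[str, Callable[..., object]]) -> None:
        # Tools are held privately so packages cannot call them directly.
        self._tools = tools

    def call(self, package_allowlist: FrozenSet[str], tool: str, *args: object) -> object:
        if tool not in package_allowlist:
            raise ToolDenied(f"package is not allowed to call {tool!r}")
        if tool not in self._tools:
            raise ToolDenied(f"unknown tool {tool!r}")
        return self._tools[tool](*args)
```

For example, a package whose allowlist is `frozenset({"echo"})` can call `echo` through `Dispatcher.call`, while a request for any other registered tool raises `ToolDenied`.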

Improvement proposals

improvement_proposal.v1 is Koda's governed self-improvement queue. Eval failures, user corrections, timeouts, dead letters, tool failures, runs, and manual operator input can create proposals with evidence refs, redacted diff previews, risk class, validation plans, rollback plans, audit, metrics, and RunGraph lifecycle nodes.

Proposals do not mutate persistent prompts, skills, memory, routing, or policy automatically. The lifecycle is review, approve, validate, apply, and rollback through the effect ledger.
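
The lifecycle reads naturally as a small state machine in which no step can be skipped. The state names below are assumptions chosen to match the steps listed above; the actual improvement_proposal.v1 states are not specified here.

```python
from typing import Dict, Set

# Hypothetical state names: one per lifecycle step named in the text.
TRANSITIONS: Dict[str, Set[str]] = {
    "in_review": {"approved"},
    "approved": {"validated"},
    "validated": {"applied"},
    "applied": {"rolled_back"},
    "rolled_back": set(),
}


def advance(state: str, target: str) -> str:
    """Move a proposal to the target state, refusing any skipped step."""
    if target not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition: {state} -> {target}")
    return target
```

Because `applied` is reachable only from `validated`, a proposal in this sketch cannot change anything before it has been approved and validated.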

Evals and quality cockpit

Koda ships deterministic eval contracts for agent behavior, squad behavior, RunGraph completeness, skill packages, trajectory export, and release quality. The quality cockpit aggregates evidence from eval runs, metrics, RunGraph refs, executions, and proposals.

release_quality.v1 remains the release gate.
quality_cockpit.v1 is an operator dashboard surface, not a replacement for the release-quality contract.
Failure-to-proposal reuses improvement_proposal.v1 and stays non-mutating until approved and validated.

Channels

Telegram is the production pilot channel. Unknown senders are denied or queued before task enqueue, pairing codes can approve identities, and blocked or revoked identities override legacy allowlists. Slack and Discord share the central channel manager contract and have SDK-free contract tests; live credentialed E2E is still required before calling them production-ready.
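
The admission order described above might look like the following sketch. The status strings, the return values, and the `queue_unknown` switch are hypothetical; the point is only that a blocked or revoked identity is checked before any allowlist.

```python
from typing import Dict, Set


def admit_sender(
    sender_id: str,
    identities: Dict[str, str],
    legacy_allowlist: Set[str],
    queue_unknown: bool = True,
) -> str:
    """Decide what happens to a message before any task is enqueued."""
    status = identities.get(sender_id)
    # Blocked or revoked identities win, even over a legacy allowlist entry.
    if status in ("blocked", "revoked"):
        return "deny"
    # Assumed status for identities approved through a pairing code.
    if status == "approved" or sender_id in legacy_allowlist:
        return "enqueue"
    # Unknown senders are either held for pairing or denied outright.
    return "queue_for_pairing" if queue_unknown else "deny"
```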

Maturity boundary

Evidence-backed, not blanket Robust

Koda has a broad implemented baseline, but the platform does not claim global Robust status. Capability-specific Robust claims still require restart/idempotency, long fault/load runs, fail-closed regressions, observability, eval/smoke evidence, provider-live parity, streaming leak checks, and authenticated browser or live channel E2E where that capability depends on them.

Next steps

Runtime - how tasks, RunGraph, tools, and artifacts move through execution.
Memory & knowledge - recall, retrieval, namespaces, and governance.
Control-plane API - the complete maintained HTTP contract.