
Swarm Canvas

The canvas at /swarms is where single agents become systems. Each node is an agent or a control-flow primitive, each edge is a data route, and the runtime executes the whole graph live — streaming every node's output, cost, and decisions as it runs.

The shape of a multi-agent system is its most important property, and shapes are easier to reason about visually than as nested code. A glance at the canvas shows which agents are involved, where work fans out in parallel, and where the human checkpoints sit. When you want the code instead, every swarm exports to LangGraph, CrewAI, OpenAI Agents SDK, or Strands.

Node types

The palette has ten node kinds:

Input: The entry point — the seed value for the run, supplied in the run panel.
Agent: An LLM call with its own provider, model, temperature, system prompt, tools, and memory scope. The workhorse node.
Condition: A YES/NO router. A small model evaluates a condition prompt against the input and the run follows the YES or NO edge. Unlabeled condition edges raise a validation warning before the run starts.
Router Agent: Picks one of N downstream routes. The routing rubric is an editable prompt and the decision is recorded in the run log.
Loop: Re-runs its prompt until the model signals DONE or the configured maximum number of iterations is reached. This is the primitive behind reflection and self-correction.
Approval: Pauses the run and posts a card to your approvals inbox with a title and risk level. The run resumes when you decide. Pending approvals time out rather than hanging forever.
A2A Remote: Delegates a step to a remote agent over the A2A protocol — point it at an endpoint, optionally with streaming.
Function (JS): A sandboxed JavaScript transform with a 2-second timeout. Deterministic glue between model steps: parse, reshape, filter — no LLM involved.
Evaluate: LLM-as-a-judge scoring against weighted metrics (faithfulness, answer relevancy, completeness, coherence, harmlessness) with a pass threshold — a quality gate inside the graph.
Output: The terminal value of the run.
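
As a rough illustration of how an Evaluate node's gate could work, the sketch below combines per-metric judge scores into one weighted score and compares it with a pass threshold. The metric names come from the list above; the weights, the 0 to 1 score scale, the 0.7 threshold, and the function names are assumptions made for this example, not the product's actual implementation.

```python
# Hypothetical weights for the five metrics named in the Evaluate node.
# The values, the 0-1 scale, and the default threshold are assumptions.
WEIGHTS = {
    "faithfulness": 0.3,
    "answer_relevancy": 0.25,
    "completeness": 0.2,
    "coherence": 0.15,
    "harmlessness": 0.1,
}


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted mean of per-metric scores, normalised by the total weight."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # A metric missing from `scores` raises KeyError rather than passing silently.
    return sum(weights[m] * scores[m] for m in weights) / total_weight


def passes_gate(scores: dict, weights: dict, threshold: float = 0.7) -> bool:
    """True when the weighted score meets or exceeds the pass threshold."""
    return weighted_score(scores, weights) >= threshold


# Example judge output for one run (made-up numbers).
scores = {
    "faithfulness": 0.9,
    "answer_relevancy": 0.8,
    "completeness": 0.5,
    "coherence": 0.7,
    "harmlessness": 1.0,
}
gate_open = passes_gate(scores, WEIGHTS)
```

In this example a weak completeness score is outweighed by the other metrics, so the run would continue to the next node. Raising the threshold to 0.8 would send the same scores down the fail path.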

Fan-out is expressed with edges rather than a special node: connect one node to several downstream nodes and the runtime executes the branches in parallel; connect several nodes into one and that node waits for all of its inputs.
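
The fan-out and join behaviour described above can be sketched with plain asyncio. This is a minimal model, not the canvas runtime: `run_node`, `fan_out_and_join`, and the node names are hypothetical, and a short sleep stands in for a real agent call.

```python
import asyncio


async def run_node(name: str, value: str, delay: float = 0.01) -> str:
    """Stand-in for one node execution; a real Agent node would call an LLM."""
    await asyncio.sleep(delay)
    return f"{name}({value})"


async def fan_out_and_join(seed: str) -> str:
    # Fan-out: one upstream value feeds several branches, run concurrently.
    branches = ["specialist_a", "specialist_b"]
    results = await asyncio.gather(*(run_node(b, seed) for b in branches))
    # Join: the downstream node starts only after every input has arrived.
    return await run_node("merge", " + ".join(results))


result = asyncio.run(fan_out_and_join("ticket"))
print(result)
```

`asyncio.gather` returns results in the order the branches were passed in, regardless of which finishes first, so the joining node sees its inputs in a stable order.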

Running a swarm

Live execution — each node lights up as it runs, and the run panel streams per-node output alongside a live cost and token meter.
Pre-run validation — the canvas checks the graph before starting (cycles outside Loop nodes, unlabeled condition edges, disconnected nodes) and warns instead of failing midway.
Approvals — when a run hits an Approval node it shows as awaiting approval in the run panel until you decide from the inbox.
Traces — every node execution is recorded with tokens, cost, and latency, and feeds analytics and the trace viewer.
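
The pre-run checks listed above can be approximated in a few lines. This sketch assumes a hypothetical in-memory graph (a dict of node id to node kind, plus a list of `Edge` records) and treats a cycle as acceptable only if it passes through a Loop node; the canvas's real data model and rules may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    label: Optional[str] = None  # "YES" or "NO" on edges leaving a Condition node


def validate(nodes: dict, edges: list) -> list:
    """Return warnings for a graph; an empty list means no issue was found.

    `nodes` maps node id -> node kind, e.g. "agent", "condition", "loop".
    """
    warnings = []

    # 1. Edges leaving a Condition node must be labelled YES or NO.
    for e in edges:
        if nodes.get(e.src) == "condition" and e.label not in ("YES", "NO"):
            warnings.append(f"unlabeled condition edge: {e.src} -> {e.dst}")

    # 2. Every node should be attached to at least one edge.
    connected = {e.src for e in edges} | {e.dst for e in edges}
    for node_id in nodes:
        if node_id not in connected:
            warnings.append(f"disconnected node: {node_id}")

    # 3. Look for cycles that do not pass through a Loop node (assumption):
    #    drop Loop nodes, then run a depth-first search with three colours.
    adjacency = {n: [] for n, kind in nodes.items() if kind != "loop"}
    for e in edges:
        if e.src in adjacency and e.dst in adjacency:
            adjacency[e.src].append(e.dst)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = dict.fromkeys(adjacency, WHITE)

    def has_cycle(node: str) -> bool:
        colour[node] = GREY
        for nxt in adjacency[node]:
            if colour[nxt] == GREY:  # back edge: we returned to the current path
                return True
            if colour[nxt] == WHITE and has_cycle(nxt):
                return True
        colour[node] = BLACK
        return False

    if any(colour[n] == WHITE and has_cycle(n) for n in adjacency):
        warnings.append("cycle outside a Loop node")
    return warnings


nodes = {"in": "input", "check": "condition", "a": "agent", "b": "agent", "out": "output"}
edges = [
    Edge("in", "check"),
    Edge("check", "a", "YES"),
    Edge("check", "b"),  # missing NO label
    Edge("a", "out"),
    Edge("b", "out"),
]
problems = validate(nodes, edges)
```

Returning a list of warnings rather than raising on the first problem mirrors the "warn instead of failing midway" behaviour: the caller can show every issue at once before the run starts.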

Agents on the canvas

Agent nodes are configured in a side inspector. Pick the provider and model (the same provider list as the Agent Builder, from the built-in AgentSwarms AI through OpenAI, Anthropic, Gemini, Bedrock, Vertex, Azure, and self-hosted options), edit the system prompt in place, and choose a long-term memory scope per node: shared with the agent's normal memory, isolated to this swarm run, or none.

Templates, tours, and failure labs

Templates — 30+ pre-built swarms load directly onto the canvas, most with a guided tour that follows the run node by node. The gallery shows a thumbnail of each template's actual graph.
Failure labs — deliberately broken swarms (infinite tool loop, JSON wrapper crash, context-window collapse) that you run, watch fail, then fix. The lab checks your fix and tracks which labs you have passed.

Export and publish

Export produces portable JSON, LangGraph (Python or TypeScript), CrewAI, OpenAI Agents SDK, or Strands code — the graph you drew becomes a runnable project in the framework of your choice.
Publish shares the swarm to the community, where others can load and remix it.

A workflow that works

Start from a template, run it unmodified with the tour open, and read the run log end to end before changing anything.
Change one thing at a time — a prompt, a model, a route — and rerun with the same input to see what actually moved.
Add an Evaluate node in front of the Output early on; a quality gate shapes the rest of the design.
Wrap anything that writes to the outside world in an Approval node before sharing the swarm with anyone.

[input] ──▶ [router] ──▶ [specialist A] ──┐
               │                          ├──▶ [evaluate] ──▶ [approval] ──▶ [output]
               └──────▶ [specialist B] ───┘        │
                              ▲ ◀──── fail ────────┘

A canonical swarm: routed intake, parallel specialists, a quality gate, and a human checkpoint
