Full curriculum

Beginner to production — one curriculum.

Eight tracks. 50+ in-depth lessons. 50+ runnable real-world agents & swarms. Every concept is paired with a live demo you can fork in one click. All free.

8
learning tracks
50+
in-depth lessons
50+
one-click agents & swarms
12+
real-world case studies

Learning tracks

A clear path, in order

Start at Track 01 if you're new. Skip ahead if you've shipped with LLMs before — every track stands on its own.

Field manuals · Senior depth

Five field manuals turn this curriculum into a senior-engineer reference.

At the end of Foundations, Engineering Rigor, SQL & BI Agents, Production & Business, and RAG & Frameworks, long-form field manuals go one level below the chapter — tokenization economics, KV-cache math, schema-linking failure modes, EU AI Act obligations, Reciprocal Rank Fusion, embedding lifecycle, framework lock-in — with worked numerical examples and primary-source citations. If you only have time for one pass, the manuals are the difference between knowing the words and knowing the system.

  1. Track 01

    Beginner

    ~3 hours

    Foundations of Generative & Agentic AI

    Start here if you've never built with LLMs. Every concept is explained twice: once like you're 10, once for the engineer in the room.

    What's inside

    • What is a model? (LLM families, base vs instruct, open vs closed)
    • Tokens, context windows, and why they cost money
    • Prompts, system messages, and few-shot patterns
    • Embeddings, vector search, and the retrieval mindset
    • What makes something an "agent" vs a chatbot
    • Glossary of every term you'll see in the wild

    Live templates included

    • First Prompt Lab
    • Token Counter Demo
  2. Track 02

    Beginner → Intermediate

    ~4 hours

    Patterns, Tools & Guardrails

    The seven canonical agentic patterns — wired up live. Tool use, RAG, planner-executor, reflection, routing, parallel fan-out, and HITL approvals.

    What's inside

    • Tool / function calling — the OpenAI schema in plain English
    • Retrieval-Augmented Generation (RAG) with citations
    • Modern RAG variants: hybrid search, contextual retrieval, agentic & multi-modal
    • Graph RAG — entities, relations, multi-hop reasoning (Microsoft GraphRAG style)
    • Planner → Executor pattern
    • Reflection & self-critique loops
    • Routing & classifier-as-controller
    • Parallel fan-out / map-reduce agents
    • Human-in-the-Loop approvals for risky actions
    • Input/output guardrails: PII, prompt injection, schema validation

    Live templates included

    • Product Support Bot (RAG)
    • Graph RAG Researcher swarm (Acme Corp demo KB)
    • Code Reviewer (Tools + Guardrails)
    • Approval Inbox demo
    • Planner-Executor sandbox
  3. Track 03

    Beginner → Advanced

    ~2.5 hours

    Agent Memory: Short-Term & Long-Term

    Why a chatbot forgets you and an agent doesn't. The two memory tiers (STM and LTM), how recall actually works under the hood, and how to configure both per-agent and per-swarm-node — live, on the platform.

    What's inside

    • The mental model: scratchpad (STM) vs notebook (LTM)
    • Sliding window + rolling summary — how STM survives long chats
    • Long-term memory items: facts, preferences, episodic notes, instructions
    • Recall: keyword overlap + score + recency (and where embeddings fit later)
    • Auto-extraction after every turn — what gets saved, what gets skipped
    • Memory tools the agent can call: remember, recall, forget, set, get
    • Swarm scope: share memory with the agent, isolate to one run, or none
    • PII safety, capacity caps, and pruning low-value items

    Live templates included

    • Personal Assistant with LTM
    • Long-Conversation Tutor (STM summarizer)
    • Swarm with shared scratchpad
  4. Track 04

    Intermediate → Advanced

    ~3.5 hours

    Engineering Rigor & Deep Mental Models

    The senior-engineer view of agents: state, planning, multi-agent protocols, control topology, deterministic vs emergent design, failure handling, eval at scale, and system design under latency/cost constraints. With diagrams, code, and citations to the canonical papers.

    What's inside

    • The four axes: state, planning, communication, control topology
    • Deterministic workflows vs emergent agentic loops (Anthropic's line)
    • Failure handling: timeouts, jittered retries, circuit breakers, sagas
    • Idempotency keys + structured-output validation with repair loop
    • Loop detection and step / token / cost ceilings
    • The 4-layer eval pyramid: unit → golden → trajectory → online
    • System design under constraints: latency budgets, model cascading, caching, parallel tools
    • Centralised vs hierarchical vs peer-to-peer vs market topologies

    Live templates included

    • Failure-handling diagrams
    • Eval flywheel reference
    • τ-bench / AgentBench links
  5. Track 05

    Intermediate

    ~3 hours

    Text-to-SQL & Data Agents

    Turn natural language into safe SQL. AST validation, table allow-listing, schema-aware prompting, and the realities of running this in production (Uber QueryGPT-style).

    What's inside

    • Why text-to-SQL is harder than it looks
    • Schema introspection and few-shot grounding
    • AST parsing and validating generated SQL before execution
    • Read-only enforcement and table allow-lists
    • Cost & row-limit guardrails
    • Case study: Uber QueryGPT, Snowflake Cortex Analyst
    • Hands-on: query the SaaS sales lakehouse with English

    Live templates included

    • SQL Analyst Agent
    • RevOps Multi-Agent Swarm
  6. Track 06

    Intermediate → Advanced

    ~4 hours

    Multi-Agent Swarms

    When one agent isn't enough. Build researcher → writer → reviewer pipelines, peer-to-peer collaboration, A2A handoffs, and shared memory across the swarm.

    What's inside

    • Orchestrator vs peer-to-peer architectures
    • Handoff messages, shared scratchpads, and turn limits
    • A2A (Agent-to-Agent) protocol basics
    • When to split a single agent into a swarm
    • Per-node memory scoping (agent / swarm / none)
    • Cost & loop-detection guardrails for swarms
    • Visual swarm canvas — drag, wire, run

    Live templates included

    • Research → Writer → Reviewer swarm
    • Customer Support Triage swarm
    • RevOps SQL Analytics swarm
  7. Track 07

    Advanced

    ~3 hours

    Scaling, Observability & Enterprise

    Production reality: traces, evals, ROI math, security, OpenAI-compatible gateways, multi-provider strategy. Real case studies from Klarna, Uber, Salesforce.

    What's inside

    • Reading execution traces and debugging cost spikes
    • Production eval ops: regression gates in CI, drift alarms, weekly failure review
    • Token, latency & cost dashboards
    • AI security: prompt injection, data exfiltration, PII
    • OpenAI-compatible gateways and multi-provider routing
    • ROI formulas and enterprise cost scenarios
    • Maturity model: from prototype to org-wide platform
    • Case studies: Klarna, Uber, Salesforce Agentforce, BMW

    Live templates included

    • Trace Inspector
    • Budget Caps demo
    • Multi-Provider Gateway
  8. Track 08

    Advanced → Expert

    ~4 hours

    Deep Dives — the production gaps most curriculums skip

    Five hard-won lessons from real production failures: orchestration architecture, deterministic skeletons vs probabilistic workers, MCP security (Confused Deputy + Tool Description Hijacking), Actor-Model swarms with durable state, and heterogeneous routing economics. Assumes you finished Tracks 01–07.

    What's inside

    • Hub-and-Spoke beats both monolithic master agents AND peer-to-peer mesh
    • Thin Agent pattern: deterministic state machine + ephemeral <150-line workers
    • MCP attack surface: Tool Description Hijacking, Confused Deputy, Shadow AI infra
    • Actor Model runtimes: thousands of I/O-bound agents per host with durable checkpoints
    • Heterogeneous routing: SLM routers + frontier-LLM escalation = positive ROI
    • Mapping AgentSwarms to the Levels-of-Autonomy framework (L1 → L5)

    Live templates included

    • Frameworks deep dive (CrewAI / LangGraph / AutoGen)
    • Stack examples by industry

One-click runnables

Don't just read about multi-agent systems — run real ones

Every track ships with production-shaped agents and swarms you can launch in a single click — wired with knowledge bases, SQL tools, RAG retrievers, guardrails, approval gates and observable traces. Open the canvas, fire the suggested prompt, watch each node light up in real time, then fork it and break it.

Live canvas

See every handoff, tool call & token as it streams

Real KBs

Pre-seeded help-center, research & ERP corpora

HITL approvals

Risky actions pause for a human, just like prod

Fork & remix

Edit the system prompt, swap models, re-run

Customer Support

2 swarms · 3 agents

Multi-agent swarms

  • Customer Support Triage

    Classifier → Responder → QA reviewer with human approval

  • LLM-as-a-Judge — Support QA

    Tone + Policy + Technical sub-judges → Chief Magistrate scorecard

Standalone agents

  • Product Support Assistant

    Grounded RAG over a real help-center KB, with citations & refusals

  • Support Ticket Triage Agent

    Auto-tags category + priority for incoming tickets

  • SLA Breach Detector

    Calculates ticket age against SLA and flags every breach

Sales & RevOps

2 swarms · 2 agents

Multi-agent swarms

  • Sales Lead Enrichment

    Intake → Enricher → Scorer → Email drafter (approval-gated)

  • SaaS RevOps — Multi-Agent SQL Analyst

    VP prompt → SQL Planner → Local DB → RevOps Analyst → Strategic Synthesizer → Deal Desk

Standalone agents

  • Sales Outreach Drafter

    Personalized outbound drafts — humans approve before sending

  • CRM Data Cleanser

    Coerces messy text into strict JSON for HubSpot / Salesforce ingestion

Research & Knowledge

3 swarms · 3 agents

Multi-agent swarms

  • Research → Report Writer

    Planner → Researcher → Synthesizer → Editor pipeline

  • Graph RAG Researcher

    Graph search → Document RAG → Synthesizer for multi-hop questions

  • gRED-style Drug Discovery Research

    Planner → Literature + Assay + Internal-data → Reconciliation → Hypothesis memo

Standalone agents

  • Research Q&A on AI Papers

    Cited answers over a curated library of foundational AI papers

  • Long-Context Document Analyst

    Claude-powered summarizer for contracts & long transcripts

  • Graph RAG Explorer (Acme Corp)

    Multi-hop questions over a pre-built entity-relation knowledge graph

Engineering & Quality

3 swarms · 4 agents

Multi-agent swarms

  • Code Review Pipeline

    Static summarizer → Security + Style reviewers → Merged comment

  • RAG Evaluation Harness — LLM as a Judge

    Two candidates → GPT-5 judge → structured rubric scorecard

  • Content Moderation QA with Evaluate Node

    Toxicity + Misinfo + Policy → Moderator → quality-gate Evaluate node

Standalone agents

  • Code Review Assistant

    Reviews a pasted diff for bugs, style and security issues

  • Python Traceback Fixer

    Reads a Python error log and writes the exact fix

  • Regex Generator

    Translates plain English into ready-to-use regular expressions

  • SQL Dialect Translator

    Converts queries between MySQL, Postgres, SQL Server, and SQLite

Failure-pattern labs (debugging)

3 swarms

Multi-agent swarms

  • Infinite Tool Loop

    Watch an agent burn tokens calling the same tool forever — then see the fix

  • JSON Wrapper Crash

    An LLM adds "Sure! Here's the JSON:" — the next agent throws SyntaxError

  • Context Window Collapse

    Three workers dump 40 pages into one Synthesizer — watch it forget the question

Financial Services

4 swarms

Multi-agent swarms

  • Earnings Call Analyst Desk

    Numbers + Tone + Risk → Compliance check → Analyst memo

  • Stock Investment — CIO Swarm

    Fundamental + News + Quant + Risk → CIO investment memo

  • Responsible AI Guardrails (Banking)

    PII Redactor → Safety Classifier → Guardrailed Responder ⊕ Refusal → Audit Log

  • Financial Variance — ERP + RAG

    Orchestrator → ERP Data + RAG Doc Agent → FP&A Synthesizer

Healthcare & Life Sciences

2 swarms

Multi-agent swarms

  • Clinical Intake & Prior-Authorization

    Symptom intake → Triage → Differential → Coding → Prior-auth → Clinician approval

  • Agentic RAG — Drug Safety Investigation

    Router → parallel KB + Graph + SQL retrievers → Self-Eval loop → Synthesizer

Legal, HR & Operations

3 swarms · 1 agent

Multi-agent swarms

  • Contract Redline & Risk Review

    Clause splitter → parallel risk / definitions / jurisdiction reviewers → Partner approval

  • Frontline Hiring Orchestrator

    Manager → Sourcing + Screener + Scheduler + Onboarding → Recruiter approval

  • Disaster Response — Crisis Triage at Scale

    Intake classifier → Severity scorer → Resource matcher → Field-team router

Standalone agents

  • Meeting Action-Item Extractor

    Pulls assigned tasks and deadlines out of long meeting transcripts

Industry verticals

5 swarms

Multi-agent swarms

  • Auto-Claims FNOL Triage (Insurance)

    Intake → Coverage Check (RAG) → Fraud Signals (SQL) → Reserve & Routing

  • Manufacturing — Quality NCR Root-Cause

    Defect intake → Spec lookup → History query → 5-Whys → CAPA draft

  • SOC Alert Triage (Cybersecurity)

    Alert intake → ATT&CK enrichment → SQL correlation → Containment proposal

  • Retail Returns & Reverse-Logistics

    Return intake → Policy RAG → Customer-history SQL → Disposition decision

  • Adaptive Socratic Tutor + Auto-Grader (Education)

    Diagnose misconception → Curriculum RAG → Socratic hint → Grader

Marketing, Web & Content

2 swarms · 7 agents

Multi-agent swarms

  • Autonomous Localization & Compliance

    Creative Director → parallel Copywriter + Designer → Compliance loop

  • Autonomous Ad Campaign Engine

    Brief + reference photo → parallel copy & image gen → vision QA → approved ad

Standalone agents

  • Landing Page Roaster

    Critiques marketing copy against direct-response copywriting frameworks

  • SEO Meta-Tag Generator

    SEO titles + meta descriptions within strict character limits

  • Brand Voice Translator

    Rewrites copy in any persona, from Gen-Z to Victorian novelist

  • Firecrawl Web Summarizer

    Scrapes a live URL and answers a precise question about its content

  • Competitor Feature Tracker

    Scrapes a changelog and surfaces the updates you care about

  • GitHub Repo Explainer

    Reads a public README and explains it for a non-technical audience

  • Invoice Parser

    Extracts vendor, totals, tax and due dates from pasted invoice text

Why this matters for learning: Agentic AI is a systems discipline — handoffs, loops, retries, guardrails, traces. You can't internalize that from prose. Every swarm above is the lesson made tangible: the moment you watch the QA reviewer reject the responder's draft and trigger a retry, or see the Evaluate node fail-close on a toxic generation, the theory clicks. Each runnable is the laboratory for the chapter that introduced it.

Spotlight · Agent Memory

Why a chatbot forgets you — and an agent doesn't

Memory is the single biggest leap from "stateless chatbot" to "real assistant." Here's the same concept explained two ways — once for beginners, once for engineers — with everything you can actually configure on AgentSwarms today.

For beginners

Think of it like a person at a desk

An LLM has no memory by itself — every request starts from scratch. AgentSwarms gives every agent two memory aids:

  • Short-Term Memory (STM) — the scratchpad

    Whatever was said recently in this chat. When the chat gets too long, older parts get squeezed into a one-paragraph summary so the agent never "forgets" the earlier topic.

  • Long-Term Memory (LTM) — the notebook

    Durable notes the agent jots down across every conversation — your name, what you're working on, your preferences. Next week, in a brand-new chat, it still remembers.

In one sentence: STM is "what we just talked about," LTM is "what I know about you."

For engineers

How it actually works under the hood

  • STM = sliding window + rolling summary

    Last N messages (default 20) are sent verbatim. Anything older is folded into a running summary stored on conversation_memory.summaryand prepended as a system block on every turn.

  • LTM = typed memory items + scored recall

    Items are fact / preference / episodic / instruction. Recall ranks by keyword overlap (GIN index) + stored score + recency, top-K injected as a "What you remember about this user" block. Pluggable to embeddings later.

  • Auto-extraction with PII filter

    After each assistant turn, a structured-output pass proposes durable items. Anything matching redaction placeholders ([EMAIL],[PHONE], …) is dropped before storage.

  • Agent-callable tools

    memory_remember, memory_recall, memory_forget, plus memory_set / memory_get for a per-conversation JSON scratchpad shared across swarm nodes.

Per-agent

Memory tab in Agent Builder

Toggle STM/LTM, set the window size, top-K recall, max items, and inspect or delete remembered facts — all without code.

Per-swarm-node

Three LTM scopes

agent shares with the agent's normal sessions. swarm isolates to one run (uses swarm_run_id as the key). none turns LTM off entirely for that node.

Observable

Recall chip in chat

Every assistant message shows how many LTM items were recalled (and a preview), so you can see exactly what the agent "remembered" — no guessing.

Track 08 · Deep Dives

The five production gaps most curriculums skip

Orchestration architecture, deterministic skeletons, MCP security, Actor-Model swarms, and heterogeneous routing economics. Full lessons — including the L1→L5 autonomy mapping — live in the Deep Dives chapter inside the lessons.

After you finish

Then comes the real fun: shipping it to production

Finishing the curriculum gets you to a working agent. The next 12 months are about turning that into a system real users depend on. We mapped the whole journey — for builders and for the leaders who fund them.

01

Pick a real pilot

Narrow scope, measurable success, low blast-radius.

02

Build evals first

50+ case golden set wired into CI before you scale.

03

Harden it

OWASP LLM Top 10, guardrails, HITL on dangerous tools.

04

Observe everything

Traces, cost, drift, weekly failure review.

05

Choose where it runs

Bedrock, Azure AI Foundry, Vertex, AgentKit, edge — pick on data + skills, not hype.

06

Operate it

On-call, change management, model deprecations every 6–12 months.

07

Scale across the org

Platform team, FinOps chargeback, AI policy, regulatory mapping.

Persona checklists

30 / 90 / 365-day plans for both Builders and Leaders.

What's coming

The curriculum keeps growing

We're adding new tracks every few weeks. Here's what's on deck — vote with your feedback on the contact page.

next

Vector recall for memory

Upgrade Long-Term Memory from keyword overlap to embeddings — semantic recall, hybrid search, and when each one wins.

soon

Build-along projects

Multi-day guided projects: ship a customer-support agent, an internal data analyst, and a full RevOps swarm — end-to-end with case-study writeups.

soon

Red-team playbook for agents

Hands-on adversarial labs: jailbreaks, tool-chain hijacks, indirect prompt injection via RAG, and how to write the eval that catches each one.

soon

FinOps for agentic systems

Per-tenant chargeback, model cascading economics, prompt caching ROI, and budget caps that actually hold under load.

Ready to start Track 01?

Sign up free, no credit card. The whole curriculum is yours.