Section 12

Certification

The AgentSwarms certification at /certification is a proctored, hands-on examination designed to prove that the holder can actually build agentic systems rather than only describe them. It is, as far as we know, the only certification in this space that combines a multiple-choice knowledge section with a live lab section graded against the same observability signals — cost, latency, correctness, safety — that you would use to evaluate any real agentic system in production. The rest of this page explains how it works end to end.

Eligibility

Eligibility for the exam is gated on completion of the curriculum modules covering the major concept areas: prompting and chat models, retrieval-augmented generation, agents and tool use, multi-agent systems, evaluation, and safety. Your dashboard's curriculum progress ring shows your readiness in real time, and the certification landing page lists the specific modules and lessons you still need to complete before you can schedule the exam. There is no shortcut around the eligibility gate, because the curriculum is not arbitrary coursework — it is the minimum baseline of conceptual coverage you need in order for the lab section to be meaningfully completable.
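The eligibility gate described above can be sketched as a simple set comparison. This is an illustration only: the area identifiers and the function names are hypothetical, and the real gate works at the level of individual modules and lessons rather than whole concept areas.

```python
# Concept areas named in the eligibility section.
# The identifiers are hypothetical; the platform's own module IDs may differ.
REQUIRED_AREAS = (
    "prompting_and_chat_models",
    "retrieval_augmented_generation",
    "agents_and_tool_use",
    "multi_agent_systems",
    "evaluation",
    "safety",
)


def missing_areas(completed: set[str]) -> list[str]:
    """Return the required areas not yet completed, in curriculum order."""
    return [area for area in REQUIRED_AREAS if area not in completed]


def readiness(completed: set[str]) -> float:
    """Fraction of required areas completed (roughly what a progress ring shows)."""
    done = len(REQUIRED_AREAS) - len(missing_areas(completed))
    return done / len(REQUIRED_AREAS)


def can_schedule(completed: set[str]) -> bool:
    """The exam can be scheduled only when nothing required is missing."""
    return not missing_areas(completed)
```

A candidate who has finished only the evaluation and safety areas would see four areas listed as outstanding and would not yet be able to schedule the exam.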

Exam structure

The exam is taken in a single sitting and consists of two timed sections with a short break between them. The total time budget is approximately three hours including the break, and the exam runs entirely inside the platform; there is no external proctoring software to install. Webcam presence is required during both sections as a soft proctoring measure.

  • The multiple-choice section consists of a rotating set of questions drawn from a much larger pool, covering core concepts (tool calling, structured output, retrieval pipelines, evaluation metrics), specific common failure modes (prompt injection, hallucinated tool arguments, retrieval-relevance collapse), and the safety and operational considerations every practitioner should have internalised. Each question has either one correct answer or, occasionally, multiple correct answers; the interface tells you which.
  • The lab section presents you with a written specification of a small system you need to build on the platform: a description of the problem, the expected behaviour, the constraints you must satisfy (budget cap, latency target, table allow-list, guardrails required), and a set of inputs your submission will be evaluated against. You have access to every feature of the platform during the lab, including Agent Builder, Swarm Canvas, knowledge bases, tools, and skills; the only constraints are the time limit and the spec.
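To make the lab constraints concrete, here is a minimal sketch of how a spec's four constraint types could be checked against a single run. Everything in it is an assumption for illustration: the `LabSpec` and `RunSummary` types, the field names, and the units (US dollars, milliseconds) are not the platform's actual spec format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabSpec:
    """Hypothetical shape of the constraints listed in a lab specification."""
    budget_cap_usd: float
    latency_target_ms: int
    table_allow_list: frozenset[str]
    required_guardrails: frozenset[str]


@dataclass(frozen=True)
class RunSummary:
    """Hypothetical summary of one run of a submission."""
    cost_usd: float
    latency_ms: int
    tables_touched: frozenset[str]
    guardrails_enabled: frozenset[str]


def violations(spec: LabSpec, run: RunSummary) -> list[str]:
    """List every constraint in the spec that the run fails to satisfy."""
    problems = []
    if run.cost_usd > spec.budget_cap_usd:
        problems.append("budget cap exceeded")
    if run.latency_ms > spec.latency_target_ms:
        problems.append("latency target missed")
    extra = run.tables_touched - spec.table_allow_list
    if extra:
        problems.append("tables outside allow-list: " + ", ".join(sorted(extra)))
    missing = spec.required_guardrails - run.guardrails_enabled
    if missing:
        problems.append("missing guardrails: " + ", ".join(sorted(missing)))
    return problems
```

Checking your own runs this way before submitting mirrors the way the spec is written: each constraint is a separate condition, and a run can fail several at once.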

Grading

Grading is split between the two sections in a way that deliberately favours hands-on competence over recall. The lab section is weighted more heavily than the multiple-choice section, and a strong multiple-choice score cannot compensate for a weak lab submission. The opposite bias, in which recall counts for more than working systems, is exactly what we are trying to correct in the existing certification landscape.

  • The multiple-choice section is auto-marked immediately on submission. Each question is worth a known number of points (shown in the interface) and the total is reported to you the moment the section closes.
  • The lab section is graded by an evaluator agent that runs your submission against the spec's hidden test inputs and scores it against a rubric covering correctness (does the system do what was asked), completeness (does it handle the spec's edge cases), cost (does it stay within the configured budget), latency (does it meet the latency target), and safety (does it have the guardrails the spec required). The evaluator's decision and its reasoning are recorded in full and provided to you after results are released.
  • Borderline submissions (typically those within five percentage points of the pass threshold) are additionally reviewed by a human grader who can adjust the evaluator's score in either direction, with the grader's notes attached to the result. This is not a rubber stamp: the human review is what allows a borderline candidate to pass on the merits of work that the evaluator did not anticipate.
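The way the two section scores combine can be sketched as follows. The numbers are placeholders, not the real values: the 70/30 weighting, the pass threshold of 70, and the lab minimum of 60 are all assumptions chosen for illustration. Only the five-point borderline margin comes from the text above, and modelling "cannot compensate for a weak lab" as a minimum lab score is one possible interpretation.

```python
from dataclasses import dataclass

# Placeholder values for illustration; the real weights and thresholds
# are not stated on this page.
LAB_WEIGHT = 70          # percent of the overall score
MCQ_WEIGHT = 30          # percent of the overall score
PASS_THRESHOLD = 70.0    # overall percentage needed to pass (assumed)
LAB_MINIMUM = 60.0       # assumed floor so a weak lab cannot be offset
BORDERLINE_MARGIN = 5.0  # "within five percentage points of the pass threshold"


@dataclass(frozen=True)
class Result:
    overall: float
    passed: bool
    needs_human_review: bool


def grade(mcq_pct: float, lab_pct: float) -> Result:
    """Combine section percentages into a provisional, pre-review result."""
    overall = (MCQ_WEIGHT * mcq_pct + LAB_WEIGHT * lab_pct) / 100
    passed = overall >= PASS_THRESHOLD and lab_pct >= LAB_MINIMUM
    borderline = abs(overall - PASS_THRESHOLD) <= BORDERLINE_MARGIN
    return Result(overall, passed, borderline)
```

Under these assumed numbers, a perfect multiple-choice score paired with a lab score of 58 clears the overall threshold but still does not pass, and the submission is flagged for human review because it sits inside the borderline margin.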

The certificate itself

On a passing result you receive a verifiable certificate hosted at a stable public URL under /api/certificate/[id]. The certificate page renders the holder's display name, the date of the exam, the version of the curriculum and rubric in force at the time, and a verification stamp; the same URL can be embedded as an image or linked from a CV, a LinkedIn profile, or a personal site. Verification is handled by a public verify endpoint that anyone can call without an AgentSwarms account, so prospective employers can confirm the certificate without our involvement.

Notification of the result is delivered both in-product and by email, including the lab evaluator's reasoning, your per-section score, your overall score, and the threshold for passing. If you pass, the email also contains the certificate URL and a downloadable PDF version suitable for printing.

Retake policy

Failed attempts can be retaken after a seven-day waiting period. The waiting period exists because the most common reason for a failed lab section is a specific concept gap that takes a few days of focused practice to close, and a same-day retake would almost certainly reproduce the same outcome. Lab tasks rotate between attempts so a candidate cannot simply memorise the previous attempt's specification. There is no cap on the number of retakes a single account can attempt over the long run, but retakes within the same calendar month are billed at a small flat rate to cover the cost of human review on borderline submissions.
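The retake rules reduce to simple date arithmetic, sketched below. The function names are hypothetical, and the billing check assumes that "within the same calendar month" compares the retake date with the date of the failed attempt; the policy text does not spell this out.

```python
from datetime import date, timedelta

WAITING_PERIOD = timedelta(days=7)  # seven-day waiting period from the policy


def earliest_retake(failed_on: date) -> date:
    """First date on which a failed attempt can be retaken."""
    return failed_on + WAITING_PERIOD


def is_billed(failed_on: date, retake_on: date) -> bool:
    """True if the retake falls in the same calendar month as the failed attempt.

    Assumption: the month comparison is against the failed attempt's date.
    """
    if retake_on < earliest_retake(failed_on):
        raise ValueError("retake is inside the seven-day waiting period")
    return (failed_on.year, failed_on.month) == (retake_on.year, retake_on.month)
```

For example, under this reading a candidate who fails on 3 March and retakes on 10 March pays the flat rate, while one who fails on 28 March cannot retake before 4 April and so falls into the next calendar month.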