Add eval task: multi-layer routing benchmark #12

@Alexi5000

Task

Add a new eval task that benchmarks multi-layer routing — given a user prompt, does the swarm route it to the correct business layer?

File to create

evals/tasks/multi_layer_routing.yaml

Use existing eval files in evals/ as a reference.

Task spec

The eval should include >= 10 test cases, each with:

  • prompt: a realistic business task
  • expected_layer: one of sales, support, marketing, seo, research, operations, management
  • keywords: 3+ words that should appear in the routed agent's output

Example entry

- prompt: "Find the top 3 enterprise CRM tools used by Fortune 500 companies in 2026"
  expected_layer: research
  keywords: [CRM, enterprise, Fortune 500]
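The task spec above can be checked mechanically before the file is submitted. The following is a minimal sketch, assuming the YAML entries have already been loaded into a list of Python dicts. The names `validate_cases` and `ALLOWED_LAYERS` are hypothetical and not part of the swarm codebase.

```python
# Hypothetical helper: checks a list of eval cases against the task spec.
# Assumes each case is a dict with the keys named in the issue.
ALLOWED_LAYERS = {
    "sales", "support", "marketing", "seo",
    "research", "operations", "management",
}

def validate_cases(cases, min_cases=10):
    """Return a list of problems; an empty list means the cases fit the spec."""
    problems = []
    if len(cases) < min_cases:
        problems.append(
            f"expected at least {min_cases} test cases, found {len(cases)}"
        )
    for i, case in enumerate(cases):
        prompt = case.get("prompt")
        if not isinstance(prompt, str) or not prompt.strip():
            problems.append(f"case {i}: missing or empty prompt")
        layer = case.get("expected_layer")
        if layer not in ALLOWED_LAYERS:
            problems.append(f"case {i}: invalid expected_layer {layer!r}")
        keywords = case.get("keywords")
        if not isinstance(keywords, list) or len(keywords) < 3:
            problems.append(f"case {i}: needs 3 or more keywords")
    return problems

# The example entry from this issue, expressed as a dict.
example = {
    "prompt": "Find the top 3 enterprise CRM tools used by Fortune 500 companies in 2026",
    "expected_layer": "research",
    "keywords": ["CRM", "enterprise", "Fortune 500"],
}

print(validate_cases([example] * 10))
```

Repeating the example entry ten times is only a way to exercise the count check; the real file should contain ten or more distinct prompts.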

Scoring

Pass = the prompt is routed to the expected layer AND every keyword is present in the routed agent's output.
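The pass rule could be sketched as below. This is an illustration, not the swarm's actual scorer: `score_case` is a hypothetical name, and the case-insensitive substring match is an assumption, since the issue does not say how keyword presence is determined.

```python
# Hypothetical scorer for a single eval case.
# Assumption: keywords are matched as case-insensitive substrings of the output.
def score_case(case, routed_layer, output):
    """Return True only if the layer matches AND every keyword appears in output."""
    layer_ok = routed_layer == case["expected_layer"]
    text = output.lower()
    keywords_ok = all(kw.lower() in text for kw in case["keywords"])
    return layer_ok and keywords_ok

case = {
    "prompt": "Find the top 3 enterprise CRM tools used by Fortune 500 companies in 2026",
    "expected_layer": "research",
    "keywords": ["CRM", "enterprise", "Fortune 500"],
}

# Made-up agent output, used only to exercise the function.
sample_output = "Three enterprise CRM platforms are common among Fortune 500 firms."
print(score_case(case, "research", sample_output))
```

Under this rule a case fails if the layer is wrong even when all keywords appear, and fails if any single keyword is missing even when the layer is right.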

How to test

make install
swarm eval

Estimated effort

1 hour


See CONTRIBUTING.md for setup instructions.
