Add eval task: multi-layer routing benchmark #12

## Task

Add a new eval task that benchmarks **multi-layer routing**: given a user prompt, does the swarm route it to the correct business layer?

## File to create

`evals/tasks/multi_layer_routing.yaml`

Use existing eval files in `evals/` as a reference.

## Task spec

The eval should include at least 10 test cases, each with:
- `prompt`: a realistic business task
- `expected_layer`: one of `sales`, `support`, `marketing`, `seo`, `research`, `operations`, `management`
- `keywords`: three or more terms (single words or short phrases) that should appear in the routed agent's output
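
As a sanity check while writing the cases, the field rules above could be verified with a small script. This is only a sketch: it assumes the YAML file has already been loaded into a list of dicts, and the names `ALLOWED_LAYERS`, `validate_case`, and `validate_suite` are hypothetical helpers, not part of the existing eval tooling.

```python
# Hypothetical helper: checks test cases against the task spec in this issue.
# Assumes each case is a dict with the same keys as the YAML entries.
ALLOWED_LAYERS = {
    "sales", "support", "marketing", "seo",
    "research", "operations", "management",
}


def validate_case(case: dict) -> list[str]:
    """Return a list of problems; an empty list means the case fits the spec."""
    problems = []

    prompt = case.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt must be a non-empty string")

    layer = case.get("expected_layer")
    if not isinstance(layer, str) or layer not in ALLOWED_LAYERS:
        problems.append(f"expected_layer {layer!r} is not an allowed layer")

    keywords = case.get("keywords")
    if not isinstance(keywords, list) or len(keywords) < 3:
        problems.append("keywords must be a list with at least 3 entries")

    return problems


def validate_suite(cases: list[dict], min_cases: int = 10) -> list[str]:
    """Check the suite size and every case; return all problems found."""
    problems = []
    if len(cases) < min_cases:
        problems.append(f"need at least {min_cases} test cases, found {len(cases)}")
    for index, case in enumerate(cases):
        problems.extend(f"case {index}: {p}" for p in validate_case(case))
    return problems
```

Calling `validate_suite(cases)` on the loaded entries returns an empty list when the file meets the spec, which makes it easy to catch a misspelled layer name before running the full eval.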

## Example entry

```yaml
- prompt: "Find the top 3 enterprise CRM tools used by Fortune 500 companies in 2026"
  expected_layer: research
  keywords: [CRM, enterprise, Fortune 500]
```

## Scoring

A test case passes only if the prompt is routed to the correct layer AND every keyword is present in the routed agent's output.
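
The pass rule can be sketched as a single predicate. This is an illustration, not the existing scorer: `case_passes` is a hypothetical name, and the case-insensitive substring match is an assumption, since this issue does not say how keywords should be compared.

```python
def case_passes(
    expected_layer: str,
    keywords: list[str],
    routed_layer: str,
    output: str,
) -> bool:
    """Return True only if the layer matches AND every keyword is in the output.

    Assumption: keywords are matched as case-insensitive substrings.
    """
    if routed_layer != expected_layer:
        return False
    haystack = output.casefold()
    return all(str(keyword).casefold() in haystack for keyword in keywords)
```

Under these assumptions, a case routed to the right layer still fails if even one keyword is missing from the output, which matches the AND in the rule above.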

## How to test

```bash
make install
swarm eval
```

## Estimated effort

1 hour

---
See [CONTRIBUTING.md](../../CONTRIBUTING.md) for setup instructions.
