Skip to content

API latency: /v0/servers reads consistently take 20–25s #1252

@rhinocap

Description

@rhinocap

Summary

/v0/servers queries on registry.modelcontextprotocol.io return successful responses but with extreme latency (20–25s for trivial queries returning a few KB of JSON). Default-timeout HTTP clients (and most CLI tools) treat these as failures, making the API effectively unusable without explicit long timeouts.

Repro (just now from California, 2026-05-04 ~11:00 PDT)

$ curl -sL --max-time 30 -w "\nHTTP %{http_code} | %{time_total}s | %{size_download}b\n" \
  "https://registry.modelcontextprotocol.io/v0/servers?search=raven&limit=5"
... [valid JSON list of 5 servers] ...
HTTP 200 | 24.903s | 3947b

$ curl -sL --max-time 30 -w "\nHTTP %{http_code} | %{time_total}s\n" \
  "https://registry.modelcontextprotocol.io/v0/servers/com.knowledge-raven%2Fmcp/versions/1.0.0"
... [valid server metadata] ...
HTTP 200 | 20.921s | 716b

$ curl -sL --max-time 30 -w "\nHTTP %{http_code} | %{time_total}s\n" \
  "https://registry.modelcontextprotocol.io/v0/servers?limit=3"
... [valid 3-server response] ...
HTTP 200 | 25.818s

By contrast, the static frontend at / returns in 394ms — same hostname, presumably a different cluster:

$ curl -sI --max-time 8 -w "\nHTTP %{http_code} | %{time_total}s\n" \
  "https://registry.modelcontextprotocol.io/"
HTTP/2 200
HTTP 200 | 0.394s

So this is API-side, not network/DNS.

Why this matters in practice

Most HTTP libraries default to 5–15 second timeouts. With API responses landing at 20–25s, anything calling the API with defaults will see what looks like a hard failure:

  • WebFetch / similar tooling I tried earlier today returned empty bodies and gave up
  • My initial curl --max-time 12 runs all returned HTTP 000 — I assumed the endpoints were broken
  • A first-time integrator doing fetch(...) with default 10s would conclude the API is down

For context: I just published ai.ravenmcp/raven-mcp (registered 2026-05-04T17:26:51Z). Confirming the publish landed required ~10 minutes of guessing the right query shapes because every short-timeout probe returned HTTP 000. The publish succeeded (mcp-publisher exited 0) but verification was needlessly hard.

Hypothesis

The latency profile (~20–25s near-uniform across read endpoints) looks more like a cold-start / per-request resource provisioning issue than slow query execution. A 3-server list response shouldn't be doing 25s of database work. Could be Cloud Run-style cold starts, a per-request DB connection setup, or a synchronous external dependency (e.g. validating against a static schema URL on every request).

Asks

  1. Latency budget on /v0/servers/* — 20–25s for sub-4KB responses is a UX cliff. Sub-1s would be a reasonable target for cached lookups; sub-5s for searches.
  2. If a fix isn't quick — surface a recommended minimum timeout on the API reference docs so integrators don't conclude the API is broken when it's just slow. (Currently nothing in /docs warns about timeouts.)
  3. Health endpoint — a fast /v0/health or similar that confirms "API is responsive" cheaply would let clients distinguish "API hung" from "your timeout is too short."

Happy to share the mcp-publisher publish flow + verification timing data if useful for whoever picks this up.


(Filed by @rhinocap from ai.ravenmcp/raven-mcp publication context.)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions