Description
When using chat_github() (or another OpenAI-compatible provider) and sending many requests with parallel_chat(), hitting a rate limit (429 Too Many Requests) causes ellmer to crash with a JSON parsing error instead of retrying or reporting the error.
This happens because the GitHub Models endpoint returns a Content-Type: application/json header but a plain-text body ("Too many requests..." etc.) when rate limited. The base_request_error method then attempts to parse that body as JSON and fails.
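The failure mode can be reproduced in isolation with httr2's response() test helper (a sketch; the exact error text depends on the jsonlite version installed):

```r
library(httr2)

# Simulate a 429 whose body is plain text but labelled application/json,
# as the GitHub Models endpoint does when rate limiting.
resp <- response(
  status_code = 429,
  headers = list(`content-type` = "application/json"),
  body = charToRaw("Too many requests")
)

# This is effectively what base_request_error does today: it trusts the
# Content-Type header and dies with a JSON parsing error.
try(resp_body_json(resp))
```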
Reproduction code
library(ellmer)

# Use a small model to hit rate limits quickly
chat <- chat_github(model = "gpt-4o-mini")
prompts <- as.list(paste("Say hello", 1:50))

# Trigger a 429 by sending many concurrent requests
tryCatch(
  parallel_chat(chat, prompts, max_active = 20),
  error = function(e) print(e)
)
Proposed fix
Update base_request_error in R/provider-openai-compatible.R to wrap resp_body_json() in a tryCatch(). If JSON parsing fails, fall back to the raw string body.
This lets httr2 correctly identify the 429 status and apply its retry logic instead of aborting on the parse error.
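A minimal sketch of the proposed change (illustrative only: the real method in R/provider-openai-compatible.R dispatches on the provider object, and the argument names here are assumptions, not copied from the ellmer source):

```r
# Sketch of the fallback; signature and dispatch are assumptions,
# not the actual ellmer code.
base_request_error <- function(req, resp) {
  body <- tryCatch(
    httr2::resp_body_json(resp),
    error = function(e) NULL
  )
  if (is.null(body)) {
    # Fall back to the raw string body when it is not valid JSON,
    # e.g. the plain-text "Too many requests" message on a 429.
    body <- httr2::resp_body_string(resp)
  }
  body
}
```

With this fallback in place, the response body is always available for the error message, and httr2's retry machinery (req_retry()) can react to the 429 status rather than being short-circuited by a parse failure.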
I'll open a concise pull request with this change.