What are you trying to do?
I am trying to use Rapid-MLX as a local OpenAI-compatible backend for the current Codex CLI.
Rapid-MLX itself is running correctly. I can access /v1/models and /v1/chat/completions successfully.
What's blocking you today?
Current Codex CLI sends requests to:
POST /v1/responses
But Rapid-MLX returns:
404 Not Found
Example server log:
INFO: 127.0.0.1:50061 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO: 127.0.0.1:50079 - "POST /v1/responses HTTP/1.1" 404 Not Found
So Codex can reach the Rapid-MLX server, but the Responses API endpoint appears to be missing.
Current workaround (if any)
No direct workaround for Codex CLI at the moment.
I can use clients that call /v1/chat/completions or /v1/messages?beta=true, because those endpoints work.
For example:
GET /v1/models 200 OK
POST /v1/chat/completions 200 OK
POST /v1/messages?beta=true 200 OK
But current Codex CLI appears to require /v1/responses, so it cannot connect directly.
Proposed approach (optional)
Add support for the OpenAI Responses API endpoint:
POST /v1/responses
Even a compatibility layer that maps basic /v1/responses requests to the existing /v1/chat/completions or /v1/messages backend would make current Codex CLI usable with Rapid-MLX.
Alternatives you considered
I tried using the existing OpenAI-compatible endpoint with Codex by setting the base URL to:
http://127.0.0.1:11433/v1
However, Codex still calls /v1/responses, which returns 404.
I also tested /v1/chat/completions manually, and it works, so this seems specific to the missing Responses API endpoint rather than model loading or server startup.
Hardware (if relevant)
MacBook Pro M3 MAX 48G
Model (if relevant)
Qwen3.6-35B-A3B mxfp4
Are you willing to contribute this?
No, I'm just requesting
Additional context
No response
What are you trying to do?
I am trying to use Rapid-MLX as a local OpenAI-compatible backend for the current Codex CLI.
Rapid-MLX itself is running correctly. I can access /v1/models and /v1/chat/completions successfully.
What's blocking you today?
Current Codex CLI sends requests to:
POST /v1/responses
But Rapid-MLX returns:
404 Not Found
Example server log:
INFO: 127.0.0.1:50061 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO: 127.0.0.1:50079 - "POST /v1/responses HTTP/1.1" 404 Not Found
So Codex can reach the Rapid-MLX server, but the Responses API endpoint appears to be missing.
Current workaround (if any)
No direct workaround for Codex CLI at the moment.
I can use clients that call /v1/chat/completions or /v1/messages?beta=true, because those endpoints work.
For example:
GET /v1/models 200 OK
POST /v1/chat/completions 200 OK
POST /v1/messages?beta=true 200 OK
But current Codex CLI appears to require /v1/responses, so it cannot connect directly.
Proposed approach (optional)
Add support for the OpenAI Responses API endpoint:
POST /v1/responses
Even a compatibility layer that maps basic /v1/responses requests to the existing /v1/chat/completions or /v1/messages backend would make current Codex CLI usable with Rapid-MLX.
Alternatives you considered
I tried using the existing OpenAI-compatible endpoint with Codex by setting the base URL to:
http://127.0.0.1:11433/v1
However, Codex still calls /v1/responses, which returns 404.
I also tested /v1/chat/completions manually, and it works, so this seems specific to the missing Responses API endpoint rather than model loading or server startup.
Hardware (if relevant)
MacBook Pro M3 MAX 48G
Model (if relevant)
Qwen3.6-35B-A3B mxfp4
Are you willing to contribute this?
No, I'm just requesting
Additional context
No response