Legacy /v1/completions + boundary cases: 5 issues surfaced in iter7 onboarding sweep #461

@raullenchai

Found during the recurring babysit onboarding sweep on qwen3.6-27b-8bit (PyPI rapid-mlx 0.6.66). Filing these as one consolidated tracker since they cluster around the legacy /v1/completions endpoint and request-boundary validation. The hybrid model with channel-routed chat works fine; the gaps are on the legacy/boundary surface.

1. /v1/completions echo:true silently ignored

curl -X POST .../v1/completions -d '{"model":"qwen3.6-27b-8bit","prompt":"Once upon a time","max_tokens":15,"echo":true}'

Expected: response text begins with "Once upon a time". Actual: the text starts mid-continuation, without the prompt. echo is accepted by the schema but never honored.
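A minimal sketch of the expected echo behavior, for reference. `apply_echo` is a hypothetical helper, not an existing rapid-mlx function; it only illustrates that the prompt should be prepended to each choice's text when echo is set.

```python
from typing import Optional


def apply_echo(prompt: str, completion: str, echo: Optional[bool]) -> str:
    """Return the text for one choice, prefixed with the prompt when echo is set.

    Hypothetical helper: the real route may assemble choice text differently.
    """
    if echo:
        return prompt + completion
    return completion


# With echo=True the returned text begins with the original prompt.
text = apply_echo("Once upon a time", ", there was a fox.", echo=True)
assert text.startswith("Once upon a time")
```

A regression test for this item could assert the same `startswith` condition against the live response text.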

2. /v1/completions logprobs schema mismatch with OpenAI

OpenAI spec: logprobs: int (0-5). rapid-mlx declares logprobs: bool. Sending "logprobs":3 (the canonical OpenAI form) → HTTP 422 bool_parsing. Sending "logprobs":true is accepted, but no logprobs field appears in the response, so neither form actually returns logprobs.
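One way the integer form could be validated, sketched with the stdlib only. `parse_logprobs` and `MAX_LOGPROBS` are hypothetical names; the 0-5 range is the one cited above. In the actual schema this would more likely be an integer field with bounds rather than a hand-written function.

```python
from typing import Any, Optional

MAX_LOGPROBS = 5  # upper bound as cited in this report


def parse_logprobs(value: Any) -> Optional[int]:
    """Validate the legacy `logprobs` parameter as an optional int in [0, 5]."""
    if value is None:
        return None
    # bool is a subclass of int in Python, so it has to be rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("logprobs must be an integer between 0 and 5")
    if not 0 <= value <= MAX_LOGPROBS:
        raise ValueError("logprobs must be an integer between 0 and 5")
    return value
```

Note that this sketch rejects `"logprobs": true`, which the current schema accepts. Whether to keep accepting the bool form for backward compatibility is a separate decision.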

3. /v1/completions streaming per-chunk id rotation

Each SSE data: chunk gets a fresh UUID (cmpl-d62f..., cmpl-8f7b...) instead of sharing one stream id. OpenAI streaming spec requires all chunks of one completion to share id. Clients that key on id for dedup/aggregation will break.
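A sketch of the intended behavior: generate the id once per stream and reuse it for every chunk. `sse_chunks` and `stream_ids` are hypothetical helpers, and the payload fields are an assumption modeled on the legacy completion chunk shape, not the actual rapid-mlx serializer.

```python
import json
import uuid
from typing import Iterable, Iterator, Set


def sse_chunks(pieces: Iterable[str], model: str) -> Iterator[str]:
    """Yield SSE lines for one completion, all sharing a single id."""
    stream_id = f"cmpl-{uuid.uuid4().hex}"  # created once, outside the loop
    for piece in pieces:
        payload = {
            "id": stream_id,
            "object": "text_completion",
            "model": model,
            "choices": [{"index": 0, "text": piece, "finish_reason": None}],
        }
        yield f"data: {json.dumps(payload)}\n\n"
    yield "data: [DONE]\n\n"


def stream_ids(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct ids seen in a stream; a correct stream has one."""
    ids = set()
    for line in lines:
        body = line.strip()
        if not body.startswith("data: ") or body == "data: [DONE]":
            continue
        ids.add(json.loads(body[len("data: "):])["id"])
    return ids
```

`stream_ids` can double as a regression check: run it over a captured stream and assert the result has exactly one element.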

4. n=0 accepted as n=1 silently

Boundary validation gap. n=0 should be rejected (Pydantic ge=1), but the route accepts it and returns 1 choice. n=null and n=1 behave correctly; n=2+ correctly returns 400.
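The boundary rule described above, as a stdlib sketch. `validate_n` is a hypothetical helper; `max_n=1` is an assumption inferred from n=2+ being rejected today. In the schema itself, a `ge=1` constraint on the field would express the lower bound.

```python
from typing import Any


def validate_n(value: Any, max_n: int = 1) -> int:
    """Return the number of choices to generate, rejecting out-of-range values."""
    if value is None:
        return 1  # n omitted or null falls back to a single choice
    # bool is a subclass of int, so reject it before the integer check.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("n must be an integer")
    if value < 1:
        raise ValueError("n must be >= 1")  # n=0 is rejected, not coerced to 1
    if value > max_n:
        raise ValueError(f"n must be <= {max_n}")
    return value
```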

5. qwen3.6 default thinking-marker leak in content

Default request (no enable_thinking flag) returns content like "Here's a thinking process:\n\n1. Analyze..." with no reasoning_content populated. Explicit enable_thinking:true parses correctly. Either qwen3.6's default differs from qwen3.5's, or our alias recommended_template_kwargs is omitting enable_thinking. UX hit: a user on defaults sees raw reasoning bleeding into content.
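A small detection sketch that a sweep could use to flag this leak. `has_thinking_leak` and `LEAK_PREFIXES` are hypothetical; the only marker listed is the one quoted above, and the message is assumed to be a dict with `content` and `reasoning_content` keys. This detects the symptom only and says nothing about which of the two causes applies.

```python
from typing import Any, Mapping

# Marker quoted in this report; other phrasings may exist and are not covered.
LEAK_PREFIXES = ("Here's a thinking process:",)


def has_thinking_leak(message: Mapping[str, Any]) -> bool:
    """True when reasoning-style text sits in content and reasoning_content is empty."""
    content = (message.get("content") or "").lstrip()
    reasoning = message.get("reasoning_content")
    return not reasoning and content.startswith(LEAK_PREFIXES)
```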


Out of scope for the in-flight fix (#460 is harmony-specific channel-routing); filing as a consolidated tracker for future iterations.
