Skip to content

Ai gateway: More Features#2965

Open
predic8 wants to merge 56 commits into
masterfrom
ai-gateway
Open

Ai gateway: More Features#2965
predic8 wants to merge 56 commits into
masterfrom
ai-gateway

Conversation

@predic8
Copy link
Copy Markdown
Member

@predic8 predic8 commented May 29, 2026

Summary by CodeRabbit

  • New Features

    • LLM Gateway Interceptor supporting Claude, OpenAI, and Google Gemini providers.
    • Token usage policies with input/output limits and per-user quota enforcement.
    • API key sharing and management for multi-user scenarios.
    • System prompt customization (append, prepend, overwrite, remove).
    • Server-Sent Events (SSE) parser for streaming responses.
    • Multipart form and XOP (MTOM) handling utilities.
  • Bug Fixes

    • Binary message detection in log output to prevent corrupted logs.
  • Documentation

    • LLM Gateway tutorial configurations and examples for Claude, OpenAI, and Google Gemini.

predic8 added 30 commits May 7, 2026 15:21
- Introduced thread-safe `AtomicLong` for token management.
- Synchronized reset logic in `AiApiLimit`.
- Improved error handling and null checks in OpenAI API interactions.
- Default-initialized user list in `SimpleAiApiStore`.
- Fixed getter for `AiApiStore` in interceptor.
- Removed `AiUtil` and replaced with modular `AiProvider` interface.
- Added provider implementations: `Claude`, `OpenAI`.
- Updated `OpenAIAPIInterceptor` to use configurable providers and enforce model restrictions.
- Introduced `NoAiApiLimit` for simplified limit management.
- Enhanced error handling with model validation in `OpenAiApiUtil`.
…bstraction

- Added `AiApiRequest` and `AiApiResponse` abstractions for request/response handling.
- Introduced `AbstractAiApiRequest` and `AbstractAiApiResponse` as base classes.
- Implemented providers: `Google`, `OpenAI`, and `Claude` with concrete request/response handling.
- Updated `AiProvider` to handle request/response creation.
- Refactored `OpenAIAPIInterceptor` to leverage request/response abstraction and enforce contract restrictions.
- Enhanced `JsonUtil` with helper methods for JSON body parsing and updates.
- Updated `AiApiStore` and related classes for improved usage tracking and user abstraction.
…ponents

- Updated `checkLimit` method to consider input and output tokens.
- Improved token calculation logic in AI request providers.
- Enhanced JSON parsing in `JsonUtil` with Optional for safer operations.
- Added detailed error handling in `OpenAIAPIInterceptor` for invalid requests.
- Refined token estimation logic with safety margins and JSON structure considerations.
- Synchronized token management methods in `AiApiLimit` to ensure thread safety.
- Adjusted log levels for `SimpleAiApiStore` to reduce verbosity.
- Added PostgreSQL dependency to the distribution.
- Updated logging configuration to set debug level for AI interceptors.
- Introduced `SSEUtil` for parsing Server-Sent Events (SSE) from chunks.
- Enhanced `AbstractAiApiRequest` to handle JSON requests conditionally.
- Deprecated and replaced `max_output_tokens` usage in specific providers.
- Improved stream support in `OpenAiAiRequest` with response usage tracking.
- Refactored token limit logic in `OpenAIAPIInterceptor` for better flow.
- Renamed `AiApiRequest` to `LLMRequest` and `AiApiResponse` to `LLMResponse`.
- Updated providers (`Google`, `OpenAI`, `Claude`) to align with `LLMProvider` interface.
- Refactored `OpenAIAPIInterceptor` to `LLMGatewayInterceptor` and related utilities.
- Removed `SSEUtil` and replaced with `SSEParser`.
- Improved streaming and token usage handling in `AbstractLLMResponse`.
- Modularized LLM responses for `Claude` and `OpenAI` providers.
- Replaced `LLMResponse` interface and `AbstractLLMResponse` with updated abstractions.
- Added `ChatCompletionsSSEParser` for advanced SSE chunk handling.
- Introduced specific SSE event classes: `ChatCompletionEvent`, `ChatCompletionDoneEvent`, `ResponsesApiEvent`.
- Renamed and restructured classes for consistency in AI namespace.
- Improved token usage tracking and event-based streaming.
…acking logic

- Deleted `ChatCompletionsSSEParser` and related classes/methods.
- Simplified `ChatCompletionEvent` by removing token usage parsing.
- Updated tool extraction logic in `AbstractLLMRequest` to handle function-specific tools.
… usage tracking

- Added `OpenAiLLMResponsesAPIResponse` for handling OpenAI Responses API.
- Refactored `OpenAiLLMResponse` to `OpenAiChatCompletionsLLMResponse`.
- Improved token usage calculations and SSE event processing.
- Updated `OpenAIProvider` to differentiate between Responses API and Chat Completions.
- Introduced SLF4J logger to `AbstractLLMRequest`.
- Added log message for handling non-JSON requests.
- Improved exception handling with informative runtime error.
…dling

- Added `10-Basic-LLM-Gateway.yaml` tutorial for setting up a basic LLM gateway.
- Introduced new classes `AbstractOpenAiLLMRequest` and `OpenAiLLMChatCompletionsRequest` for modularizing token estimation and API handling.
- Improved token usage tracking with client-requested max output tokens.
- Added detailed inline documentation across AI-related classes for better maintainability.
- Updated `membrane.cmd` and `membrane.sh` for enhanced gateway setup.
…acking

- Removed redundant `isResponsesAPI` variable in `OpenAIProvider`.
- Optimized tool extraction in `OpenAiLLMResponsesRequest` and `OpenAiLLMChatCompletionsRequest`.
- Updated `AiApiLimit` to support unlimited tokens with `MAX_VALUE`.
- Replaced `token` with `apiKey` in `AiApiUser` along with added `tokens` field.
- Improved JSON parsing logic in `JsonUtil` with better exception handling and logging.
- Adjusted output token parameter naming in multiple request classes for consistency.
…r handling

- Deleted `AiApiLimit` and `NoAiApiLimit` classes, consolidating token management into `SimpleAiApiStore`.
- Introduced `LLMErrorCreator` and its implementations (`OpenAiErrorCreator`, etc.) for reusable error generation.
- Refactored `LLMGatewayInterceptor` to utilize provider-specific error creators, simplifying token and model validation.
- Enhanced `SimpleAiApiStore` with token reset functionality and user-specific token tracking.
- Updated tutorials and examples to align with this refactored approach.
predic8 and others added 25 commits May 20, 2026 15:34
- Added `resetTokensUsedInPeriod` for user-specific token reset in `SimpleAiApiStore`.
- Improved inline documentation for methods across `AiApiUser`, `AiApiStore`, and `LLMGatewayInterceptor`.
- Updated parameter descriptions for clarity and consistency.
- Added synchronized blocks to `SimpleAiApiStore` for thread-safe access to user data.
- Introduced `invalidRequestError` to `LLMErrorCreator` and implemented it in `OpenAiErrorCreator`.
- Allowed unlimited tokens for users with `MAX_VALUE` in `AiApiUser`.
- Simplified logic in `LLMGatewayInterceptor` for token and model validation.
- Updated tutorial JSON with test input for validation.
- Added Claude-specific error handling with `ClaudeErrorCreator` and `ClaudeErrorResponse`.
- Introduced `10-Basic-LLM-Gateway.yaml` tutorial for Claude integration.
- Enhanced token usage tracking in `ClaudeLLMResponse`.
- Updated examples and tutorials to support both OpenAI and Claude.
…ring and token limit examples

- Added `10-Basic-LLM-Gateway.yaml` and `20-Sharing-API-Keys.yaml` tutorials for Google Gemini.
- Enhanced Claude tutorials with improved key handling and token limit examples.
- Introduced `GoogleErrorCreator` for detailed error handling in Google LLM Gateway.
- Updated `LLMGatewayInterceptor` and token tracking logic to reflect effective max token handling.
- Modified existing OpenAI and Claude examples for consistency and clarity.
…tutorials

- Introduced `AbstractAiTutorialTest` base class and provider-specific extensions for easier test creation.
- Added integration tests for basic gateway setups and API key sharing for Claude, OpenAI, and Google Gemini.
- Simulated upstream mock APIs to enable testing token limits, key handling, and input/output transformations.
…for LLM Gateway

- Ensure thread-safe access to users in `SimpleAiApiStore` with `List.copyOf`.
- Introduce `visibleRemaining` to handle non-negative token values in `GoogleErrorCreator`.
- Add configuration validation in `LLMGatewayInterceptor` to enforce API key substitution.
- Enhance token limit handling to adjust output tokens dynamically in `LLMGatewayInterceptor`.
- Update Google and Claude tutorials with clearer instructions for API key usage and token limits.
- Introduced `StreamingOpenAiLLMGatewayTutorialTest` with SSE mocking and validation.
- Added JSON fixtures (`stream.json`, `max-output-stream.json`) for testing streaming requests.
- Enhanced base test framework to support `text/event-stream` responses.
- Updated `LLMGatewayInterceptor` to handle streaming scenarios with capped tokens.
- Replaced raw API key placeholders with `TEST_API_KEY` constant in tutorial tests to ensure consistency.
- Added `TEST_API_KEY` to `AbstractAiTutorialTest` for upstream key substitution verification.
- Updated `log4j2.xml` to limit logging to `com.predic8.membrane.core.interceptor.ai`.
- Introduced PostgreSQL dependency in `pom.xml` for future enhancements.
…dd policies support

- Unified OpenAI and Chat Completions error handling under `ChatCompletionsErrorCreator`.
- Deprecated older OpenAI-specific classes in favor of `ChatCompletions` equivalents.
- Introduced detailed usage policies handling in `LLMGatewayInterceptor`.
- Updated YAML tutorials to reflect the new `policies` configuration model.
…licy-based token and model restrictions

- Added `Policies` class for defining restrictions on models, input tokens, and output tokens in the LLM Gateway.
- Replaced `maxInputTokens` and `maxOutputTokens` fields in `LLMGatewayInterceptor` with `Policies`.
- Updated YAML tutorials (OpenAI, Claude, Google) to use the new `policies` configuration.
- Replaced `Policies` class implementation with `DefaultPolicies` and `NullPolicies` for enhanced flexibility.
- Added `SystemPrompt` class to support dynamic system prompt management in LLM Gateway.
- Updated `LLMGatewayInterceptor` to delegate policy enforcement and system prompt handling to respective components.
- Extended providers (OpenAI, Claude, Google Gemini) with standardized system prompt methods (`getSystemPrompt`, `setSystemPrompt`, `removeSystemPrompt`).
- Enhanced test coverage with `AbstractLLMRequestTest` for API key handling and bearer token case insensitivity.
- Added `setSystemPrompt`, `removeSystemPrompt`, and `isChatCompletion` methods for enhanced prompt management.
- Refactored `SystemPrompt.Action` to remove unused `REJECT` action.
- Updated `AbstractLLMRequestTest` to validate new `SystemPrompt` behaviors.
…g across providers

- Added validation for token limits in `DefaultPolicies` and `AiApiUser` classes.
- Refactored system prompt methods (`setSystemPrompts`, `getRequestedMaxOutputTokens`) for consistency across LLM providers.
- Standardized concatenation logic for multi-prompt handling in providers (Claude, OpenAI, Google, Chat Completions).
- Enhanced error handling with `ConfigurationException` in token-related attributes.
…ultipart utility

- Introduced `AbstractLLMProvider` to streamline and centralize LLM request type handling.
- Added `AbstractLLMRequest` and extended support for specialized requests (e.g., `AudioRequest`, `ImagesRequest`, `FilesRequest`, `OrganizationRequest`).
- Implemented `MultipartUtil` to simplify handling of multipart HTTP messages.
- Updated providers (OpenAI, Claude, Chat Completions) to align with new abstractions and support for `IOException`.
- Enhanced test coverage with `MultipartUtilTest` and additional provider-specific unit tests.
…uest handling

- Introduced `AbstractModelInputRequest` to support model input parsing for JSON and multipart requests.
- Added `ChatCompletionsRequest` support in `OpenAIProvider` to handle `/v1/chat/completions` URI.
- Simplified `DefaultPolicies` by consolidating `checkModel` call.
- Minor cleanup in `MultipartUtilTest` and improved method signature in `XOPReconstitutor`.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 29, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b3478df6-af56-4d9f-87c2-b915846b2ca1

📥 Commits

Reviewing files that changed from the base of the PR and between a7caaa4 and 96bc067.

📒 Files selected for processing (3)
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractModelInputRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/BaseLLMRequest.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractModelInputRequest.java

📝 Walkthrough

Walkthrough

Introduces an LLM Gateway with providers (OpenAI/Claude/Google), policies, and API-key/user stores; adds multipart parsing and XOP reconstitution updates; implements SSE parsing; extends HTTP/JSON utilities; updates logging for binary bodies; and adds tutorials and integration/unit tests.

Changes

LLM Gateway and Multipart Utilities

Layer / File(s) Summary
LLM gateway core interceptor and policies
core/.../llmgateway/LLMGatewayInterceptor.java, .../Policies.java, .../DefaultPolicies.java, .../NullPolicies.java
Interceptor routes/model-input handling, token/model checks, system prompt mutation, and no-op/default policies.
Providers and request/response models
core/.../provider/*, .../openai/*, .../claude/*, .../google/*
Provider abstractions and concrete OpenAI/Claude/Google implementations for requests, responses (including SSE), and error creators.
Stores and usage accounting
core/.../llmgateway/store/*
API user model, in-memory and JDBC stores, usage record type, and token-limit checks.
SSE, multipart, and helpers
core/.../util/http/SSEParser.java, core/.../multipart/*, core/http/*, core/util/json/JsonUtil.java, core/util/jdbc/AbstractJdbcSupport.java, core/interceptor/log/LogInterceptor.java
SSE parser, multipart split and XOP reconstitution, header/message helpers, JSON body utilities, JDBC connection helper, binary logging guard.
Tutorials and tests
distribution/tutorials/..., core/src/test/..., distribution/src/test/...
YAML tutorials, fixtures, and integration/unit tests across providers, SSE, multipart, and utility behavior.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Gateway as LLMGatewayInterceptor
  participant Policy as Policies
  participant Store as AiApiStore
  participant Prov as LLMProvider
  participant Upstream

  Client->>Gateway: POST /v1/... (LLM request)
  Gateway->>Prov: getLLMRequest(exc)
  Gateway->>Store: getUser(apiKey)
  alt unknown user
    Gateway-->>Client: 401 authenticationFailed()
  else known user
    Gateway->>Policy: handleRequest(mir, exc)
    alt rejected
      Gateway-->>Client: 4xx provider error
    else allowed
      Gateway->>Upstream: Forward (API key/body)
      Upstream-->>Gateway: JSON/SSE response
      Gateway->>Prov: getLLMResponse(exc, postProcessor)
      Gateway->>Store: store(usage)
      Gateway-->>Client: Proxied response
    end
  end
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested labels

7.x

Suggested reviewers

  • rrayst

Poem

A rabbit hops through streams of text,
With tokens counted, prompts perplexed.
It splits the parts, rethreads the seams,
Gates the models, budgets dreams.
Through SSE it rides the flow—
“All clear!” it thumps, then off we go. 🐇✨

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ai-gateway
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch ai-gateway

@membrane-ci-server
Copy link
Copy Markdown

This pull request needs "/ok-to-test" from an authorized committer.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
distribution/tutorials/ai/llm-gateway/claude/membrane.cmd (1)

1-25: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use Windows (CRLF) line endings for this batch script.

This .cmd file currently has LF-only line endings. The Windows cmd.exe parser can mis-handle GOTO/CALL label resolution with LF-only files (512-byte boundary parsing bug), and this script depends on :search_up/:found/:notfound labels. Convert to CRLF and consider enforcing it via .gitattributes (e.g. *.cmd text eol=crlf) so it survives checkout/packaging.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/claude/membrane.cmd` around lines 1 -
25, The batch script uses LF-only line endings which can break cmd.exe label
parsing; convert this file to use Windows CRLF line endings and commit the
change so labels like :search_up, :found, and :notfound are preserved;
additionally add or update repository .gitattributes to enforce CRLF for .cmd
files (for example add a rule such as "*.cmd text eol=crlf") so files like
membrane.cmd and the call to scripts\run-membrane.cmd retain correct line
endings across checkouts.
distribution/tutorials/ai/llm-gateway/google/membrane.cmd (1)

1-25: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use CRLF line endings for distribution/tutorials/ai/llm-gateway/google/membrane.cmd (lines 1-25).

This .cmd file is LF-only (CRLF=0), which can lead to unpredictable batch parsing/behavior on Windows.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/google/membrane.cmd` around lines 1 -
25, This batch script uses LF-only endings which can break on Windows; convert
the file containing the :search_up / :found labels and the MEMBRANE_HOME logic
(the block that calls "%MEMBRANE_HOME%\scripts\run-membrane.cmd") to use CRLF
line endings, then commit the change and optionally add a .gitattributes rule
(e.g., for *.cmd text eol=crlf) to ensure future normalization.
🧹 Nitpick comments (9)
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMProvider.java (1)

17-32: 💤 Low value

Inconsistent URI matching (startsWith vs contains).

The chat/completions and files routes use startsWith, while images/audio/organization use contains. contains can match the substring anywhere (e.g., in a query string), routing requests unexpectedly. Consider using startsWith consistently with the /v1/... prefixes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMProvider.java`
around lines 17 - 32, The URI matching is inconsistent: replace the uses of
uri.contains(...) for ImagesRequest, AudioRequest, and OrganizationRequest with
uri.startsWith(...) so all /v1/... routes use prefix matching; locate the block
that obtains uri from exchange.getRequest().getUri() and update the conditions
that construct new ImagesRequest(exchange), new AudioRequest(exchange), and new
OrganizationRequest(exchange) to use startsWith to avoid incidental matches in
query strings or elsewhere.
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMRequest.java (1)

9-35: 💤 Low value

Bearer prefix matched without a delimiter; reuse the constant in setApiKey.

getApiKey() matches BEARER_PREFIX ("Bearer") via regionMatches and then strips the first 6 chars. A malformed header like Bearertoken (no space) passes the check and yields "token". Consider requiring a separating space/whitespace after the prefix. Also, setApiKey hardcodes "Bearer " instead of reusing BEARER_PREFIX.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMRequest.java`
around lines 9 - 35, Ensure setApiKey reuses the BEARER_PREFIX constant instead
of hardcoding "Bearer " (use BEARER_PREFIX + " " when adding the header) and
make getApiKey validate that the prefix is followed by a delimiter (e.g., check
the character at index BEARER_PREFIX.length() exists and is whitespace) before
extracting the token; if the delimiter is missing or token is empty return null.
Reference: BEARER_PREFIX, setApiKey, getApiKey,
exchange.getRequest().getHeader().
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractModelInputRequest.java (2)

54-56: 💤 Low value

INFO-level logging on every request part is noisy.

These run on the request hot path for each multipart part. Consider lowering to debug to avoid log volume in production.

♻️ Proposed change
-                    log.info("Part: name={} type={} size={}", part.getName(), part.getContentType(), part.getBody().length);
+                    log.debug("Part: name={} type={} size={}", part.getName(), part.getContentType(), part.getBody().length);
                     if ("model".equals(part.getName())) {
-                        log.info("Model: {}", part.getBodyAsString());
+                        log.debug("Model: {}", part.getBodyAsString());
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractModelInputRequest.java`
around lines 54 - 56, The current per-part info logs in
AbstractModelInputRequest (the log.info calls that print part details and the
model body) are too noisy on the request hot path; change those log.info calls
to log.debug for both the general part line and the "Model:" line so they only
appear when debug logging is enabled (locate the log statements that reference
part.getName(), part.getContentType(), part.getBody().length and
part.getBodyAsString()).

61-65: 💤 Low value

Merge duplicate catch blocks; reconsider wrapping IOException.

The two catch blocks have identical bodies and can be combined. Also, the constructor already declares throws IOException, so catching IOException only to rethrow it as a RuntimeException loses the checked-exception contract; consider letting it propagate.

♻️ Proposed change
-            } catch (IOException e) {
-                throw new RuntimeException(e);
-            } catch (ParseException e) {
-                throw new RuntimeException(e);
-            }
+            } catch (ParseException e) {
+                throw new RuntimeException(e);
+            }

(letting IOException propagate via the existing throws clause)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractModelInputRequest.java`
around lines 61 - 65, The two separate catch blocks in the constructor of
AbstractModelInputRequest both rethrow a RuntimeException and should be merged;
specifically remove the catch(IOException) so the declared throws IOException on
the constructor can allow IO errors to propagate, and keep a single
catch(ParseException e) that wraps/handles parse errors (e.g., throw new
RuntimeException(e) or a more specific exception) — update the constructor in
AbstractModelInputRequest to eliminate the duplicate catch and let IOException
propagate while still handling ParseException.
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/JSONRequest.java (1)

5-8: 💤 Low value

Filename/type mismatch.

The file is JSONRequest.java but declares interface JSONMessage. It compiles since the type is package-private, but renaming the file to JSONMessage.java improves discoverability and consistency. The sibling files also carry the Apache license header, which is absent here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/JSONRequest.java`
around lines 5 - 8, The file JSONRequest.java declares a package-private
interface named JSONMessage which mismatches the filename and lacks the
project's Apache license header; rename the file to JSONMessage.java (or
alternatively rename the interface to JSONRequest if that better fits intent) so
the public type and filename align, and add the standard Apache license header
to the top of the file to match sibling files; locate the interface declaration
(interface JSONMessage) to apply the change and update any imports/usages
referencing JSONMessage or JSONRequest accordingly.
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsRequest.java (1)

35-38: ⚡ Quick win

Fix the Jackson 2.14 argument-style claim.

Jackson 2.14+ only treats the withObject/withArray(String) argument as a JsonPointer when it is "" or starts with /; "messages" is therefore treated as a literal property name (not via fallback/deprecated behavior). For consistency/readability, consider using json.withObject("stream_options") instead of json.withObject("/stream_options") (same direct child field).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsRequest.java`
around lines 35 - 38, The Jackson call in ChatCompletionsRequest uses
json.withObject("/stream_options") which relies on deprecated/ambiguous
JsonPointer behavior; update the call to json.withObject("stream_options") so
Jackson treats it as the direct child property, leaving the rest of the logic
(setting streamOptions.put("include_usage", true)) unchanged.
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/AudioRequest.java (1)

1-1: ⚡ Quick win

Missing Apache license header.

AudioRequest.java, FilesRequest.java, and ImagesRequest.java start directly with the package declaration, unlike the other new files in this PR that carry the Apache 2.0 header. If the build enforces license headers (e.g. RAT/license-maven-plugin), this will fail CI.

📝 Add the standard header
+/* Copyright 2026 predic8 GmbH, www.predic8.com
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License. */
+
 package com.predic8.membrane.core.interceptor.llmgateway.provider.openai;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/AudioRequest.java`
at line 1, Add the standard Apache 2.0 license header to the top of the new
source files that currently start with the package declaration — specifically
AudioRequest.java, FilesRequest.java, and ImagesRequest.java — by inserting the
exact project's Apache license comment block (matching the other new files in
this PR) immediately above the package line so the header is present for license
checkers.
core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java (1)

127-127: 💤 Low value

Typo in private method name: getJsonObjectFromSteamgetJsonObjectFromStream.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java` at line
127, The private method name getJsonObjectFromSteam is misspelled; rename it to
getJsonObjectFromStream and update all internal references to that method
accordingly. Ensure the method signature (parameters/return type) remains
unchanged and update any unit tests or callers inside JsonUtil that invoke
getJsonObjectFromSteam to call getJsonObjectFromStream instead. After renaming,
run the build to catch any remaining references and adjust imports/visibility if
the rename affects access.
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java (1)

99-101: 💤 Low value

Base class carries Claude-specific defaults.

getUpstreamApiUrl() (anthropic) and mockResponse() (Claude JSON) are provider-specific defaults living in the generic base. OpenAI and Google get dedicated Abstract*TutorialTest subclasses, but Claude tests extend this base directly and silently rely on these defaults. For consistency and to avoid a future provider accidentally inheriting Claude behavior, consider an AbstractClaudeTutorialTest and making mockResponse()/getUpstreamApiUrl() abstract here.

Also applies to: 162-168

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java`
around lines 99 - 101, AbstractAiTutorialTest currently embeds Claude-specific
defaults in getUpstreamApiUrl() and mockResponse(), so make those
provider-specific by converting getUpstreamApiUrl() and mockResponse() in
AbstractAiTutorialTest into abstract methods, create a new
AbstractClaudeTutorialTest that extends AbstractAiTutorialTest and implements
getUpstreamApiUrl() returning "https://api.anthropic.com" and the
Claude-specific mockResponse() JSON, and update all Claude-related test classes
to extend AbstractClaudeTutorialTest instead of AbstractAiTutorialTest (also
move any other Claude defaults found around mockResponse()/getUpstreamApiUrl()
to the new subclass).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/AbstractLLMEvent.java`:
- Around line 48-53: The code calls JsonUtil.getJsonObject(sse.data()) and logs
when the returned Optional "opt" is empty but then proceeds to call opt.get(),
which causes NoSuchElementException; modify the logic in AbstractLLMEvent so
that after detecting opt.isEmpty() (the log.info("Unknown event format: {}",
sse.data()) branch) you return early (or otherwise skip further processing)
instead of continuing to use opt.get(); locate the block around
JsonUtil.getJsonObject, sse.data(), and opt.get() and add the early exit to
prevent calling opt.get() on an empty Optional.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/DefaultPolicies.java`:
- Line 104: Fix the JavaDoc typo in DefaultPolicies by changing the incorrect
tag "`@desciption`" to the correct "`@description`" in the JavaDoc block (the
comment that reads "* `@desciption` Restricts the models that can be used by the
gateway."). Update the JavaDoc for the DefaultPolicies class/method so the tag
reads "`@description` Restricts the models that can be used by the gateway." and
re-run formatter/compile to ensure no remaining Javadoc warnings.
- Around line 124-127: The setter setMaxOutputTokens in DefaultPolicies throws
IllegalArgumentException for negative values while setMaxInputTokens throws
ConfigurationException; change setMaxOutputTokens to throw
ConfigurationException for consistency during configuration parsing. Update the
method setMaxOutputTokens to validate maxOutputTokens < 0 and throw new
ConfigurationException with a clear message (matching the style used in
setMaxInputTokens), and ensure the class imports or references
ConfigurationException as used elsewhere in DefaultPolicies.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMResponse.java`:
- Around line 83-86: AbstractLLMResponse#isError() can NPE because the json
field may be null for streaming responses; update isError to safely handle a
null json (or ensure json is initialized in streaming constructors).
Specifically, modify AbstractLLMResponse.isError() to first check if json ==
null and return false (or an appropriate default) before accessing
json.get("error"), and/or initialize json in streaming response classes (e.g.,
ChatCompletionsResponse, GoogleLLMResponse, ClaudeLLMResponse) or in
OpenAiLLMResponsesResponse#processTerminalEvent so that json is always non-null
when isError() may be called. Ensure references to json and the isError method
are the only changes so callers need no update.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractModelInputRequest.java`:
- Around line 118-121: The default implementation of
AbstractModelInputRequest#getRequestedMaxOutputTokens returns 0 which
contradicts the ModelInputRequest contract that uses -1 to mean “no limit”;
change the default to return -1 so callers interpret unset as no limit. Update
the return value in AbstractModelInputRequest#getRequestedMaxOutputTokens to -1
(or alternatively update the ModelInputRequest documentation if you intend 0 to
mean no limit) to keep semantics consistent across enforcement logic and the
documented contract.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsErrorCreator.java`:
- Around line 32-38: The 429 response built in tokenLimitExceeded currently
tells clients to wait but doesn’t set a Retry-After header; update the
tokenLimitExceeded method in ChatCompletionsErrorCreator to add a "Retry-After"
header (value = tokenResetInSeconds as seconds) to the ResponseBuilder before
calling build(), keeping the existing
statusCode(429).json(envelope(...)).build() flow so SDKs and HTTP clients can
programmatically honor the wait window.
- Around line 32-66: In ChatCompletionsErrorCreator.tokenLimitExceeded, add a
Retry-After header to the Response using the tokenResetInSeconds value (e.g.,
set header "Retry-After" to tokenResetInSeconds) so the 429 response includes a
programmatic backoff hint; modify the Response builder in tokenLimitExceeded to
include the header before build() (keep the existing status, body and message
intact).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeProvider.java`:
- Line 28: Fix the typo in the provider description by updating the
javadoc/config description string in ClaudeProvider so "Anthroic" is changed to
"Anthropic"; locate the annotation or javadoc comment around the ClaudeProvider
class (symbol: ClaudeProvider) and correct the `@description` text to read
"(Experimental) Anthropic Claude provider configuration" so generated docs show
the correct vendor name.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ContentBlockStart.java`:
- Around line 23-32: ContentBlockStart.from currently unsafely casts
on.path("content_block") to ObjectNode which throws if the field is missing or
not an object; change the method to first assign JsonNode cb =
on.path("content_block") (add import com.fasterxml.jackson.databind.JsonNode),
check cb.isObject() before casting, and only call ToolUse.from((ObjectNode)cb)
when true—otherwise leave cbs fields unset or return the empty cbs to avoid
ClassCastException during SSE parsing.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAIProvider.java`:
- Around line 36-55: The getLLMResponse implementation currently only
special-cases "/v1/responses" and otherwise returns ChatCompletionsResponse,
causing non-chat endpoints (e.g. /v1/files, /v1/images, /v1/audio) to be parsed
incorrectly; update OpenAIProvider.getLLMResponse to mirror getLLMRequest (or
delegate to AbstractLLMProvider) by routing "/v1/files" -> the FilesResponse
implementation, "/v1/images" -> the ImagesResponse implementation, "/v1/audio"
-> the AudioResponse implementation (or return a clear unsupported/fast-fail
response if those response classes are not available), and keep "/v1/responses"
-> OpenAiLLMResponsesResponse and chat default -> ChatCompletionsResponse so
request and response routing are symmetric (refer to getLLMRequest,
AbstractLLMProvider, OpenAiLLMResponsesResponse, ChatCompletionsResponse,
FilesRequest/ImagesRequest/AudioRequest to locate corresponding handlers).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OrganizationRequest.java`:
- Line 1: Add the standard Apache 2.0 license header to OrganizationRequest.java
by inserting the same copyright/license block used in other classes in this
package (for example OpenAIProvider and AbstractOpenAiLLMRequest) at the very
top of the file, above the package declaration; ensure the header text exactly
matches the project’s existing Apache 2.0 block (including copyright year and
owner) so the file is consistent with the repository licensing.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/JDBCAiApiUsageStore.java`:
- Around line 71-84: The JDBCAiApiUsageStore currently returns placeholders that
break auth and limit logic; update JDBCAiApiUsageStore so its unimplemented
methods fail fast: change getUser(String token), checkLimit(AiApiUser user, long
inputTokens, long outputTokens), and getRemainingResetTime() to throw
UnsupportedOperationException (with a clear message like "JDBCAiApiUsageStore
not implemented") instead of returning Optional.empty() or 0; this prevents
LLMGatewayInterceptor from treating the stub as valid behavior until a real
implementation is provided.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/SimpleAiApiStore.java`:
- Around line 59-64: In SimpleAiApiStore.getUser, the stream filter uses
u.getApiKey().equals(token) which can NPE if AiApiUser.apiKey is null; change
the null-unsafe comparison to a null-safe check (e.g. use
Objects.equals(u.getApiKey(), token) or compare token.equals(u.getApiKey())
after ensuring token is non-null) inside the synchronized(lock) block so
users.stream().filter(...) cannot throw; update the getUser method to perform
this null-safe comparison when locating the matching AiApiUser.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/SystemPrompt.java`:
- Around line 46-49: The REMOVE branch in SystemPrompt currently logs the system
prompt content with log.info; change that to log.debug for consistency with
other branches and to avoid exposing sensitive prompt text — update the logging
call in the REMOVE case that currently reads log.info("Removing instructions:
{}", instructions) to use log.debug, leaving the call to
mir.removeSystemPrompt() unchanged so behavior remains the same.

In `@core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java`:
- Around line 137-160: The parser misinterprets a CRLF split across chunks: when
a chunk ends with '\r' the current readLine consumes it and the next chunk's
leading '\n' becomes a spurious empty line; modify SSEParser to track a boolean
pendingCr flag (add field pendingCr) such that findLineEnd/readLine logic and
the top of parse skip a leading '\n' if pendingCr is true, and when readLine
encounters a terminator '\r' at the end of the buffer (i.e. lone trailing CR)
set pendingCr = true instead of treating it as a completed line; also clear
pendingCr whenever a non-'\n' character is encountered or when you explicitly
consume a paired '\r\n'.

In `@distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml`:
- Line 3: Fix the typo in the tutorial title header: change "Antropic" to
"Anthropic" in the title string within the 10-Basic-LLM-Gateway.yaml tutorial
header so the displayed tutorial name correctly reads "Basic LLM Gateway
(Anthropic Claude)".

In `@distribution/tutorials/ai/llm-gateway/openai/membrane.cmd`:
- Around line 1-24: The batch script membrane.cmd currently uses LF-only line
endings which can fail on Windows; open the membrane.cmd file (look for labels
like :search_up, :found, :notfound and calls to scripts\run-membrane.cmd) and
convert/save the file with CRLF (\r\n) line endings so Windows cmd.exe
interprets it correctly before merging.

---

Outside diff comments:
In `@distribution/tutorials/ai/llm-gateway/claude/membrane.cmd`:
- Around line 1-25: The batch script uses LF-only line endings which can break
cmd.exe label parsing; convert this file to use Windows CRLF line endings and
commit the change so labels like :search_up, :found, and :notfound are
preserved; additionally add or update repository .gitattributes to enforce CRLF
for .cmd files (for example add a rule such as "*.cmd text eol=crlf") so files
like membrane.cmd and the call to scripts\run-membrane.cmd retain correct line
endings across checkouts.

In `@distribution/tutorials/ai/llm-gateway/google/membrane.cmd`:
- Around line 1-25: This batch script uses LF-only endings which can break on
Windows; convert the file containing the :search_up / :found labels and the
MEMBRANE_HOME logic (the block that calls
"%MEMBRANE_HOME%\scripts\run-membrane.cmd") to use CRLF line endings, then
commit the change and optionally add a .gitattributes rule (e.g., for *.cmd text
eol=crlf) to ensure future normalization.

---

Nitpick comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMProvider.java`:
- Around line 17-32: The URI matching is inconsistent: replace the uses of
uri.contains(...) for ImagesRequest, AudioRequest, and OrganizationRequest with
uri.startsWith(...) so all /v1/... routes use prefix matching; locate the block
that obtains uri from exchange.getRequest().getUri() and update the conditions
that construct new ImagesRequest(exchange), new AudioRequest(exchange), and new
OrganizationRequest(exchange) to use startsWith to avoid incidental matches in
query strings or elsewhere.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMRequest.java`:
- Around line 9-35: Ensure setApiKey reuses the BEARER_PREFIX constant instead
of hardcoding "Bearer " (use BEARER_PREFIX + " " when adding the header) and
make getApiKey validate that the prefix is followed by a delimiter (e.g., check
the character at index BEARER_PREFIX.length() exists and is whitespace) before
extracting the token; if the delimiter is missing or token is empty return null.
Reference: BEARER_PREFIX, setApiKey, getApiKey,
exchange.getRequest().getHeader().

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractModelInputRequest.java`:
- Around line 54-56: The current per-part info logs in AbstractModelInputRequest
(the log.info calls that print part details and the model body) are too noisy on
the request hot path; change those log.info calls to log.debug for both the
general part line and the "Model:" line so they only appear when debug logging
is enabled (locate the log statements that reference part.getName(),
part.getContentType(), part.getBody().length and part.getBodyAsString()).
- Around line 61-65: The two separate catch blocks in the constructor of
AbstractModelInputRequest both rethrow a RuntimeException and should be merged;
specifically remove the catch(IOException) so the declared throws IOException on
the constructor can allow IO errors to propagate, and keep a single
catch(ParseException e) that wraps/handles parse errors (e.g., throw new
RuntimeException(e) or a more specific exception) — update the constructor in
AbstractModelInputRequest to eliminate the duplicate catch and let IOException
propagate while still handling ParseException.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsRequest.java`:
- Around line 35-38: The Jackson call in ChatCompletionsRequest uses
json.withObject("/stream_options") which relies on deprecated/ambiguous
JsonPointer behavior; update the call to json.withObject("stream_options") so
Jackson treats it as the direct child property, leaving the rest of the logic
(setting streamOptions.put("include_usage", true)) unchanged.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/JSONRequest.java`:
- Around line 5-8: The file JSONRequest.java declares a package-private
interface named JSONMessage which mismatches the filename and lacks the
project's Apache license header; rename the file to JSONMessage.java (or
alternatively rename the interface to JSONRequest if that better fits intent) so
the public type and filename align, and add the standard Apache license header
to the top of the file to match sibling files; locate the interface declaration
(interface JSONMessage) to apply the change and update any imports/usages
referencing JSONMessage or JSONRequest accordingly.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/AudioRequest.java`:
- Line 1: Add the standard Apache 2.0 license header to the top of the new
source files that currently start with the package declaration — specifically
AudioRequest.java, FilesRequest.java, and ImagesRequest.java — by inserting the
exact project's Apache license comment block (matching the other new files in
this PR) immediately above the package line so the header is present for license
checkers.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Line 127: The private method name getJsonObjectFromSteam is misspelled; rename
it to getJsonObjectFromStream and update all internal references to that method
accordingly. Ensure the method signature (parameters/return type) remains
unchanged and update any unit tests or callers inside JsonUtil that invoke
getJsonObjectFromSteam to call getJsonObjectFromStream instead. After renaming,
run the build to catch any remaining references and adjust imports/visibility if
the rename affects access.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java`:
- Around line 99-101: AbstractAiTutorialTest currently embeds Claude-specific
defaults in getUpstreamApiUrl() and mockResponse(), so make those
provider-specific by converting getUpstreamApiUrl() and mockResponse() in
AbstractAiTutorialTest into abstract methods, create a new
AbstractClaudeTutorialTest that extends AbstractAiTutorialTest and implements
getUpstreamApiUrl() returning "https://api.anthropic.com" and the
Claude-specific mockResponse() JSON, and update all Claude-related test classes
to extend AbstractClaudeTutorialTest instead of AbstractAiTutorialTest (also
move any other Claude defaults found around mockResponse()/getUpstreamApiUrl()
to the new subclass).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 096c388f-1806-4248-b6c0-6e900aa81dc8

📥 Commits

Reviewing files that changed from the base of the PR and between c74ea86 and a7caaa4.

📒 Files selected for processing (131)
  • annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java
  • core/src/main/java/com/predic8/membrane/core/http/Header.java
  • core/src/main/java/com/predic8/membrane/core/http/Message.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/AbstractLLMEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/ChatCompletionDoneEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/ChatCompletionEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/DefaultPolicies.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/LLMGatewayInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/NullPolicies.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/Policies.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/ResponsesApiEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/SystemPrompt.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMMessage.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractModelInputRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/JSONRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/LLMErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/LLMProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/LLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/LLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/ModelInputRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeErrorResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ContentBlockDelta.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ContentBlockStart.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/MessageDelta.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ToolUse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/AbstractOpenAiLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/AudioRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/FilesRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/ImagesRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAIChatCompletionsRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAIProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAiLLMResponsesRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAiLLMResponsesResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OrganizationRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/Usage.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/log/LogInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeToolSupport.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeUtils.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MCPUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpPayloadSanitizer.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSchemaBuilder.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionContext.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionManager.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolDefinition.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolHandler.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolRegistry.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServer.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequest.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponse.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCUtil.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialize.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitializeResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialized.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPNotification.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPPing.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPRequest.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCall.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCallResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsList.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsListResponse.java
  • core/src/main/java/com/predic8/membrane/core/multipart/MultipartUtil.java
  • core/src/main/java/com/predic8/membrane/core/multipart/Part.java
  • core/src/main/java/com/predic8/membrane/core/multipart/XOPReconstitutor.java
  • core/src/main/java/com/predic8/membrane/core/security/AbstractSecurityScheme.java
  • core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java
  • core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java
  • core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
  • core/src/test/java/com/predic8/membrane/core/http/HeaderTest.java
  • core/src/test/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractModelInputRequestTest.java
  • core/src/test/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServerTest.java
  • core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequestTest.java
  • core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponseTest.java
  • core/src/test/java/com/predic8/membrane/core/mcp/MCPInitializeTest.java
  • core/src/test/java/com/predic8/membrane/core/multipart/MultipartUtilTest.java
  • core/src/test/java/com/predic8/membrane/core/util/http/SSEParserTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/claude/BasicClaudeLLMGatewayTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/claude/SharingApiKeysTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/AbstractGoogleTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/BasicGoogleLLMGatewayTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/AbstractOpenAiTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/BasicOpenAiLLMGatewayTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/StreamingOpenAiLLMGatewayTutorialTest.java
  • distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/claude/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/claude/max-input.json
  • distribution/tutorials/ai/llm-gateway/claude/max-output.json
  • distribution/tutorials/ai/llm-gateway/claude/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/claude/membrane.sh
  • distribution/tutorials/ai/llm-gateway/claude/simple.json
  • distribution/tutorials/ai/llm-gateway/claude/wrong-model.json
  • distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/google/max-input.json
  • distribution/tutorials/ai/llm-gateway/google/max-output.json
  • distribution/tutorials/ai/llm-gateway/google/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/google/membrane.sh
  • distribution/tutorials/ai/llm-gateway/google/simple.json
  • distribution/tutorials/ai/llm-gateway/openai/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/openai/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/openai/max-input.json
  • distribution/tutorials/ai/llm-gateway/openai/max-output-stream.json
  • distribution/tutorials/ai/llm-gateway/openai/max-output.json
  • distribution/tutorials/ai/llm-gateway/openai/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/openai/membrane.sh
  • distribution/tutorials/ai/llm-gateway/openai/simple.json
  • distribution/tutorials/ai/llm-gateway/openai/stream.json
  • distribution/tutorials/ai/llm-gateway/openai/wrong-model.json

Comment on lines +48 to +53
var opt = JsonUtil.getJsonObject(sse.data());
if (opt.isEmpty()) {
log.info("Unknown event format: {}", sse.data());
}

var json = opt.get();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Missing early return causes NoSuchElementException.

When JsonUtil.getJsonObject(sse.data()) returns an empty Optional (line 48), the code logs the unknown format (line 50) but continues to line 53, where opt.get() throws NoSuchElementException.

🐛 Proposed fix
     var opt = JsonUtil.getJsonObject(sse.data());
     if (opt.isEmpty()) {
         log.info("Unknown event format: {}", sse.data());
+        return null;
     }

     var json = opt.get();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/AbstractLLMEvent.java`
around lines 48 - 53, The code calls JsonUtil.getJsonObject(sse.data()) and logs
when the returned Optional "opt" is empty but then proceeds to call opt.get(),
which causes NoSuchElementException; modify the logic in AbstractLLMEvent so
that after detecting opt.isEmpty() (the log.info("Unknown event format: {}",
sse.data()) branch) you return early (or otherwise skip further processing)
instead of continuing to use opt.get(); locate the block around
JsonUtil.getJsonObject, sse.data(), and opt.get() and add the early exit to
prevent calling opt.get() on an empty Optional.


/**
* @param models List of models that can be used by the gateway.
* @desciption Restricts the models that can be used by the gateway.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Typo in JavaDoc tag.

The tag @desciption should be @description.

📝 Proposed fix
-     * `@desciption` Restricts the models that can be used by the gateway.
+     * `@description` Restricts the models that can be used by the gateway.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
* @desciption Restricts the models that can be used by the gateway.
* `@description` Restricts the models that can be used by the gateway.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/DefaultPolicies.java`
at line 104, Fix the JavaDoc typo in DefaultPolicies by changing the incorrect
tag "`@desciption`" to the correct "`@description`" in the JavaDoc block (the
comment that reads "* `@desciption` Restricts the models that can be used by the
gateway."). Update the JavaDoc for the DefaultPolicies class/method so the tag
reads "`@description` Restricts the models that can be used by the gateway." and
re-run formatter/compile to ensure no remaining Javadoc warnings.

Comment on lines +124 to +127
public void setMaxOutputTokens(int maxOutputTokens) {
if (maxOutputTokens < 0) {
throw new IllegalArgumentException("maxOutputTokens must be >= 0");
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Inconsistent exception type for configuration validation.

setMaxOutputTokens throws IllegalArgumentException for the >= 0 constraint, while setMaxInputTokens (lines 141-144) throws ConfigurationException for the same constraint. Since both are @MCAttribute setters invoked during configuration parsing, they should throw the same exception type. Prefer ConfigurationException to align with configuration-time validation.

🔧 Proposed fix
     `@MCAttribute`
     public void setMaxOutputTokens(int maxOutputTokens) {
         if (maxOutputTokens < 0) {
-            throw new IllegalArgumentException("maxOutputTokens must be >= 0");
+            throw new ConfigurationException("maxOutputTokens must be >= 0");
         }
         this.maxOutputTokens = maxOutputTokens;
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/DefaultPolicies.java`
around lines 124 - 127, The setter setMaxOutputTokens in DefaultPolicies throws
IllegalArgumentException for negative values while setMaxInputTokens throws
ConfigurationException; change setMaxOutputTokens to throw
ConfigurationException for consistency during configuration parsing. Update the
method setMaxOutputTokens to validate maxOutputTokens < 0 and throw new
ConfigurationException with a clear message (matching the style used in
setMaxInputTokens), and ensure the class imports or references
ConfigurationException as used elsewhere in DefaultPolicies.

Comment on lines +83 to +86
@Override
public boolean isError() {
return json.get("error") != null && !json.get("error").isNull();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find call sites of isError() and how streaming subclasses set json
rg -nP '\bisError\s*\(' --type=java -C3
rg -nP '\bjson\s*=' --type=java -g '*LLMResponse*.java' -C2

Repository: membrane/api-gateway

Length of output: 4854


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMResponse.java"

# Show the relevant parts of AbstractLLMResponse (field + constructor branches + isError)
sed -n '1,140p' "$FILE" | cat -n

# Find all call sites of isError() on LLMResponse / AbstractLLMResponse
rg -nP '(\.|\b)isError\s*\(\s*\)' --type=java -S core/src/main/java
rg -nP 'isError\s*\(\s*\)' --type=java -S core/src/test/java

# Also ensure no other LLMResponse implementations override isError or initialize json differently
rg -nP 'class\s+\w+LLMResponse' --type=java core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider
rg -nP '`@Override`\s+public\s+boolean\s+isError\s*\(' --type=java core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider

Repository: membrane/api-gateway

Length of output: 5568


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect streaming terminal event handlers for all AbstractLLMResponse subclasses
for f in core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/*/*LLM*Response.java core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/*.java; do
  if [[ -f "$f" ]]; then
    rg -n "class .*LLMResponse|processTerminalEvent|json\s*=" "$f" || true
  fi
done

# 2) Find where LLMResponse.isError() is called
# (restrict to llmgateway-related packages to avoid unrelated isError implementations)
rg -n --hidden --glob 'core/src/main/java/**' \
  '\.isError\s*\(\s*\)' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S -C2 || true

# 3) Also catch direct 'isError()' usage where variable names might not be explicit
rg -n --hidden --glob 'core/src/main/java/**' \
  'isError\s*\(\s*\)' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S -C2 || true

Repository: membrane/api-gateway

Length of output: 2076


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect remainder of AbstractLLMResponse around json==null
FILE="core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMResponse.java"
sed -n '70,140p' "$FILE" | cat -n

# Inspect processTerminalEvent implementations
for f in \
  core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAiLLMResponsesResponse.java \
  core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleLLMResponse.java \
  core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeLLMResponse.java
do
  echo "==== $f ===="
  sed -n '1,140p' "$f" | cat -n
done

# Broad search: call sites of isError() in llmgateway package
rg -n --type=java '\.isError\s*\(\s*\)' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S -C2

# Also search for variables typed as LLMResponse and any subsequent isError usage nearby
rg -n --type=java 'LLMResponse\s+\w+' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S -C1

Repository: membrane/api-gateway

Length of output: 11327


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for isError() usage across repo (not just llmgateway), focusing on LLMResponse type contexts.
rg -n --type=java '\.isError\s*\(\s*\)' . -S

# Narrow: any local/field named response typed as LLMResponse
rg -n --type=java '(LLMResponse)\s+\w+\s*=\s*.*;' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true

# Narrow: patterns like "if (.*isError())" in llmgateway code
rg -n --type=java 'if\s*\(.*\.isError\s*\(\s*\)\s*\)' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S -C2 || true

Repository: membrane/api-gateway

Length of output: 207


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Find all instantiations of AbstractLLMResponse subclasses (constructors pass postProcessor)
rg -n --type=java 'new\s+\w+LLM\w*Response\s*\(\s*exchange\s*,\s*([^)]+)\)' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true
rg -n --type=java 'new\s+[A-Za-z0-9_]*LLMResponsesResponse\s*\(' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true
rg -n --type=java 'new\s+ClaudeLLMResponse\s*\(' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true
rg -n --type=java 'new\s+GoogleLLMResponse\s*\(' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true
rg -n --type=java 'new\s+OpenAiLLMResponsesResponse\s*\(' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true

# 2) Search for any usage of isError() via method references / lambdas
rg -n --type=java '::\s*isError\b' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true
rg -n --type=java '\bisError\s*\b' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true

# 3) Broaden: any isError() calls anywhere (method refs might have been missed earlier)
rg -n --type=java '\bisError\s*\(\s*\)' . -S || true
rg -n --type=java '\bisError\s*\b' . -S | head -n 200 || true

Repository: membrane/api-gateway

Length of output: 3820


🏁 Script executed:

#!/bin/bash
set -euo pipefail

for f in \
  core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAIProvider.java \
  core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleProvider.java \
  core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeProvider.java
do
  echo "==== $f ===="
  sed -n '1,220p' "$f" | cat -n
done

Repository: membrane/api-gateway

Length of output: 7636


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find where the LLM gateway executes postProcessor and whether it calls LLMResponse.isError()
rg -n --type=java 'postProcessor' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S

# Find where LLMProvider#getLLMResponse is called (look for "getLLMResponse(" usage)
rg -n --type=java 'getLLMResponse\s*\(' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S

# If any postProcessor lambdas reference isError, find them
rg -n --type=java 'postProcessor.*isError|isError.*postProcessor' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true

# Also scan other llm-gateway related packages for isError usage on LLMResponse
rg -n --type=java 'LLMResponse.*isError|\.isError\s*\(\s*\)' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true

Repository: membrane/api-gateway

Length of output: 5564


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect LLMGatewayInterceptor where it creates postProcessor lambda
FILE="core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/LLMGatewayInterceptor.java"
sed -n '120,230p' "$FILE" | cat -n

# Inspect ChatCompletionsResponse to see how it initializes json for streaming
FILE2="core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsResponse.java"
sed -n '1,220p' "$FILE2" | cat -n

Repository: membrane/api-gateway

Length of output: 7455


🏁 Script executed:

#!/bin/bash
set -euo pipefail

for f in \
  core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsResponse.java \
  core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleLLMResponse.java \
  core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeLLMResponse.java
do
  echo "==== $f (json assignments) ===="
  rg -n 'json\s*=' "$f" || true
  echo "==== $f (processTerminalEvent override) ===="
  rg -n 'processTerminalEvent' "$f" || true
  echo
done

# Confirm no call sites of LLMResponse.isError() beyond method definitions/tests
rg -n --type=java '\.isError\s*\(\s*\)' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S || true
rg -n --type=java '\.isError\s*\(\s*\)' . -S | head -n 50 || true

Repository: membrane/api-gateway

Length of output: 1169


Potential NPE in AbstractLLMResponse#isError() for some streaming providers where json is never initialized.

File: core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMResponse.java
Lines: 83-86

`@Override`
public boolean isError() {
    return json.get("error") != null && !json.get("error").isNull();
}

json is not set in the streaming constructor path of AbstractLLMResponse; for streaming responses, it’s only initialized by OpenAiLLMResponsesResponse#processTerminalEvent. Other streaming implementations (e.g., ChatCompletionsResponse, GoogleLLMResponse, ClaudeLLMResponse) do not assign json, so isError() would throw an NPE if called for those responses. (No current call sites of LLMResponse.isError() were found in the repository beyond the method/test definitions.)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMResponse.java`
around lines 83 - 86, AbstractLLMResponse#isError() can NPE because the json
field may be null for streaming responses; update isError to safely handle a
null json (or ensure json is initialized in streaming constructors).
Specifically, modify AbstractLLMResponse.isError() to first check if json ==
null and return false (or an appropriate default) before accessing
json.get("error"), and/or initialize json in streaming response classes (e.g.,
ChatCompletionsResponse, GoogleLLMResponse, ClaudeLLMResponse) or in
OpenAiLLMResponsesResponse#processTerminalEvent so that json is always non-null
when isError() may be called. Ensure references to json and the isError method
are the only changes so callers need no update.

Comment on lines +118 to +121
@Override
public long getRequestedMaxOutputTokens() {
return 0;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -nP '\bgetRequestedMaxOutputTokens\s*\(' --type=java -C3

Repository: membrane/api-gateway

Length of output: 8974


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show computeEffectiveMaxOutputTokens implementation
rg -n "computeEffectiveMaxOutputTokens" -S core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/LLMGatewayInterceptor.java
sed -n '1,220p' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/LLMGatewayInterceptor.java | nl -ba | sed -n '120,190p'

# Show DefaultPolicies.checkTokenLimits logic around requestedMaxOutputTokens handling
sed -n '1,200p' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/DefaultPolicies.java | nl -ba | sed -n '50,130p'

# Find and inspect store.checkLimit / budgeting usage of effectiveMaxTokens
rg -n "checkLimit\\(" -S core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store
rg -n "effectiveMaxTokens" -S core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway

Repository: membrane/api-gateway

Length of output: 365


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show computeEffectiveMaxOutputTokens + surrounding call site
sed -n '120,190p' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/LLMGatewayInterceptor.java

# Show DefaultPolicies.checkTokenLimits logic around requestedMaxOutputTokens handling
sed -n '60,120p' core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/DefaultPolicies.java

# Show store.checkLimit/budgeting logic using effectiveMaxTokens
rg -n "checkLimit\\(" core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store -S
rg -n "effectiveMaxTokens" core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway -S

Repository: membrane/api-gateway

Length of output: 6413


Align AbstractModelInputRequest#getRequestedMaxOutputTokens() with the ModelInputRequest “no limit” contract (-1)

In core/.../AbstractModelInputRequest.java (lines 118-121), getRequestedMaxOutputTokens() returns 0, but gateway enforcement treats requestedMaxOutputTokens <= 0 as “no max requested” (so 0 won’t be interpreted as a zero-token limit). Still, this contradicts the documented contract (“-1 if no limit is set”); return -1 by default (or update the contract/doc) to keep semantics consistent.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractModelInputRequest.java`
around lines 118 - 121, The default implementation of
AbstractModelInputRequest#getRequestedMaxOutputTokens returns 0 which
contradicts the ModelInputRequest contract that uses -1 to mean “no limit”;
change the default to return -1 so callers interpret unset as no limit. Update
the return value in AbstractModelInputRequest#getRequestedMaxOutputTokens to -1
(or alternatively update the ModelInputRequest documentation if you intend 0 to
mean no limit) to keep semantics consistent across enforcement logic and the
documented contract.

Comment on lines +59 to +64
@Override
public Optional<AiApiUser> getUser(String token) {
synchronized (lock) {
return users.stream().filter(u -> u.getApiKey().equals(token)).findFirst();
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Possible NPE when a configured user has no apiKey.

AiApiUser.apiKey has no default and is only populated via the optional @MCAttribute. If any user element omits apiKey, u.getApiKey().equals(token) throws NullPointerException during lookup.

🛡️ Null-safe comparison
-            return users.stream().filter(u -> u.getApiKey().equals(token)).findFirst();
+            return users.stream().filter(u -> java.util.Objects.equals(u.getApiKey(), token)).findFirst();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@Override
public Optional<AiApiUser> getUser(String token) {
synchronized (lock) {
return users.stream().filter(u -> u.getApiKey().equals(token)).findFirst();
}
}
`@Override`
public Optional<AiApiUser> getUser(String token) {
synchronized (lock) {
return users.stream().filter(u -> java.util.Objects.equals(u.getApiKey(), token)).findFirst();
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/SimpleAiApiStore.java`
around lines 59 - 64, In SimpleAiApiStore.getUser, the stream filter uses
u.getApiKey().equals(token) which can NPE if AiApiUser.apiKey is null; change
the null-unsafe comparison to a null-safe check (e.g. use
Objects.equals(u.getApiKey(), token) or compare token.equals(u.getApiKey())
after ensuring token is non-null) inside the synchronized(lock) block so
users.stream().filter(...) cannot throw; update the getUser method to perform
this null-safe comparison when locating the matching AiApiUser.

Comment on lines +46 to +49
case REMOVE -> {
log.info("Removing instructions: {}", instructions);
mir.removeSystemPrompt();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid logging system prompt content at INFO level.

The REMOVE branch logs the existing system prompt at info, whereas the other branches log at debug. System prompts can carry sensitive/proprietary instructions, so prefer debug here for consistency and to avoid exposure in normal logs.

🔒 Proposed fix
             case REMOVE -> {
-                log.info("Removing instructions: {}", instructions);
+                log.debug("Removing instructions: {}", instructions);
                 mir.removeSystemPrompt();
             }
Based on learnings: only DEBUG-level logging of sensitive raw content is exempt; sensitive data at INFO/WARN/ERROR should still be flagged.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case REMOVE -> {
log.info("Removing instructions: {}", instructions);
mir.removeSystemPrompt();
}
case REMOVE -> {
log.debug("Removing instructions: {}", instructions);
mir.removeSystemPrompt();
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/SystemPrompt.java`
around lines 46 - 49, The REMOVE branch in SystemPrompt currently logs the
system prompt content with log.info; change that to log.debug for consistency
with other branches and to avoid exposing sensitive prompt text — update the
logging call in the REMOVE case that currently reads log.info("Removing
instructions: {}", instructions) to use log.debug, leaving the call to
mir.removeSystemPrompt() unchanged so behavior remains the same.

Comment on lines +137 to +160
private static int findLineEnd(StringBuilder buffer) {
for (int i = 0; i < buffer.length(); i++) {
char c = buffer.charAt(i);
if (c == '\n' || c == '\r') {
return i;
}
}
return -1;
}

private static String readLine(StringBuilder buffer, int lineEnd) {
String line = buffer.substring(0, lineEnd);

int removeUntil = lineEnd + 1;

if (lineEnd + 1 < buffer.length()
&& buffer.charAt(lineEnd) == '\r'
&& buffer.charAt(lineEnd + 1) == '\n') {
removeUntil++;
}

buffer.delete(0, removeUntil);
return line;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Cross-chunk \r\n split produces a spurious blank line.

readLine only collapses a \r\n pair when both characters are already buffered. If a chunk ends with \r and the next chunk begins with \n, the \r is consumed as a terminator, then the leading \n of the following chunk is read as an empty line — which parse interprets as the event-terminating blank line and finalizes the current event prematurely. Per the SSE spec, CRLF is a single line terminator.

Track a pending carriage return so a lone leading \n after a consumed \r is skipped rather than treated as a blank line.

🐛 Sketch of a fix
     private final StringBuilder buffer = new StringBuilder();
+    private boolean pendingCr;
...
-        int lineEnd;
-        while ((lineEnd = findLineEnd(buffer)) >= 0) {
+        // Drop a leading '\n' that completes a '\r\n' split across chunks
+        if (pendingCr) {
+            if (!buffer.isEmpty() && buffer.charAt(0) == '\n') {
+                buffer.deleteCharAt(0);
+            }
+            pendingCr = false;
+        }
+        int lineEnd;
+        while ((lineEnd = findLineEnd(buffer)) >= 0) {

and in readLine, set pendingCr = true when the terminator is a trailing lone \r at the buffer end.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java` around
lines 137 - 160, The parser misinterprets a CRLF split across chunks: when a
chunk ends with '\r' the current readLine consumes it and the next chunk's
leading '\n' becomes a spurious empty line; modify SSEParser to track a boolean
pendingCr flag (add field pendingCr) such that findLineEnd/readLine logic and
the top of parse skip a leading '\n' if pendingCr is true, and when readLine
encounters a terminator '\r' at the end of the buffer (i.e. lone trailing CR)
set pendingCr = true instead of treating it as a completed line; also clear
pendingCr whenever a non-'\n' character is encountered or when you explicitly
consume a paired '\r\n'.

@@ -0,0 +1,28 @@
# yaml-language-server: $schema=https://www.membrane-api.io/v7.2.1.json
#
# Tutorial: Basic LLM Gateway (Antropic Claude)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Typo: "Antropic" → "Anthropic".

✏️ Proposed fix
-# Tutorial: Basic LLM Gateway (Antropic Claude)
+# Tutorial: Basic LLM Gateway (Anthropic Claude)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Tutorial: Basic LLM Gateway (Antropic Claude)
# Tutorial: Basic LLM Gateway (Anthropic Claude)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml` at
line 3, Fix the typo in the tutorial title header: change "Antropic" to
"Anthropic" in the title string within the 10-Basic-LLM-Gateway.yaml tutorial
header so the displayed tutorial name correctly reads "Basic LLM Gateway
(Anthropic Claude)".

Comment on lines +1 to +24
@echo off
setlocal EnableExtensions

set "SCRIPT_DIR=%~dp0"
if "%SCRIPT_DIR:~-1%"=="\" set "SCRIPT_DIR=%SCRIPT_DIR:~0,-1%"

set "dir=%SCRIPT_DIR%"

:search_up
if exist "%dir%\LICENSE.txt" if exist "%dir%\scripts\run-membrane.cmd" goto found
for %%A in ("%dir%\..") do set "next=%%~fA"
if /I "%next%"=="%dir%" goto notfound
set "dir=%next%"
goto search_up

:found
set "MEMBRANE_HOME=%dir%"
set "MEMBRANE_CALLER_DIR=%SCRIPT_DIR%"
call "%MEMBRANE_HOME%\scripts\run-membrane.cmd" %*
exit /b %ERRORLEVEL%

:notfound
>&2 echo Could not locate Membrane root. Ensure directory structure is correct.
exit /b 1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify line endings for the tutorial batch script (read-only)
python - <<'PY'
from pathlib import Path
p = Path("distribution/tutorials/ai/llm-gateway/openai/membrane.cmd")
b = p.read_bytes()
crlf = b.count(b"\r\n")
lf = b.count(b"\n")
print(f"{p}: CRLF={crlf}, LF={lf}, contains_crlf={crlf>0}")
PY

Repository: membrane/api-gateway

Length of output: 158


Convert distribution/tutorials/ai/llm-gateway/openai/membrane.cmd to CRLF line endings.

membrane.cmd is LF-only (no \r\n), which can break/behave inconsistently on Windows batch execution. Convert the file to CRLF before merge.

🧰 Tools
🪛 Blinter (1.0.112)

[error] 1-1: Unix line endings detected. Explanation: Batch file uses Unix line endings (LF-only) which can cause GOTO/CALL label parsing failures and script malfunction due to Windows batch parser 512-byte boundary bugs. Recommendation: Convert file to Windows line endings (CRLF). Use tools like dos2unix, notepad++, or configure git with 'git config core.autocrlf true'. Context: File uses Unix line endings (LF-only) - 24 LF sequences found

(E018)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/openai/membrane.cmd` around lines 1 -
24, The batch script membrane.cmd currently uses LF-only line endings which can
fail on Windows; open the membrane.cmd file (look for labels like :search_up,
:found, :notfound and calls to scripts\run-membrane.cmd) and convert/save the
file with CRLF (\r\n) line endings so Windows cmd.exe interprets it correctly
before merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants