[Feature]: Enhanced parameter control for sidecar llama-server deployment to support lorebook semantic search. #2863

Description

@hendrix2222

What problem does this solve?

Out of the box, you cannot use the Gemma 4 sidecar for lorebook embedding/vectorization. Two issues arise:

  1. The default sidecar pooling type is 'none':
     "type":"Error","message":"Local sidecar embedding request failed (400): Pooling type 'none' is not OAI compatible. Please use a different pooling type"
  2. The default sidecar batch size is 512:
     E srv    send_error: task id = 5, error: input (660 tokens) is too large to process. increase the physical batch size (current batch size: 512)

The second error occurs when an entry being vectorized tokenizes to more than the physical batch size (660 tokens against a limit of 512 in the log above).

Both issues can be resolved by adjusting the command-line parameters passed to the llama-server instance backing the sidecar model:

--pooling mean --ubatch-size 1024

However, these parameters are currently passed as hardcoded values in the sidecar launch plan:
args.push("--embeddings", "--pooling", "none");

Proposed solution

Provide additional configuration items that override the default values above when starting the sidecar llama-server.
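
As a rough illustration of what such overrides might look like, here is a minimal sketch. The names `SidecarEmbeddingOptions` and `buildEmbeddingArgs` are hypothetical and not taken from the project's source; only the `args.push("--embeddings", "--pooling", "none")` default comes from the existing launch plan. The physical batch size is written with llama-server's long-form flag `--ubatch-size` (short form `-ub`), and the sketch assumes the flag should be omitted entirely when no override is configured.

```typescript
// Hypothetical sketch: these names are illustrative, not the project's actual API.
// Pooling values accepted by llama-server's --pooling flag.
type PoolingType = "none" | "mean" | "cls" | "last" | "rank";

interface SidecarEmbeddingOptions {
  // Overrides the currently hardcoded "none" pooling type.
  pooling?: PoolingType;
  // Overrides llama-server's physical batch size (its default is 512).
  ubatchSize?: number;
}

function buildEmbeddingArgs(options: SidecarEmbeddingOptions = {}): string[] {
  // Fall back to today's behaviour when no override is configured.
  const pooling = options.pooling ?? "none";
  const args: string[] = ["--embeddings", "--pooling", pooling];

  if (options.ubatchSize !== undefined) {
    if (!Number.isInteger(options.ubatchSize) || options.ubatchSize <= 0) {
      throw new RangeError("ubatchSize must be a positive integer");
    }
    // Only pass the flag when the user asked for a non-default value.
    args.push("--ubatch-size", String(options.ubatchSize));
  }
  return args;
}

// Usage: the overrides suggested in this issue.
const overridden = buildEmbeddingArgs({ pooling: "mean", ubatchSize: 1024 });
console.log(overridden.join(" "));
```

Keeping the defaults unchanged when no options are supplied would let existing sidecar setups behave exactly as before, while lorebook vectorization could opt in to `mean` pooling and a larger batch.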

Alternatives considered

Provide the ability to select these overrides in the sidecar configuration window.

Additional context

The errors above can be reproduced as follows:

  1. Create a lorebook with any number of entries.
  2. Use the lorebook configuration window to enable semantic search and select the sidecar in the available model drop down.
  3. Click the button to vectorize the entries in the lorebook.

Template check

Please uncheck (untick) the box below before submitting so we know you read the template. It is intentionally pre-checked:

  • I DID NOT read this template and provide the requested details.
