feat: Inline buffer for captured replies#7648

Open
dranikpg wants to merge 2 commits into dragonflydb:main from dranikpg:opt-command-context

Conversation

@dranikpg

@dranikpg dranikpg commented Jun 18, 2026

Contributor

Results: +8% in best case

image
  1. Reduce the size of CommandContext to free up more space
  2. Allow using storage_ from BackedArguments to hold captured reply strings, which reduces allocations

@dranikpg dranikpg force-pushed the opt-command-context branch from dc6aa34 to 54d1225 Compare June 19, 2026 08:47
@dranikpg dranikpg changed the title fix: Optimize CommandContext size feat: Inline buffer for captured replies Jun 19, 2026
Comment thread src/server/transaction.h Outdated
Comment on lines 610 to 612
   // Stores the full undivided command.
-  CmdArgList full_args_;
+  absl::InlinedVector<std::string_view, 4> full_args_;

Contributor Author

This is a big tradeoff: transaction still uses ArgSlice instead of ParsedArgs, so it needs a contiguous array. I moved it here from CommandContext to save 24 bytes there (tail_args_backing_).

This might slow down the single-command path.

Collaborator

So let's solve it first.

Collaborator

I want to get rid of tail_args_backing_ as well. Let's do it right.

Comment thread src/common/backed_args.h
 class BackedArguments {
   constexpr static size_t kLenCap = 5;
-  constexpr static size_t kStorageCap = 88;
+  constexpr static size_t kStorageCap = 128;

Contributor Author

We got this extra space by removing two fields and growing to hit the mi_good_size of 320.

Comment on lines +243 to +248
// Some commands might include arguments in replies, so we have a limited set
if (dispatched.cmd_cntx && dispatched.cid->SupportsAsync())
crb.ProvideInlineBuffer(dispatched.cmd_cntx->GetInlineBuffer());
else
crb.ProvideInlineBuffer({}); // reset buffer

Contributor Author

Or add some flag to CommandId instead?

@dranikpg dranikpg requested a review from romange June 19, 2026 16:38
@dranikpg dranikpg marked this pull request as ready for review June 19, 2026 16:38
@qodo-free-for-open-source-projects

PR Summary by Qodo

feat: Inline buffer for captured replies in multi-command squasher
✨ Enhancement 🐞 Bug fix 🕐 40+ Minutes

Description

• Adds BulkStringRef payload type and GetInlineBuffer()/ProvideInlineBuffer() APIs so
 CapturingReplyBuilder can store short bulk-string replies directly into BackedArguments' inlined
 storage, avoiding heap allocations during MULTI/EXEC squashing.
• Reduces CommandContext size to fit the 320-byte mi_good_size boundary by removing the
 arg_slice_backing field and the cached enqueued_bytes_ field from ParsedCommand, replacing
 them with stack-local storage.
• Fixes pipeline-queue byte accounting when StoreInMultiBlock calls SwapArgs (which mutates heap
 memory of a queued ParsedCommand) by introducing Connection::AdjustParsedCmdBytes.
• Changes Transaction::full_args_ from a raw CmdArgList span to an absl::InlinedVector so it
 owns its own copy of argument views, preventing dangling references after SwapArgs.
• Raises the multi_eval_squash_buffer default from 4096 to 8096 bytes to accommodate the larger
 inline buffer.
Diagram

graph TD
    A["BackedArguments\n(kStorageCap=128)"] -->|GetInlineBuffer| B["CapturingReplyBuilder"]
    B -->|ProvideInlineBuffer| C["MultiCommandSquasher\nSquashedHopCb"]
    B -->|BulkStringRef payload| D["reply_payload.h\nPayload variant"]
    D -->|CaptureVisitor| E["reply_capture.cc"]
    D -->|CaptureVisitor| F["http_api.cc"]
    G["CommandContext\n(320 bytes)"] -->|inherits| A
    G -->|GetInlineBuffer| C
    H["Connection"] -->|AdjustParsedCmdBytes| I["Pipeline Queue\nByte Accounting"]
    J["main_service.cc\nStoreInMultiBlock"] -->|delta| H
    K["Transaction\nfull_args_ InlinedVector"] -->|owns copy| G

    subgraph Legend
      direction LR
      _mod["Module/File"] ~~~ _data[("Data Store")]
    end
High-Level Assessment

The following are alternative approaches to this PR:

1. Per-shard reply arena in MultiCommandSquasher
  • ➕ Cleaner separation of concerns — reply storage is not mixed with argument storage
  • ➕ Easier to reason about lifetime (arena lives exactly as long as the squash hop)
  • ➖ Requires allocating a new arena per squash hop, adding overhead
  • ➖ Does not reduce CommandContext size (the primary goal of this PR)
2. Pool allocator for BulkString in CapturingReplyBuilder
  • ➕ More general — benefits all capturing paths, not just squashing
  • ➖ More complex to implement correctly with thread safety
  • ➖ Does not address the CommandContext size reduction goal

Recommendation: The PR's approach of reusing the command's own inlined storage as a reply buffer is optimal for the squashing hot path: it avoids heap allocation without adding new memory. The main alternative considered above (a separate per-shard arena) would be cleaner but adds complexity and memory overhead.

Files changed (15) +85 / -45

Enhancement (6) +45 / -9
backed_args.h: Expand inline storage cap to 128 bytes and expose GetInlineBuffer() +8/-4

• Increases 'kStorageCap' from 88 to 128 bytes to align with the new 'CommandContext' size budget. Adds 'GetInlineBuffer()' which returns a span over the inlined storage region, allowing callers to reuse it as a scratch buffer for captured reply strings. Removes the now-outdated 'static_assert(sizeof(BackedArguments) == 128)'.

src/common/backed_args.h

dragonfly_connection.h: Declare AdjustParsedCmdBytes() on Connection +5/-0

• Adds the public 'AdjustParsedCmdBytes(ssize_t delta)' method declaration with a doc comment explaining its purpose.

src/facade/dragonfly_connection.h

reply_capture.cc: Use inline buffer in SendBulkString to avoid heap allocation +11/-1

• When an inline buffer is provided and the string fits (12–buffer_size bytes), 'SendBulkString' copies the data into the inline buffer and captures a 'BulkStringRef' (non-owning view) instead of allocating a 'BulkString'. The buffer span is advanced after each use. Adds a 'CaptureVisitor' overload for 'BulkStringRef'.

src/facade/reply_capture.cc
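
The capture path described in this entry can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `CapturingBuilderSketch`, `Buffer` (a stand-in for `std::span<char>`) and `captured()` are hypothetical names, and the lower size bound mentioned in the summary is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>
#include <variant>
#include <vector>

// Stand-ins for the payload types named in the PR.
struct BulkString : std::string {};          // owning copy (heap for long strings)
struct BulkStringRef : std::string_view {};  // non-owning view into a caller buffer
using Payload = std::variant<BulkString, BulkStringRef>;

// Hypothetical stand-in for std::span<char>.
struct Buffer {
  char* data = nullptr;
  size_t size = 0;
};

class CapturingBuilderSketch {
 public:
  // The caller supplies scratch space; an empty Buffer disables the fast path.
  void ProvideInlineBuffer(Buffer buf) { inline_buffer_ = buf; }

  void SendBulkString(std::string_view str) {
    if (!str.empty() && str.size() <= inline_buffer_.size) {
      // Copy into the scratch space and capture a view instead of a string.
      std::memcpy(inline_buffer_.data, str.data(), str.size());
      captured_.emplace_back(
          BulkStringRef{std::string_view(inline_buffer_.data, str.size())});
      // Advance past the consumed bytes so the next reply does not overlap.
      inline_buffer_.data += str.size();
      inline_buffer_.size -= str.size();
    } else {
      captured_.emplace_back(BulkString{std::string(str)});  // fallback: own a copy
    }
  }

  const std::vector<Payload>& captured() const { return captured_; }

 private:
  Buffer inline_buffer_;
  std::vector<Payload> captured_;
};
```

The captured `BulkStringRef` is only valid while the provided buffer stays alive and unmodified, which is the lifetime concern raised elsewhere in this review.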

reply_capture.h: Add ProvideInlineBuffer() and inline_buffer_ member to CapturingReplyBuilder +8/-0

• Adds 'ProvideInlineBuffer(std::span<char>)' to let callers supply a pre-allocated buffer. Stores it as 'inline_buffer_' which 'SendBulkString' consumes to avoid per-reply heap allocations.

src/facade/reply_capture.h

reply_payload.h: Add BulkStringRef payload variant for zero-copy bulk string capture +6/-4

• Introduces 'BulkStringRef' (a 'std::string_view' subtype) as a new 'Payload' variant representing a bulk string whose lifetime is guaranteed externally. Adds it to the 'Payload' variant type.

src/facade/reply_payload.h

multi_command_squasher.cc: Provide inline buffer to CapturingReplyBuilder during squashed dispatch +7/-0

• For async-capable commands with a 'cmd_cntx', passes 'GetInlineBuffer()' to 'crb.ProvideInlineBuffer()' so captured bulk-string replies can be stored in the command's own inlined storage. Resets the buffer for non-async commands.

src/server/multi_command_squasher.cc

Bug fix (5) +32 / -17
dragonfly_connection.cc: Replace EnqueuedBytes() with UsedMemory() and add AdjustParsedCmdBytes() +12/-5

• Switches 'EnqueueParsedCommand' and 'ReleaseParsedCommand' to use 'UsedMemory()' directly instead of the now-removed cached 'enqueued_bytes_' snapshot. Adds 'AdjustParsedCmdBytes(ssize_t delta)' to correct pipeline-queue byte counters when a queued command's backing storage is mutated in-place (e.g. during MULTI/EXEC collection via 'SwapArgs').

src/facade/dragonfly_connection.cc

http_api.cc: Handle BulkStringRef in HTTP CaptureVisitor +5/-0

• Adds a 'BulkStringRef' overload to the HTTP 'CaptureVisitor' so JSON serialization handles the new payload variant correctly.

src/server/http_api.cc

main_service.cc: Fix pipeline accounting on SwapArgs and simplify tail_args lifetime +13/-10

• Measures 'UsedMemory()' before and after 'StoredCmd' construction (which calls 'SwapArgs') and calls 'AdjustParsedCmdBytes' to keep pipeline-queue byte counters accurate. Simplifies 'DispatchCommand' by using a stack-local 'tail_args_backing' instead of storing it in 'CommandContext'.

src/server/main_service.cc

transaction.cc: Copy args into full_args_ InlinedVector instead of storing a span +1/-1

• Changes 'full_args_ = args' (shallow span assignment) to 'full_args_ = {args.begin(), args.end()}' (deep copy into the 'InlinedVector'), preventing dangling references after 'SwapArgs' moves storage out of the originating 'ParsedCommand'.

src/server/transaction.cc

transaction.h: Change full_args_ from CmdArgList span to InlinedVector<string_view> +1/-1

• Replaces 'CmdArgList full_args_' (a non-owning span) with 'absl::InlinedVector<std::string_view, 4> full_args_' so the transaction owns its argument views independently of the originating 'ParsedCommand''s storage.

src/server/transaction.h

Refactor (4) +8 / -19
parsed_command.h: Remove cached enqueued_bytes_ field and simplify FinalizeParsing() +0/-11

• Removes the 'enqueued_bytes_' member and 'EnqueuedBytes()' accessor that cached the memory size at enqueue time. 'FinalizeParsing()' now only records the parse cycle timestamp. Removes the Linux-specific 'static_assert' on 'sizeof(ParsedCommand)'.

src/facade/parsed_command.h

command_registry.h: Mark args parameter unused in SetAsyncHandler lambda +3/-1

• Annotates the 'args' parameter in the 'SetAsyncHandler' lambda as '/* unused */' since async handlers receive arguments via 'MakeParserFromContext' instead.

src/server/command_registry.h

conn_context.cc: Remove arg_slice_backing.clear() from ReuseInternal() +0/-1

• Drops the 'arg_slice_backing.clear()' call since the field has been removed from 'CommandContext'.

src/server/conn_context.cc

conn_context.h: Remove arg_slice_backing field and add CommandContext size assertion +5/-6

• Removes the 'arg_slice_backing' ('CmdArgVec') field that backed tail-arg slices for deferred commands. Updates the comment on 'tail_args_'. Adds 'static_assert(sizeof(CommandContext) == 320)' targeting the 320-byte 'mi_good_size' boundary.

src/server/conn_context.h
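
A size assertion of this kind pins a struct to an allocator size class at compile time. The sketch below only illustrates the pattern: `CommandContextSketch` is hypothetical, and the 192/128 split is an arbitrary assumption chosen so the total is 320 bytes. It does not reflect the real member layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: only the 128-byte inline storage and the 320-byte
// total come from the PR; the 192-byte placeholder is an assumption.
struct CommandContextSketch {
  char other_fields[192];    // placeholder for all remaining members
  char inline_storage[128];  // mirrors kStorageCap = 128
};

// Fails to compile if a later change grows or shrinks the struct, so any
// move away from the targeted size class is noticed immediately.
static_assert(sizeof(CommandContextSketch) == 320,
              "CommandContextSketch must stay at the 320-byte size target");
```

Because both members are plain `char` arrays, the compiler inserts no padding and the assertion holds on any platform. A real struct with mixed member types would need the assertion re-checked per target.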

@augmentcode

augmentcode Bot commented Jun 19, 2026

🤖 Augment PR Summary

Summary: This PR reduces allocations when capturing command replies by reusing inline storage from parsed command arguments, while also shrinking/moving memory within command-context structures.

Changes:

  • Increase BackedArguments inline storage and expose it via GetInlineBuffer() for temporary reply-string storage.
  • Extend reply payloads with BulkStringRef (a non-owning string_view) and teach reply visitors (Redis + HTTP) to replay it.
  • Update pipeline queue accounting to use UsedMemory() and add Connection::AdjustParsedCmdBytes() for cases where backing storage is moved (e.g. MULTI/EXEC collection).
  • Shrink CommandContext by removing the tail-args backing vector and enforcing a new size target (320 bytes).

Technical notes: Captured replies may now reference caller-provided buffers, so correctness depends on buffer lifetime and accurate queue-byte adjustments when command storage mutates.

@augmentcode augmentcode Bot left a comment

Review completed. 4 suggestions posted.

Comment thread src/facade/dragonfly_connection.cc
Comment thread src/server/main_service.cc Outdated
Comment thread src/common/backed_args.h
// Inlined buffer where the command arguments are stored.
// Can be used as a local buffer to store captured replies
std::span<char> GetInlineBuffer() {
return {storage_.begin(), std::max(kStorageCap, storage_.size())};

@augmentcode augmentcode Bot Jun 19, 2026

GetInlineBuffer() returns a span sized to max(kStorageCap, storage_.size()), which can exceed storage_.size(); writing into that span (as done by reply capture) writes past the vector's constructed elements and can violate container/object-lifetime rules (UB under sanitizers).

Severity: medium


               "Return rounded down integers instead of floats for lua scripts with RESP2");
-ABSL_FLAG(uint32_t, multi_eval_squash_buffer, 4096, "Max buffer for squashed commands per script");
+ABSL_FLAG(uint32_t, multi_eval_squash_buffer, 8096, "Max buffer for squashed commands per script");

@augmentcode augmentcode Bot Jun 19, 2026

The new default multi_eval_squash_buffer value 8096 looks unusual (previously 4096); if the intent was to double to an 8KiB boundary, this might be an accidental typo.

Severity: low


@qodo-free-for-open-source-projects

Code Review by Qodo

🐞 Bugs (4) 📘 Rule violations (0) 📜 Skill insights (0)

Action required

1. Queue bytes delta wraps 🐞 Bug ≡ Correctness
Description
Connection::AdjustParsedCmdBytes applies a signed delta to size_t counters via "+=", so negative
deltas will convert/wrap and corrupt pipeline queue byte accounting. The only caller also computes
the delta as (after - before) using size_t, which underflows when memory shrinks and passes the
wrong value downstream.
Code

src/facade/dragonfly_connection.cc[R2790-2798]

+void Connection::AdjustParsedCmdBytes(ssize_t delta) {
+  if (parsed_cmd_q_bytes_ == 0)
+    return;  // command dispatched synchronously, not in pipeline queue
+  auto& conn_stats = tl_facade_stats->conn_stats;
+  DCHECK_GE(static_cast<ssize_t>(parsed_cmd_q_bytes_) + delta, 0);
+  DCHECK_GE(static_cast<ssize_t>(conn_stats.pipeline_queue_bytes) + delta, 0);
+  parsed_cmd_q_bytes_ += delta;
+  conn_stats.pipeline_queue_bytes += delta;
+}
Evidence
The function takes ssize_t delta but adds it directly into size_t parsed_cmd_q_bytes_ and
size_t pipeline_queue_bytes, which will wrap for negative deltas due to unsigned arithmetic. The
caller passes after - before where both are size_t, so a shrink produces an underflowed large
positive delta.

src/facade/dragonfly_connection.cc[2790-2798]
src/facade/dragonfly_connection.h[574-579]
src/facade/facade_stats.h[15-25]
src/server/main_service.cc[881-895]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`Connection::AdjustParsedCmdBytes(ssize_t delta)` updates `size_t` counters with `+= delta`, which performs unsigned arithmetic and will wrap for negative deltas. Additionally, `StoreInMultiBlock` passes `after - before` where both are `size_t`, so when `after < before` it underflows before it ever reaches `AdjustParsedCmdBytes`.

### Issue Context
This breaks `pipeline_queue_bytes` / `parsed_cmd_q_bytes_` tracking and can cause backpressure decisions, throttling, and stats to be wrong.

### Fix Focus Areas
- src/facade/dragonfly_connection.cc[2790-2798]
- src/server/main_service.cc[887-895]

### Suggested fix
- In `StoreInMultiBlock`, compute the delta in signed space, e.g. `ssize_t delta = static_cast<ssize_t>(after) - static_cast<ssize_t>(before);`.
- In `AdjustParsedCmdBytes`, apply the delta using a signed intermediate and then cast back:
 - `auto new_bytes = static_cast<ssize_t>(parsed_cmd_q_bytes_) + delta;` (DCHECK >= 0)
 - `parsed_cmd_q_bytes_ = static_cast<size_t>(new_bytes);`
 - same for `conn_stats.pipeline_queue_bytes`.
- Consider also DCHECKing that `delta` fits into the signed range you use (or accept `int64_t`).
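
The suggested fix can be sketched in isolation as below. `SignedDelta` and `ApplyDelta` are hypothetical helper names, `int64_t` is used in place of `ssize_t` for portability, and `assert` stands in for `DCHECK_GE`. This is an illustration of the signed-intermediate pattern, not the PR's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Signed change in bytes between two unsigned measurements.
// Casting before subtracting keeps a shrink (after < before) negative.
inline int64_t SignedDelta(size_t before, size_t after) {
  return static_cast<int64_t>(after) - static_cast<int64_t>(before);
}

// Applies a signed delta to an unsigned counter through a signed
// intermediate, so the "never below zero" invariant can be checked
// before the result is stored back.
inline void ApplyDelta(size_t& counter, int64_t delta) {
  int64_t updated = static_cast<int64_t>(counter) + delta;
  assert(updated >= 0);  // stand-in for DCHECK_GE(updated, 0)
  counter = static_cast<size_t>(updated);
}
```

A caller would measure memory before and after the mutation, compute `SignedDelta(before, after)`, and pass the result to `ApplyDelta` for each tracked counter.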



2. full_args_ init-list bug 🐞 Bug ≡ Correctness
Description
Transaction::InitBase assigns absl::InlinedVector<std::string_view, 4> full_args_ using
full_args_ = {args.begin(), args.end()};, which is braced list-initialization rather than an
iterator-range copy. This is very likely a compile error (iterators aren’t string_view) and, if it
compiled, would not copy the full argument list needed for key detection and journaling.
Code

src/server/transaction.cc[R176-180]

void Transaction::InitBase(Namespace* ns, DbIndex dbid, CmdArgList args) {
  global_ = false;
  db_index_ = dbid;
-  full_args_ = args;
+  full_args_ = {args.begin(), args.end()};
  local_result_ = OpStatus::OK;
Evidence
CmdArgList is an absl::Span<const std::string_view> whose iterators are pointers; assigning
{args.begin(), args.end()} attempts list-initialization, not copying the range of args, and
full_args_ is later consumed for journaling.

src/server/transaction.h[601-612]
src/common/arg_range.h[15-18]
src/server/transaction.cc[176-181]
src/server/transaction.cc[1566-1573]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`full_args_` was changed from `CmdArgList` (span) to `absl::InlinedVector<std::string_view, 4>`, but `InitBase` now assigns it with `{args.begin(), args.end()}` which is initializer-list syntax, not a range copy.

### Issue Context
`full_args_` is later used for key/shard derivation and for building journal entry payloads; it must contain *all* command arguments.

### Fix Focus Areas
- src/server/transaction.cc[176-190]
- src/server/transaction.h[608-612]

### Suggested fix
Replace the assignment with an iterator-range copy, e.g.:
- `full_args_.assign(args.begin(), args.end());`

(or `full_args_ = absl::InlinedVector<std::string_view, 4>(args.begin(), args.end());` if that constructor is preferred/available).
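
A minimal sketch of the explicit range copy, using `std::vector` as a stand-in for `absl::InlinedVector<std::string_view, 4>` and a pointer pair as a stand-in for `CmdArgList` (both substitutions are assumptions made to keep the example self-contained; `CopyArgs` is a hypothetical helper):

```cpp
#include <cassert>
#include <string_view>
#include <vector>

// Stand-in for absl::InlinedVector<std::string_view, 4>.
using ArgViews = std::vector<std::string_view>;

// Copies every view in [begin, end). assign() states the range copy
// explicitly, independent of how a braced assignment would be resolved.
inline ArgViews CopyArgs(const std::string_view* begin,
                         const std::string_view* end) {
  ArgViews full_args;
  full_args.assign(begin, end);
  return full_args;
}
```

Note that this copies the views, not the bytes they point at, so the copied vector is still only valid while the underlying argument storage is.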



3. Inline buffer overwrites args 🐞 Bug ≡ Correctness
Description
BackedArguments::GetInlineBuffer exposes a span starting at storage_.begin(), but storage_
contains the serialized command argument bytes; CapturingReplyBuilder::SendBulkString writes reply
bytes into this span and stores a string_view (BulkStringRef) into the payload. This can overwrite
command arguments (and any string_views referencing them, e.g. transaction/journaling), and if the
reply string_view aliases the same buffer region the memcpy can have overlap UB.
Code

src/common/backed_args.h[R133-137]

+  // Inlined buffer where the command arguments are stored.
+  // Can be used as a local buffer to store captured replies
+  std::span<char> GetInlineBuffer() {
+    return {storage_.begin(), std::max(kStorageCap, storage_.size())};
+  }
Evidence
storage_ is used to store argument bytes (Assign resizes and memcpy’s args into
storage_.data()), yet GetInlineBuffer() returns a span starting at the same beginning pointer.
That span is then written to by SendBulkString via memcpy and referenced by BulkStringRef, so
args and any views into them can be corrupted.

src/common/backed_args.h[146-168]
src/common/backed_args.h[133-137]
src/server/multi_command_squasher.cc[240-248]
src/facade/reply_capture.cc[53-61]
src/server/transaction.cc[1566-1573]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`GetInlineBuffer()` returns a span starting at the beginning of `storage_`, but `storage_` is the backing store for command arguments. Using it as a reply scratch buffer overwrites argument bytes and can invalidate any downstream `std::string_view` references to args.

### Issue Context
`MultiCommandSquasher` passes this buffer into `CapturingReplyBuilder`, and `CapturingReplyBuilder::SendBulkString` writes into the buffer and captures a `BulkStringRef` view into it.

### Fix Focus Areas
- src/common/backed_args.h[133-137]
- src/server/multi_command_squasher.cc[242-248]
- src/facade/reply_capture.cc[53-61]

### Suggested fix
- Change `GetInlineBuffer()` to return only the **unused tail** of the backing storage (spare capacity), not the bytes that currently contain arguments. For example:
 - `auto* base = storage_.data();`
 - `size_t used = storage_.size();`
 - `size_t cap = storage_.capacity();`
 - `return std::span<char>(base + used, cap - used);`
- With this change, reply writes won’t overwrite args and also avoid any overlap with `str.data()` when `str` originates from the args buffer.
- If any remaining aliasing is still possible, switch from `memcpy` to `memmove` or guard against overlap explicitly.
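
The "unused tail" idea can be sketched as below. To sidestep questions about writing past a vector's size, this sketch uses a fixed `std::array` with an explicit `used_` counter instead of the real `storage_` container; `BackedArgsSketch`, `Buffer` (a stand-in for `std::span<char>`) and `Args()` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical stand-in for std::span<char>.
struct Buffer {
  char* data;
  size_t size;
};

class BackedArgsSketch {
 public:
  static constexpr size_t kStorageCap = 128;

  // Copies the argument bytes to the front of the inline storage.
  // Returns false if they do not fit (the real class would spill to the heap).
  bool Assign(std::string_view arg_bytes) {
    if (arg_bytes.size() > kStorageCap) return false;
    std::memcpy(storage_.data(), arg_bytes.data(), arg_bytes.size());
    used_ = arg_bytes.size();
    return true;
  }

  std::string_view Args() const { return {storage_.data(), used_}; }

  // Exposes only the bytes after the stored arguments, so writes through
  // the returned buffer cannot touch argument data.
  Buffer GetInlineBuffer() {
    return {storage_.data() + used_, kStorageCap - used_};
  }

 private:
  std::array<char, kStorageCap> storage_{};
  size_t used_ = 0;
};
```

With this shape, a reply written into the returned buffer leaves `Args()` intact, at the cost of a smaller scratch area when the arguments are long.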




Remediation recommended

4. BulkStringRef size uncounted 🐞 Bug ☼ Reliability
Description
MultiCommandSquasher::Size() does not account for payload::BulkStringRef, so
squashing_current_reply_size underestimates captured reply sizes when BulkStringRef is used. This
weakens FLAGS_squashed_reply_size_limit enforcement in Connection::IsReplySizeOverLimit() and
can allow additional squashing when it should fall back to single-command dispatch.
Code

src/facade/reply_payload.h[R25-31]

+struct SimpleString : public std::string {};        // SendSimpleString
+struct BulkString : public std::string {};          // SendBulkString
+struct BulkStringRef : public std::string_view {};  // SendBulkString with guaranteed lifetime

-using Payload = std::variant<std::monostate, Null, Error, long, double, SimpleString, BulkString,
-                             cmn::BorrowedString, std::unique_ptr<CollectionPayload>>;
+using Payload =
+    std::variant<std::monostate, Null, Error, long, double, SimpleString, BulkString, BulkStringRef,
+                 cmn::BorrowedString, std::unique_ptr<CollectionPayload>>;
Evidence
The payload variant now includes BulkStringRef, but the Size visitor in MultiCommandSquasher
only counts BulkString/SimpleString and returns 0 for other types; the resulting counter is used
by Connection::IsReplySizeOverLimit() to decide whether to squash.

src/facade/reply_payload.h[25-31]
src/server/multi_command_squasher.cc[39-59]
src/facade/dragonfly_connection.cc[2485-2499]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
A new payload alternative `payload::BulkStringRef` was added, but `MultiCommandSquasher::Size()` doesn’t include it, so reply-size throttling undercounts replies captured via the inline buffer.

### Issue Context
`squashing_current_reply_size` is used to decide whether to squash more pipelined commands or fall back to single-command dispatch.

### Fix Focus Areas
- src/facade/reply_payload.h[24-31]
- src/server/multi_command_squasher.cc[39-59]

### Suggested fix
Add a `visit` case in `MultiCommandSquasher::Size()`:
- `[](const payload::BulkStringRef& data) { return data.size(); },`

This should be placed alongside the existing `BulkString`/`SimpleString` cases.
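
As a rough illustration of the suggested visitor case, the sketch below uses a reduced payload variant and a hypothetical `PayloadSize` function; the real `MultiCommandSquasher::Size()` and the full `Payload` type contain more alternatives than shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <variant>

// Reduced stand-ins for the payload types in reply_payload.h.
struct SimpleString : std::string {};
struct BulkString : std::string {};
struct BulkStringRef : std::string_view {};
using Payload =
    std::variant<std::monostate, long, SimpleString, BulkString, BulkStringRef>;

// Standard C++17 helper for building a visitor from lambdas.
template <class... Ts>
struct Overloaded : Ts... {
  using Ts::operator()...;
};
template <class... Ts>
Overloaded(Ts...) -> Overloaded<Ts...>;

// Counts the bytes of string-like payloads; everything else counts as 0.
inline size_t PayloadSize(const Payload& p) {
  return std::visit(
      Overloaded{
          [](const SimpleString& s) -> size_t { return s.size(); },
          [](const BulkString& s) -> size_t { return s.size(); },
          // The new case: without it, BulkStringRef falls through to 0.
          [](const BulkStringRef& s) -> size_t { return s.size(); },
          [](const auto&) -> size_t { return 0; },
      },
      p);
}
```

The generic fallback lambda is what makes a missing case silent, which is why a newly added alternative is easy to overlook in visitors of this shape.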




Comment on lines +2790 to +2798
void Connection::AdjustParsedCmdBytes(ssize_t delta) {
if (parsed_cmd_q_bytes_ == 0)
return; // command dispatched synchronously, not in pipeline queue
auto& conn_stats = tl_facade_stats->conn_stats;
DCHECK_GE(static_cast<ssize_t>(parsed_cmd_q_bytes_) + delta, 0);
DCHECK_GE(static_cast<ssize_t>(conn_stats.pipeline_queue_bytes) + delta, 0);
parsed_cmd_q_bytes_ += delta;
conn_stats.pipeline_queue_bytes += delta;
}

Action required

1. Queue bytes delta wraps 🐞 Bug ≡ Correctness

Connection::AdjustParsedCmdBytes applies a signed delta to size_t counters via "+=", so negative
deltas will convert/wrap and corrupt pipeline queue byte accounting. The only caller also computes
the delta as (after - before) using size_t, which underflows when memory shrinks and passes the
wrong value downstream.

Comment thread src/server/transaction.cc Outdated
Comment on lines 176 to 180
void Transaction::InitBase(Namespace* ns, DbIndex dbid, CmdArgList args) {
global_ = false;
db_index_ = dbid;
-  full_args_ = args;
+  full_args_ = {args.begin(), args.end()};
local_result_ = OpStatus::OK;

Action required

2. full_args_ init-list bug 🐞 Bug ≡ Correctness

Transaction::InitBase assigns absl::InlinedVector<std::string_view, 4> full_args_ using
full_args_ = {args.begin(), args.end()};, which is braced list-initialization rather than an
iterator-range copy. This is very likely a compile error (iterators aren’t string_view) and, if it
compiled, would not copy the full argument list needed for key detection and journaling.

Comment thread src/common/backed_args.h
Comment on lines +133 to +137
// Inlined buffer where the command arguments are stored.
// Can be used as a local buffer to store captured replies
std::span<char> GetInlineBuffer() {
return {storage_.begin(), std::max(kStorageCap, storage_.size())};
}

Action required

3. Inline buffer overwrites args 🐞 Bug ≡ Correctness

BackedArguments::GetInlineBuffer exposes a span starting at storage_.begin(), but storage_
contains the serialized command argument bytes; CapturingReplyBuilder::SendBulkString writes reply
bytes into this span and stores a string_view (BulkStringRef) into the payload. This can overwrite
command arguments (and any string_views referencing them, e.g. transaction/journaling), and if the
reply string_view aliases the same buffer region the memcpy can have overlap UB.

@dranikpg dranikpg force-pushed the opt-command-context branch 2 times, most recently from 84938d9 to 89949e0 Compare June 30, 2026 16:48
@dranikpg dranikpg force-pushed the opt-command-context branch from 89949e0 to 9d75678 Compare July 1, 2026 17:16
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>

fixes

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>

perf improvements + sneaky buffer

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>

fix?

raise per script squashing level

some basic limitations

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
@dranikpg dranikpg force-pushed the opt-command-context branch from 9d75678 to 5e79dd6 Compare July 1, 2026 17:18