feat: Inline buffer for captured replies#7648
Conversation
dc6aa34 to
54d1225
Compare
| // Stores the full undivided command. | ||
| CmdArgList full_args_; | ||
| absl::InlinedVector<std::string_view, 4> full_args_; | ||
|
|
There was a problem hiding this comment.
This is a big tradeoff - transaction still uses ArgSlice instead of ParsedArgs, so it needs an contiguous array - I moved it here from CommandContext to save 24 bytes there (tail_args_backing_).
This might slow down single commant path
There was a problem hiding this comment.
I want to get rid of tail_args_backing_ as well. Lets do it right
| class BackedArguments { | ||
| constexpr static size_t kLenCap = 5; | ||
| constexpr static size_t kStorageCap = 88; | ||
| constexpr static size_t kStorageCap = 128; |
There was a problem hiding this comment.
we got this by removing two fields + growing to hit mi_good_size of 320
| // Some commands might include arguments in replies, so we have a limited set | ||
| if (dispatched.cmd_cntx && dispatched.cid->SupportsAsync()) | ||
| crb.ProvideInlineBuffer(dispatched.cmd_cntx->GetInlineBuffer()); | ||
| else | ||
| crb.ProvideInlineBuffer({}); // reset buffer | ||
|
|
There was a problem hiding this comment.
or add some flag to CommandId instead?
PR Summary by Qodofeat: Inline buffer for captured replies in multi-command squasher Description
Diagram
High-Level Assessment
Files changed (15)
|
🤖 Augment PR SummarySummary: This PR reduces allocations when capturing command replies by reusing inline storage from parsed command arguments, while also shrinking/moving memory within command-context structures. Changes:
Technical notes: Captured replies may now reference caller-provided buffers, so correctness depends on buffer lifetime and accurate queue-byte adjustments when command storage mutates. 🤖 Was this summary useful? React with 👍 or 👎 |
| // Inlined buffer where the command arguments are stored. | ||
| // Can be used as a local buffer to store captured replies | ||
| std::span<char> GetInlineBuffer() { | ||
| return {storage_.begin(), std::max(kStorageCap, storage_.size())}; |
There was a problem hiding this comment.
GetInlineBuffer() returns a span sized to max(kStorageCap, storage_.size()), which can exceed storage_.size(); writing into that span (as done by reply capture) writes past the vector's constructed elements and can violate container/object-lifetime rules (UB under sanitizers).
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| "Return rounded down integers instead of floats for lua scripts with RESP2"); | ||
| ABSL_FLAG(uint32_t, multi_eval_squash_buffer, 4096, "Max buffer for squashed commands per script"); | ||
| ABSL_FLAG(uint32_t, multi_eval_squash_buffer, 8096, "Max buffer for squashed commands per script"); | ||
|
|
There was a problem hiding this comment.
Code Review by Qodo
1. Queue bytes delta wraps
|
| void Connection::AdjustParsedCmdBytes(ssize_t delta) { | ||
| if (parsed_cmd_q_bytes_ == 0) | ||
| return; // command dispatched synchronously, not in pipeline queue | ||
| auto& conn_stats = tl_facade_stats->conn_stats; | ||
| DCHECK_GE(static_cast<ssize_t>(parsed_cmd_q_bytes_) + delta, 0); | ||
| DCHECK_GE(static_cast<ssize_t>(conn_stats.pipeline_queue_bytes) + delta, 0); | ||
| parsed_cmd_q_bytes_ += delta; | ||
| conn_stats.pipeline_queue_bytes += delta; | ||
| } |
There was a problem hiding this comment.
1. Queue bytes delta wraps 🐞 Bug ≡ Correctness
Connection::AdjustParsedCmdBytes applies a signed delta to size_t counters via "+=", so negative deltas will convert/wrap and corrupt pipeline queue byte accounting. The only caller also computes the delta as (after - before) using size_t, which underflows when memory shrinks and passes the wrong value downstream.
Agent Prompt
### Issue description
`Connection::AdjustParsedCmdBytes(ssize_t delta)` updates `size_t` counters with `+= delta`, which performs unsigned arithmetic and will wrap for negative deltas. Additionally, `StoreInMultiBlock` passes `after - before` where both are `size_t`, so when `after < before` it underflows before it ever reaches `AdjustParsedCmdBytes`.
### Issue Context
This breaks `pipeline_queue_bytes` / `parsed_cmd_q_bytes_` tracking and can cause backpressure decisions, throttling, and stats to be wrong.
### Fix Focus Areas
- src/facade/dragonfly_connection.cc[2790-2798]
- src/server/main_service.cc[887-895]
### Suggested fix
- In `StoreInMultiBlock`, compute the delta in signed space, e.g. `ssize_t delta = static_cast<ssize_t>(after) - static_cast<ssize_t>(before);`.
- In `AdjustParsedCmdBytes`, apply the delta using a signed intermediate and then cast back:
- `auto new_bytes = static_cast<ssize_t>(parsed_cmd_q_bytes_) + delta;` (DCHECK >= 0)
- `parsed_cmd_q_bytes_ = static_cast<size_t>(new_bytes);`
- same for `conn_stats.pipeline_queue_bytes`.
- Consider also DCHECKing that `delta` fits into the signed range you use (or accept `int64_t`).
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| void Transaction::InitBase(Namespace* ns, DbIndex dbid, CmdArgList args) { | ||
| global_ = false; | ||
| db_index_ = dbid; | ||
| full_args_ = args; | ||
| full_args_ = {args.begin(), args.end()}; | ||
| local_result_ = OpStatus::OK; |
There was a problem hiding this comment.
2. Full_args_ init-list bug 🐞 Bug ≡ Correctness
Transaction::InitBase assigns absl::InlinedVector<std::string_view, 4> full_args_ using
full_args_ = {args.begin(), args.end()};, which is braced list-initialization rather than an
iterator-range copy. This is very likely a compile error (iterators aren’t string_view) and, if it
compiled, would not copy the full argument list needed for key detection and journaling.
Agent Prompt
### Issue description
`full_args_` was changed from `CmdArgList` (span) to `absl::InlinedVector<std::string_view, 4>`, but `InitBase` now assigns it with `{args.begin(), args.end()}` which is initializer-list syntax, not a range copy.
### Issue Context
`full_args_` is later used for key/shard derivation and for building journal entry payloads; it must contain *all* command arguments.
### Fix Focus Areas
- src/server/transaction.cc[176-190]
- src/server/transaction.h[608-612]
### Suggested fix
Replace the assignment with an iterator-range copy, e.g.:
- `full_args_.assign(args.begin(), args.end());`
(or `full_args_ = absl::InlinedVector<std::string_view, 4>(args.begin(), args.end());` if that constructor is preferred/available).
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| // Inlined buffer where the command arguments are stored. | ||
| // Can be used as a local buffer to store captured replies | ||
| std::span<char> GetInlineBuffer() { | ||
| return {storage_.begin(), std::max(kStorageCap, storage_.size())}; | ||
| } |
There was a problem hiding this comment.
3. Inline buffer overwrites args 🐞 Bug ≡ Correctness
BackedArguments::GetInlineBuffer exposes a span starting at storage_.begin(), but storage_ contains the serialized command argument bytes; CapturingReplyBuilder::SendBulkString writes reply bytes into this span and stores a string_view (BulkStringRef) into the payload. This can overwrite command arguments (and any string_views referencing them, e.g. transaction/journaling), and if the reply string_view aliases the same buffer region the memcpy can have overlap UB.
Agent Prompt
### Issue description
`GetInlineBuffer()` returns a span starting at the beginning of `storage_`, but `storage_` is the backing store for command arguments. Using it as a reply scratch buffer overwrites argument bytes and can invalidate any downstream `std::string_view` references to args.
### Issue Context
`MultiCommandSquasher` passes this buffer into `CapturingReplyBuilder`, and `CapturingReplyBuilder::SendBulkString` writes into the buffer and captures a `BulkStringRef` view into it.
### Fix Focus Areas
- src/common/backed_args.h[133-137]
- src/server/multi_command_squasher.cc[242-248]
- src/facade/reply_capture.cc[53-61]
### Suggested fix
- Change `GetInlineBuffer()` to return only the **unused tail** of the backing storage (spare capacity), not the bytes that currently contain arguments. For example:
- `auto* base = storage_.data();`
- `size_t used = storage_.size();`
- `size_t cap = storage_.capacity();`
- `return std::span<char>(base + used, cap - used);`
- With this change, reply writes won’t overwrite args and also avoid any overlap with `str.data()` when `str` originates from the args buffer.
- If any remaining aliasing is still possible, switch from `memcpy` to `memmove` or guard against overlap explicitly.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
84938d9 to
89949e0
Compare
89949e0 to
9d75678
Compare
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io> fixes Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io> perf improvements + sneaky buffer Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io> fix? raise per script squashing level some basic limitations Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
9d75678 to
5e79dd6
Compare
Results: +8% in best case