feat(spider-task-executor): Add executor binary with bincode wire protocol and integration tests.#325

Open
LinZhihao-723 wants to merge 3 commits into y-scope:main from LinZhihao-723:task-executor-impl

LinZhihao-723 (Member) commented May 14, 2026

Description

Wire protocol (spider-task-executor/src/protocol.rs)

A new protocol module on the spider-task-executor library defines the three wire types the execution manager will use to drive the child:

  • Request::Execute { tdl_context, raw_ctx, raw_inputs } and Request::Shutdown. The parent owns the hard timeout in its entirety; the executor has no notion of timeouts and the request carries no deadline.
  • Response::Result { outcome, elapsed_us } — exactly one per Execute. elapsed_us is the in-FFI wall-clock measured by the executor and is what the overhead instrument uses to separate executor-side cost from parent-side IPC.
  • ExecutorOutcome::Success { outputs } | Failure { error }: outputs is the wire-format TaskOutputsSerializer buffer, ready to forward to storage; error is the msgpack-encoded ExecutorError.

Stderr is not carried over the protocol; how the spawner disposes of the executor's stderr (inherit / pipe / log file) is a parent-side decision.
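
As a rough illustration of the shapes described above, the three wire types might look like the following. This is a sketch, not the contents of protocol.rs: every field type here (`Vec<u8>`, `u64`) is an assumption, `split_outcome` is a hypothetical helper, and the serde derives needed for bincode are omitted so the snippet compiles with the standard library alone.

```rust
// Hypothetical shapes for the wire types. Field types are assumed to be
// opaque byte buffers; the serde derives are intentionally left out.
#[derive(Debug, Clone, PartialEq)]
pub enum Request {
    Execute {
        tdl_context: Vec<u8>,
        raw_ctx: Vec<u8>,
        raw_inputs: Vec<u8>,
    },
    Shutdown,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ExecutorOutcome {
    /// Serialized task outputs, forwarded to storage as-is.
    Success { outputs: Vec<u8> },
    /// Encoded error payload (msgpack in the real protocol).
    Failure { error: Vec<u8> },
}

#[derive(Debug, Clone, PartialEq)]
pub enum Response {
    Result { outcome: ExecutorOutcome, elapsed_us: u64 },
}

/// Hypothetical parent-side helper: splits a response into
/// `Ok(outputs)` / `Err(encoded error)` plus the reported FFI time.
pub fn split_outcome(response: Response) -> (Result<Vec<u8>, Vec<u8>>, u64) {
    match response {
        Response::Result { outcome, elapsed_us } => {
            let result = match outcome {
                ExecutorOutcome::Success { outputs } => Ok(outputs),
                ExecutorOutcome::Failure { error } => Err(error),
            };
            (result, elapsed_us)
        }
    }
}
```

Because the request carries no deadline, nothing in these types refers to time limits; the only timing field is the executor-reported `elapsed_us`.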

Executor binary (spider-task-executor/src/bin/spider_task_executor.rs)

Single-threaded tokio runtime; requests are processed strictly sequentially with exactly one task running for the lifetime of the process. Tokio is here only to match the async I/O surface the execution manager uses (tokio_util::codec::LengthDelimitedCodec); the executor itself has no concurrency requirements.
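
For readers unfamiliar with the framing, here is a minimal synchronous sketch of what a length-delimited frame looks like on the pipe. It assumes LengthDelimitedCodec's default configuration (a 4-byte big-endian length prefix followed by the payload) and uses blocking std I/O instead of tokio; `write_frame` and `read_frame` are hypothetical names, not functions from this PR.

```rust
use std::convert::TryFrom;
use std::io::{self, Read, Write};

/// Writes one frame: a 4-byte big-endian length followed by the payload.
pub fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    let len = u32::try_from(payload.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "payload too large"))?;
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(payload)?;
    writer.flush()
}

/// Reads one frame. Returns `Ok(None)` on a clean EOF before any header byte,
/// and an `UnexpectedEof` error if the stream ends inside a frame.
pub fn read_frame<R: Read>(reader: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut header = [0u8; 4];
    let mut filled = 0;
    while filled < header.len() {
        let n = reader.read(&mut header[filled..])?;
        if n == 0 {
            return if filled == 0 {
                Ok(None)
            } else {
                Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "EOF inside frame header",
                ))
            };
        }
        filled += n;
    }
    let len = u32::from_be_bytes(header) as usize;
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(Some(payload))
}
```

Distinguishing a clean EOF from a truncated frame is what lets a parent tell "the child closed stdout" apart from "the child died mid-response".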

Key shape decisions:

  • The FFI call runs inline on the runtime thread — no second OS thread, no oneshot, no tokio::select!. The previous design dispatched the FFI on a std::thread so the runtime could select! an in-process timer against the FFI; with the timer responsibility consolidated on the parent, none of that scaffolding earns its keep.
  • SPIDER_TDL_PACKAGE_DIR is validated once at startup. If unset the binary exits non-zero before processing any request, which surfaces a deployment misconfiguration immediately rather than per-request.
  • Package resolution: ${SPIDER_TDL_PACKAGE_DIR}/<package>/lib<package>.so. The first request for a package dlopens the library; subsequent requests reuse the cached TdlPackage.
  • Tracing init: JSON, ANSI off, env-filtered, written to stderr so it doesn't pollute the framed-stdout protocol channel. Both tracing and tracing-subscriber are pulled in with default-features = false and only the features actually used (std for the macros; fmt, env-filter, json for the subscriber).
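
The package-resolution and caching rules above can be sketched as follows. This is an illustration under assumptions, not the code in manager.rs: `PackageCache` and `package_library_path` are hypothetical names, the type parameter `T` stands in for `TdlPackage`, and the `load` closure stands in for the real dlopen call.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Builds `<package_dir>/<package>/lib<package>.so`.
pub fn package_library_path(package_dir: &Path, package: &str) -> PathBuf {
    package_dir
        .join(package)
        .join(format!("lib{}.so", package))
}

/// Caches one loaded value per package name; `T` stands in for `TdlPackage`.
pub struct PackageCache<T> {
    package_dir: PathBuf,
    loaded: HashMap<String, T>,
}

impl<T> PackageCache<T> {
    pub fn new(package_dir: PathBuf) -> Self {
        Self {
            package_dir,
            loaded: HashMap::new(),
        }
    }

    /// Returns the cached value, calling `load` only on the first request
    /// for a given package. A failed load is not cached.
    pub fn get_or_load<E>(
        &mut self,
        package: &str,
        load: impl FnOnce(&Path) -> Result<T, E>,
    ) -> Result<&T, E> {
        if !self.loaded.contains_key(package) {
            let path = package_library_path(&self.package_dir, package);
            let value = load(&path)?;
            self.loaded.insert(package.to_owned(), value);
        }
        Ok(&self.loaded[package])
    }
}
```

In the binary, the directory passed to `PackageCache::new` would come from SPIDER_TDL_PACKAGE_DIR, read once at startup.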

ExecutorError is now wire-friendly

ExecutorError derives serde::Serialize/Deserialize so the binary can ship it across the protocol as Failure { error: rmp_serde::to_vec(&err) } and the EM can decode it back to a typed value. Three variants used to wrap external types that don't implement Serialize (libloading::Error, std::str::Utf8Error, rmp_serde::decode::Error); they now carry the Display rendering of the source error as a String. Explicit From impls preserve the lossless ? propagation in manager.rs. The wildcard matches!(err, ExecutorError::InvalidLibrary(_)) pattern in the existing huntsman/tdl-integration tests still compiles unchanged.
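
The String-carrying pattern described above can be sketched with a stand-in type. `SketchError`, its `InvalidUtf8` variant, and `decode_task_name` are hypothetical names chosen for illustration; the real ExecutorError variants and the serde derives are not reproduced here.

```rust
use std::str::Utf8Error;

/// Stand-in for a wire-friendly error: the variant stores the `Display`
/// rendering of the source error instead of the source error itself.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchError {
    InvalidUtf8(String),
}

/// The explicit `From` impl is what keeps `?` working at call sites.
impl From<Utf8Error> for SketchError {
    fn from(err: Utf8Error) -> Self {
        SketchError::InvalidUtf8(err.to_string())
    }
}

/// Example call site: `?` converts `Utf8Error` into `SketchError`.
pub fn decode_task_name(raw: &[u8]) -> Result<&str, SketchError> {
    Ok(std::str::from_utf8(raw)?)
}
```

The trade-off is that the receiver can no longer downcast to the original error type; it only sees the rendered message.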

Integration test crates

tests/huntsman/integration-test-tasks (TDL package)

A cdylib + rlib package (crate-type = ["cdylib", "rlib"]) registered under TDL package name integration_test_tasks. The dual crate-type lets the bench reference compile-time constants (notably INSTRUMENT_SLEEP_US) while keeping the cdylib for dlopen from the executor.

Tasks:

  • fibonacci(index: u64) -> u64 — naive recursion; correctness check.
  • always_fail() -> Result — returns TdlError::ExecutionError.
  • always_panic() -> ! — panics; the panic crosses the extern "C" FFI boundary and aborts the process, which is exactly the crash signal the parent test asserts on.
  • instrument(items: Vec<String>) -> Vec<String> — sleeps for a fixed INSTRUMENT_SLEEP_US (50µs) and echoes the payload back. Used by the overhead bench.
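
For orientation, the bodies of two of these tasks might look roughly like this as plain Rust functions. The TDL registration and FFI glue are omitted, the `u64` type of `INSTRUMENT_SLEEP_US` is an assumption, and the indexing convention (F(0) = 0, F(1) = 1) is assumed rather than taken from the package source.

```rust
use std::thread;
use std::time::Duration;

/// Fixed sleep used by the overhead bench, in microseconds.
pub const INSTRUMENT_SLEEP_US: u64 = 50;

/// Naive exponential-time recursion, assuming F(0) = 0 and F(1) = 1.
pub fn fibonacci(index: u64) -> u64 {
    if index < 2 {
        index
    } else {
        fibonacci(index - 1) + fibonacci(index - 2)
    }
}

/// Sleeps for the fixed duration, then echoes the payload back unchanged.
pub fn instrument(items: Vec<String>) -> Vec<String> {
    thread::sleep(Duration::from_micros(INSTRUMENT_SLEEP_US));
    items
}
```

Under this convention `fibonacci(10)` evaluates to 55.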

tests/huntsman/task-executor (executor integration tests)

A library crate that provides an ExecutorHandle harness (spawn the binary, frame requests on stdin, decode responses from stdout) and two [[test]] binaries.

The harness panics with descriptive .expect(...) messages on protocol / I/O / decode failures rather than threading errors through every helper — these are infrastructure, not production code, and a panic with backtrace points at the failure site immediately. (This pattern surfaced one subtle bug during development: a stale .so left over from a task rename produced TaskNotFound("instrument") in the panic message verbatim, which was the entire diagnosis.)
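
A minimal sketch of that panic-first style, applied to an environment lookup. `required_env_path` is a hypothetical helper name; only the environment variable names come from this PR.

```rust
use std::env;
use std::path::PathBuf;

/// Reads a required environment variable as a path. Panics with the
/// variable name in the message so a missing setting is obvious from
/// the panic output alone.
pub fn required_env_path(name: &str) -> PathBuf {
    let value = env::var_os(name)
        .unwrap_or_else(|| panic!("environment variable `{}` must be set", name));
    PathBuf::from(value)
}
```

A harness could call `required_env_path("SPIDER_TASK_EXECUTOR_BIN")` before spawning the child, so a test run without the taskfile fails at the lookup rather than deep inside process setup.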

tests/executor.rs covers:

  • fibonacci_returns_correct_value — encodes a single u64 input; asserts Success and that the decoded u64 equals 55.
  • always_fail_reports_task_error — asserts Failure whose msgpack-decoded ExecutorError is TaskError(TdlError::ExecutionError(_)) and whose message contains the task name.
  • always_panic_crashes_the_process — sends Execute, expects stdout EOF before any frame arrives, then waits for the child to exit non-zero.
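
The crash assertion in the last test boils down to "stdout reaches EOF with no bytes, then the exit status is non-zero". A simplified sketch of that check using std::process follows; `run_until_eof` is a hypothetical helper, stdin is closed rather than used to send an Execute request, and any short-lived command can stand in for the executor when trying it out.

```rust
use std::io::{self, Read};
use std::process::{Command, ExitStatus, Stdio};

/// Spawns `program` with `args`, reads its stdout to EOF, then waits for it
/// to exit. Returns the bytes seen on stdout and the exit status.
pub fn run_until_eof(program: &str, args: &[&str]) -> io::Result<(Vec<u8>, ExitStatus)> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .spawn()?;
    let mut stdout = child
        .stdout
        .take()
        .expect("stdout was configured as piped");
    let mut bytes = Vec::new();
    // read_to_end returns once the child closes its end of the pipe.
    stdout.read_to_end(&mut bytes)?;
    let status = child.wait()?;
    Ok((bytes, status))
}
```

On a Unix-like system, `run_until_eof("sh", &["-c", "exit 3"])` returns an empty buffer and a status for which `success()` is false, which is the same pair of observations the test makes about the aborted executor.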

tests/overhead_instrument.rs runs the instrument task ten times against a long-lived executor (so dlopen happens once during a discarded warm-up) and writes a markdown table at ${SPIDER_TEST_INSTRUMENT_OUTPUT_DIR}/task_executor_overhead.md. With the work portion held constant at 50µs the table separates four metrics:

| Metric | What it measures |
| --- | --- |
| E2E (parent) | Instant-to-Instant around send(Execute) → recv(Response::Result) |
| Executor FFI | elapsed_us reported by the executor (sleep + in-FFI input/output serde) |
| Executor internal (FFI - sleep) | In-executor input/output serde alone |
| IPC overhead (E2E - FFI) | Parent-side framing + bincode + pipe traversal |

Sample output from a local run:

| Metric                          | Count | Avg (µs) | P50 (µs) | P95 (µs) | P99 (µs) |
|---------------------------------|-------|----------|----------|----------|----------|
| E2E (parent)                    | 10    | 476.00   | 479.64   | 602.06   | 602.06   |
| Executor FFI                    | 10    | 157.80   | 164.00   | 175.00   | 175.00   |
| Executor internal (FFI - sleep) | 10    | 107.80   | 114.00   | 125.00   | 125.00   |
| IPC overhead (E2E - FFI)        | 10    | 318.20   | 315.64   | 429.06   | 429.06   |
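
The two derived rows are simple per-sample subtractions. The sketch below shows the arithmetic; `Sample` and the function names are hypothetical, not the bench's actual types.

```rust
/// One steady-state iteration, in microseconds.
#[derive(Debug, Clone, Copy)]
pub struct Sample {
    /// Parent-side time around send + recv.
    pub e2e_us: f64,
    /// `elapsed_us` as reported by the executor.
    pub ffi_us: f64,
}

/// Executor internal = FFI - sleep.
pub fn executor_internal_us(sample: Sample, sleep_us: f64) -> f64 {
    sample.ffi_us - sleep_us
}

/// IPC overhead = E2E - FFI.
pub fn ipc_overhead_us(sample: Sample) -> f64 {
    sample.e2e_us - sample.ffi_us
}

/// Arithmetic mean; `None` for an empty slice.
pub fn mean(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        None
    } else {
        Some(values.iter().sum::<f64>() / values.len() as f64)
    }
}
```

Because the mean is linear, the Avg column is self-consistent: 157.80 - 50 = 107.80 and 476.00 - 157.80 = 318.20. Percentiles do not subtract this way in general, so the P95 and P99 cells of a derived row need not equal the difference of the corresponding cells above them.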

Taskfile changes (taskfiles/test.yaml)

spider-huntsman-unit-tests-executor now:

  1. Builds three artifacts in separate cargo build invocations (combining --package <cdylib> with --bin <name> would silently exclude the cdylibs from the target selection): huntsman-complex, integration-test-tasks, and spider-task-executor --bin spider-task-executor.
  2. Stages the cdylibs under build/tdl_packages/<package>/lib<package>.so — the standard layout the executor binary reads via ${SPIDER_TDL_PACKAGE_DIR}/<package>/lib<package>.so.
  3. Sets SPIDER_TDL_PACKAGE_DIR and SPIDER_TASK_EXECUTOR_BIN, and relocates SPIDER_TDL_PACKAGE_COMPLEX, so that all three point at the staged paths. Existing huntsman-complex consumers see no behavior change because their tests load the .so by absolute path.
  4. Invokes cargo nextest run --all --all-features --run-ignored all --release. The new tests are gated #[ignore] so plain cargo test (which doesn't go through the taskfile) can't accidentally try to run them without env vars set.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • Ensure all workflows pass.
  • Add integration tests to test task executions that:
    • Return results.
    • Return errors.
    • Panic/crash.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added a subprocess-based task executor binary for handling task execution with inter-process communication.
    • Introduced wire protocol for process-to-process request/response messaging.
  • Tests

    • Added comprehensive integration test suite with end-to-end validation for task execution.
    • Added performance instrumentation tests to measure executor overhead and latency metrics.
    • Created test task package with multiple validation scenarios including error handling and panic-induced process crashes.

LinZhihao-723 requested review from a team and sitaowang1998 as code owners May 14, 2026 23:44

coderabbitai Bot (Contributor) commented May 14, 2026

Walkthrough

This PR introduces a new spider-task-executor subprocess that executes TDL tasks via framed bincode IPC, with serializable error types, dynamic package loading, test task definitions, and a complete integration test suite with latency benchmarking.

Changes

Task Executor Subprocess

| Layer / File(s) | Summary |
| --- | --- |
| Protocol and error serialization foundation: components/spider-task-executor/src/protocol.rs, components/spider-task-executor/src/error.rs, components/spider-task-executor/src/lib.rs | Request/Response/ExecutorOutcome enums define the framed bincode wire protocol. ExecutorError gains Serialize/Deserialize by storing String messages instead of wrapped error types, with explicit From implementations converting libloading::Error, Utf8Error, and rmp_serde::decode::Error. |
| Executor binary implementation: components/spider-task-executor/src/bin/spider_task_executor.rs | Boots single-threaded Tokio runtime, reads length-delimited bincode requests from stdin, dispatches Execute by loading TDL packages from cache or filesystem, measures elapsed microseconds, and sends length-delimited Response frames to stdout with JSON-formatted tracing to stderr. |
| Package manager API update: components/spider-task-executor/src/manager.rs, tests/huntsman/tdl-integration/tests/complex.rs | TdlPackageManager::load now returns a reference to the loaded TdlPackage instead of the package name string. TdlPackage gains Debug derive. Existing test updated to use pkg.name() instead of comparing the loaded value. |
| Workspace and build configuration: Cargo.toml, components/spider-task-executor/Cargo.toml, taskfiles/test.yaml | Adds integration-test-tasks and task-executor test crate to workspace. Declares spider-task-executor binary target with runtime dependencies. Test task now builds multiple packages, stages cdylib artifacts into TDL package layout under ${G_BUILD_DIR}/tdl_packages, and exports executor binary and package directory paths. |
| Integration test harness infrastructure: tests/huntsman/task-executor/Cargo.toml, tests/huntsman/task-executor/src/lib.rs | ExecutorHandle spawns executor subprocess with piped stdio, sends/receives length-delimited framed bincode messages. Provides environment-based helpers to locate executor binary and package directory, and wire-format helpers for building task context, encoding/decoding inputs and outputs, and constructing execute requests. |
| Integration test task definitions: tests/huntsman/integration-test-tasks/Cargo.toml, tests/huntsman/integration-test-tasks/src/lib.rs | Defines integration_test_tasks TDL package with four test tasks: fibonacci (CPU-bound recursion), always_fail (task error), always_panic (FFI boundary panic), and instrument (fixed-duration sleep then echo payload). |
| Integration test cases and benchmarking: tests/huntsman/task-executor/tests/executor.rs, tests/huntsman/task-executor/tests/overhead_instrument.rs | Three functional tests validate fibonacci correctness, error decoding, and panic crash semantics. Ignored benchmark measures steady-state overhead (E2E latency, FFI time, executor-internal overhead, IPC overhead) across multiple iterations and writes Markdown metrics report. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • y-scope/spider#317: Introduced the executor and package-loading foundation that this PR extends with subprocess IPC, serializable errors, and comprehensive integration testing.

Suggested reviewers

  • sitaowang1998
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat(spider-task-executor): Add executor binary with bincode wire protocol and integration tests' accurately and comprehensively describes the main changes: introduction of an executor binary, wire protocol implementation, and integration tests. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot (Contributor) left a comment

🧹 Nitpick comments (1)
tests/huntsman/integration-test-tasks/src/lib.rs (1)

17-17: 💤 Low value

Consider the reliability of a 50-microsecond sleep for benchmarking.

The INSTRUMENT_SLEEP_US constant sets a 50-microsecond sleep, which is quite short. On Linux, sleep() with sub-millisecond durations may be subject to scheduler granularity and could have higher variance. Since this is used for overhead measurement in benchmark tests, ensure the results account for potential timing imprecision.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/huntsman/integration-test-tasks/src/lib.rs` at line 17, The 50µs
constant INSTRUMENT_SLEEP_US is too short and may suffer scheduler jitter; to
fix, increase it to a more reliable value (e.g., 1000 or 5000 µs) or make it
configurable so tests can select a stable duration (via an environment variable
or test flag) and update any places using INSTRUMENT_SLEEP_US to read the
configurable value; ensure the constant name and usages (INSTRUMENT_SLEEP_US)
are adjusted and document the change in the test notes.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 8d75d100-9dd1-4402-8d2a-b879ab41f55d

📥 Commits

Reviewing files that changed from the base of the PR and between aadb9eb and 777cff5.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (15)
  • Cargo.toml
  • components/spider-task-executor/Cargo.toml
  • components/spider-task-executor/src/bin/spider_task_executor.rs
  • components/spider-task-executor/src/error.rs
  • components/spider-task-executor/src/lib.rs
  • components/spider-task-executor/src/manager.rs
  • components/spider-task-executor/src/protocol.rs
  • taskfiles/test.yaml
  • tests/huntsman/integration-test-tasks/Cargo.toml
  • tests/huntsman/integration-test-tasks/src/lib.rs
  • tests/huntsman/task-executor/Cargo.toml
  • tests/huntsman/task-executor/src/lib.rs
  • tests/huntsman/task-executor/tests/executor.rs
  • tests/huntsman/task-executor/tests/overhead_instrument.rs
  • tests/huntsman/tdl-integration/tests/complex.rs
