smite: Clean up managed subprocess groups on shutdown#32
smite: Clean up managed subprocess groups on shutdown#32Ashish-Kumar-Dash wants to merge 5 commits intomorehouse:masterfrom
Conversation
morehouse
left a comment
There was a problem hiding this comment.
I think this change makes sense. But we'll need to manually test it against each coverage workload to make sure it doesn't cause coverage files to be dropped. In the past we've had to do several workarounds, especially for CLN, to make sure there's a "clean exit" that allows coverage profiles to be saved to disk.
Also please make sure CI passes before requesting the next review. You should be able to run all the CI commands locally (e.g., cargo test, cargo fmt, cargo clippy --all-targets --all-features -- -D warnings).
Added a SAFETY comment here explaining that the pre_exec closure only calls async-signal-safe libc::setpgid and uses io::Error::last_os_error() for errno capture. Co-authored-by: Matt Morehouse <mattmorehouse@gmail.com>
|
Hey @morehouse, thanks for the review. I’ve updated the patch to address the feedback, switched the One question as I think about follow-up validation: for the manual coverage testing you mentioned, is there a target/workload you consider the most sensitive canary for shutdown-related coverage loss, or should I treat CLN as the primary one given its existing clean-exit handling? |
Summary
Teach ManagedProcess to place spawned children in their own process groups and terminate the full group during shutdown. This makes cleanup more reliable for subprocess trees instead of only signaling the direct child.
Changes
create a dedicated process group for each managed subprocess at spawn time
send SIGTERM and, if needed, SIGKILL to the process group during shutdown
keep a direct-child signal fallback if process-group signaling fails
add a regression test covering child + grandchild cleanup
add a small CLN note documenting graceful RPC stop first, then managed cleanup fallback
add tempfile as a test-only dependency for the new regression test
Why
This improves teardown semantics for multi-process workloads and is directly relevant to future campaign management work where reliable stop/cleanup behavior matters.
Testing
cargo test -p smite ✅
cargo test ✅
Kindly review @morehouse & let me know if any changes are needed