fix: remove unnecessary Pool(1) wrapper in _VwPy.run()#81
Closed
JohnLangford wants to merge 1 commit intomasterfrom
Closed
fix: remove unnecessary Pool(1) wrapper in _VwPy.run()#81JohnLangford wants to merge 1 commit intomasterfrom
JohnLangford wants to merge 1 commit intomasterfrom
Conversation
The Pool(1) wrapper is required because VW is not thread-safe internally.
Each VW call must run in its own subprocess for isolation.
However, the default fork-based Pool doesn't work with pybind11 bindings
because pybind11 modules don't survive fork() cleanly.
This change uses multiprocessing.get_context('spawn') to create a
spawn-based Pool, which starts fresh Python interpreters instead of
forking. This is compatible with pybind11.
Note: Code that calls vw_executor at module import time (before
if __name__ == '__main__' guard) may still have issues because spawn
needs to re-import the main module in child processes. Such code should
be restructured to defer VW execution.
Fixes #80
145c57f to
72a394a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Use
spawncontext for thePool(1)in_VwPy.run()to enable compatibility with pybind11 bindings.Problem
The
Pool(1)wrapper is required because VW is not thread-safe internally - each VW call must run in its own subprocess for isolation.However, the default fork-based Pool doesn't work with pybind11 bindings because pybind11 modules don't survive
fork()cleanly, causing segfaults or hangs.Solution
Use
multiprocessing.get_context('spawn')to create a spawn-based Pool:The spawn context starts fresh Python interpreters instead of forking, which is compatible with pybind11.
Testing
Verified locally that vw_executor works with pybind11-based vowpalwabbit bindings when called from Python scripts.
Note
Code that calls vw_executor at module import time (before
if __name__ == '__main__'guard) may still have issues because spawn needs to re-import the main module in child processes. Such code should be restructured to defer VW execution to runtime.Fixes #80