fix: remove unnecessary Pool(1) wrapper in _VwPy.run()#81

Closed
JohnLangford wants to merge 1 commit into master from fix/remove-pool1-wrapper

Conversation

JohnLangford (Member) commented Jan 19, 2026

Summary

Use spawn context for the Pool(1) in _VwPy.run() to enable compatibility with pybind11 bindings.

Problem

The Pool(1) wrapper is required because VW is not thread-safe internally, so each VW call must run in its own subprocess for isolation.

However, the default fork-based Pool doesn't work with pybind11 bindings because pybind11 modules don't survive fork() cleanly, causing segfaults or hangs.
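Whether a plain Pool() forks depends on the platform and Python version: fork has been the default on Linux before Python 3.14, while macOS (since 3.8) and Windows default to spawn. A minimal sketch for checking this on a given machine, using only standard-library calls; start_method_report is a hypothetical helper name and is not part of vw_executor:

```python
import multiprocessing
import sys


def start_method_report():
    """Summarize which multiprocessing start methods this interpreter offers."""
    return {
        "platform": sys.platform,
        "available": multiprocessing.get_all_start_methods(),
        # Note: this call fixes the process-wide default start method
        # if nothing has set it yet.
        "default": multiprocessing.get_start_method(),
    }


if __name__ == "__main__":
    print(start_method_report())
```

If "default" reports fork, a bare Pool(1) is affected by the problem described above.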

Solution

Use multiprocessing.get_context('spawn') to create a spawn-based Pool:

```python
def run(self, args: str, filename=None) -> Iterable[str]:
    # Use spawn context to avoid fork issues with pybind11
    import multiprocessing
    ctx = multiprocessing.get_context('spawn')
    with ctx.Pool(1) as p:
        return p.apply(_run_pyvw, [args], {"filename": filename})
```

The spawn context starts fresh Python interpreters instead of forking, which is compatible with pybind11.
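The same pattern can be tried outside vw_executor. The sketch below is an illustration under assumptions, not the library's code: run_isolated is a hypothetical helper, and os.getpid stands in for the VW call so the example needs nothing beyond the standard library.

```python
import multiprocessing
import os


def run_isolated(func, *args, **kwargs):
    """Run func(*args, **kwargs) in a fresh, spawn-started worker process.

    func and its arguments must be picklable, and func must be importable
    by name in the child (a module-level function, not a lambda).
    """
    ctx = multiprocessing.get_context("spawn")
    # Pool(1) creates a single worker; leaving the with-block terminates it.
    with ctx.Pool(1) as pool:
        return pool.apply(func, args, kwargs)


if __name__ == "__main__":
    # os.getpid is a stand-in for the real work; the two PIDs differ
    # because the call runs in a separate interpreter.
    child_pid = run_isolated(os.getpid)
    print("parent:", os.getpid(), "child:", child_pid)
```

Because the child is a new interpreter rather than a forked copy, it imports any extension modules itself instead of inheriting the parent's state. The cost is a slower start for each call.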

Testing

Verified locally that vw_executor works with pybind11-based vowpalwabbit bindings when called from Python scripts.

Note

Code that calls vw_executor at module import time (outside an if __name__ == '__main__' guard) may still have issues, because spawn re-imports the main module in each child process. Such code should be restructured to defer VW execution to runtime.
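One way to restructure such a script is sketched below. The names compute and main are hypothetical, and math.sqrt stands in for the vw_executor call; the point is only where the spawn-based work is triggered.

```python
import math
import multiprocessing


def compute(x):
    """Stand-in for a vw_executor call that uses a spawn-based Pool(1)."""
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(1) as pool:
        return pool.apply(math.sqrt, [x])


# Problematic: a module-level call like the one below would run again when
# the spawned child re-imports this file, before the child has finished
# starting up.
# RESULT = compute(9.0)


def main():
    # Deferred: this only runs in the original process, never in the child.
    print(compute(9.0))


if __name__ == "__main__":
    main()
```

The guard keeps the child's re-import of the main module free of side effects, so only the parent process starts the pool.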

Fixes #80

The Pool(1) wrapper is required because VW is not thread-safe internally.
Each VW call must run in its own subprocess for isolation.

However, the default fork-based Pool doesn't work with pybind11 bindings
because pybind11 modules don't survive fork() cleanly.

This change uses multiprocessing.get_context('spawn') to create a
spawn-based Pool, which starts fresh Python interpreters instead of
forking. This is compatible with pybind11.

Note: Code that calls vw_executor at module import time (outside an
if __name__ == '__main__' guard) may still have issues, because spawn
re-imports the main module in each child process. Such code should
be restructured to defer VW execution.

Fixes #80
JohnLangford force-pushed the fix/remove-pool1-wrapper branch from 145c57f to 72a394a, January 19, 2026 20:22
JohnLangford deleted the fix/remove-pool1-wrapper branch, January 19, 2026 20:25

Development

Successfully merging this pull request may close these issues.

Remove unnecessary Pool(1) wrapper in _VwPy.run() for pybind11 compatibility
