Background
PR #1163 fixed generate_from_raw to propagate exceptions instead of silently swallowing them into an empty ModelOutputThunk(value="") (issue #597). The fix uses asyncio.gather(*coroutines) with the default return_exceptions=False.
Problem
With asyncio.gather(return_exceptions=False), when one coroutine raises, the exception is propagated to the caller immediately, but the other in-flight coroutines are not cancelled. They continue running on the event loop until completion, and their results are discarded. For N parallel Ollama HTTP requests, this means up to N−1 requests complete and their results are thrown away.
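The behaviour described above can be reproduced with a minimal sketch. It does not use the project's backend: fake_request is a hypothetical stand-in for one HTTP request, and the delays are arbitrary values chosen only to make the ordering deterministic.

```python
import asyncio

# Names of the simulated requests that ran to completion.
finished: list[str] = []


async def fake_request(name: str, delay: float, fail: bool = False) -> str:
    # Hypothetical stand-in for one HTTP request to the backend.
    await asyncio.sleep(delay)
    if fail:
        raise ConnectionError(f"{name} failed")
    finished.append(name)
    return name


async def run_with_gather() -> str:
    error = ""
    try:
        await asyncio.gather(
            fake_request("a", 0.01, fail=True),
            fake_request("b", 0.05),
            fake_request("c", 0.05),
        )
    except ConnectionError as exc:
        error = str(exc)
    # gather has already raised, but "b" and "c" were not cancelled.
    # Give them time to finish so the effect is observable.
    await asyncio.sleep(0.1)
    return error


error = asyncio.run(run_with_gather())
print(error, sorted(finished))
```

The caller sees the ConnectionError from "a" right away, yet "b" and "c" still run to completion in the background and their return values are never delivered.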
This is a pre-existing behaviour of asyncio.gather made visible by the fix in #1163. It was not introduced by that PR; under the old code all N tasks always ran to completion (whether they failed or not).
Desired behaviour
Python 3.12 (the project's minimum version) has asyncio.TaskGroup (added in Python 3.11), which provides structured concurrency: if any task in the group raises, the remaining tasks are cancelled. This avoids wasted compute and network bandwidth.
Caveats
TaskGroup raises ExceptionGroup (not the raw exception) on failure. Callers that do except ConnectionError would need to use except* or ExceptionGroup handling. This is a semantic change on top of the change in PR #1163 (fix: propagate generate_from_raw exceptions in OllamaModelBackend).
Related
gather without return_exceptions