Background
pyproject.toml currently caps:
dask>=2021.5.0,<2023.0
distributed>=2021.6.2,<2023.0
tornado>=6.1,<7
We are effectively frozen on dask/distributed ~2022.12.1 — roughly three years behind upstream. This freeze just produced a real bug: #388, where custom_as_completed subclassed as_completed and reimplemented a dask internal method (track_future) by copying its body. dask later renamed that method to _track_future (the tornado→asyncio migration), the override silently became dead code, and errored futures leaked into the run loop as raw (type, exc, traceback) tuples — crashing fits with AttributeError: 'tuple' object has no attribute 'score'.
That specific coupling is now removed (PR for #388 + follow-up cleanup), but the frozen pin itself is unaddressed technical debt. The longer we stay pinned, the larger and riskier the eventual jump (Python-version compatibility, security fixes, and API drift all accumulate).
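A minimal sketch of that failure mode, using stand-in classes rather than dask itself. UpstreamV1, UpstreamV2, make_custom, and assert_hook_exists are hypothetical names for illustration only. It shows how an override of an internal hook turns into dead code once the base class renames the hook, and one cheap guard that would turn the silent breakage into a loud failure.

```python
# Stand-in classes only; none of this is dask code. All names are hypothetical.


class UpstreamV1:
    """Mimics an older release whose run loop calls a hook named track_future."""

    def run(self, item):
        return self.track_future(item)

    def track_future(self, item):
        return f"upstream:{item}"


class UpstreamV2:
    """Mimics a newer release that renamed the hook to _track_future."""

    def run(self, item):
        return self._track_future(item)

    def _track_future(self, item):
        return f"upstream:{item}"


def make_custom(base):
    """Build a subclass that overrides the *old* hook name on the given base."""

    class Custom(base):
        def track_future(self, item):  # stands in for the copied upstream body
            return f"custom:{item}"

    return Custom


def assert_hook_exists(base, name):
    """Guard: fail loudly if the base class no longer defines the overridden hook."""
    if not callable(getattr(base, name, None)):
        raise RuntimeError(
            f"{base.__name__} has no hook {name!r}; the override would be dead code"
        )


old_result = make_custom(UpstreamV1)().run("f1")  # the override is called
new_result = make_custom(UpstreamV2)().run("f1")  # the override is silently skipped
```

Against the renamed base, `assert_hook_exists(UpstreamV2, "track_future")` raises RuntimeError, whereas the subclass on its own imports and runs without any error.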
Why this is its own effort (not part of the #388 fix)
Un-pinning is a substantial, risky change touching the distributed-execution core:
- The tornado coupling (tornado<7) likely needs to go; modern dask is asyncio-native.
- as_completed semantics, Client/Future behavior, scatter/gather, and serialization have all evolved across ~10 releases.
- cluster.py (SSH/paramiko cluster bring-up) and the scheduler/worker lifecycle need revalidation against a current distributed.
- Needs validation across the cluster/HPC execution paths, not just local runs.
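Result shapes are one place where this kind of drift reaches the run loop. As a hedged illustration of the #388 symptom, the sketch below assumes that a failed evaluation may arrive as a raw (type, exc, traceback) tuple and converts it into an explicit failure record before anything reads .score. Outcome, looks_like_exc_info, and split_result are hypothetical helpers, not part of dask or of this codebase.

```python
import sys
from dataclasses import dataclass
from types import TracebackType
from typing import Any, Optional


@dataclass
class Outcome:
    """Hypothetical wrapper: a successful value or the error that replaced it."""

    value: Any = None
    error: Optional[BaseException] = None

    @property
    def ok(self) -> bool:
        return self.error is None


def looks_like_exc_info(obj: Any) -> bool:
    """True for a (type, exc, traceback) triple such as sys.exc_info() returns."""
    return (
        isinstance(obj, tuple)
        and len(obj) == 3
        and isinstance(obj[0], type)
        and issubclass(obj[0], BaseException)
        and isinstance(obj[1], BaseException)
        and (obj[2] is None or isinstance(obj[2], TracebackType))
    )


def split_result(obj: Any) -> Outcome:
    """Normalize a raw result so callers never touch .score on a failure."""
    if isinstance(obj, BaseException):
        return Outcome(error=obj)
    if looks_like_exc_info(obj):
        return Outcome(error=obj[1])
    return Outcome(value=obj)


# Usage: a failure captured as an exc_info tuple versus an ordinary result.
try:
    raise ValueError("simulation failed")
except ValueError:
    failed = split_result(sys.exc_info())

succeeded = split_result({"score": 1.5})
```

Whether a given distributed release actually hands back this tuple shape is exactly what has to be revalidated; the helper only makes the run loop explicit about which shape it received.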
Scope / acceptance criteria
- Bump dask/distributed to a current release; drop or update the tornado<7 pin accordingly.
- The pre-commit pre-push hook (pytest-bngsim in .pre-commit-config.yaml, the same gate that ran on the #388 PRs) has two gaps to close: (a) the hook currently exercises only the pinned dask, so add a step that installs and runs against a newer dask; and (b) the #388 contract test (tests/test_failed_sim_handling.py, which pins the as_completed(with_results=True, raise_errors=False) behavior we now depend on) isn't in the hook's file list yet, so add it; that way a future dask bump that breaks the contract fails at git push.
Reference
See #388 for the failure this freeze caused and the post-mortem reasoning.