Apply pulsar capability snapshots to BYOC destinations #29
Open
mvdbeek wants to merge 757 commits into
Improves clarity and detail for quota availability changes by replacing a generic dictionary with a structured list of transfers. This provides explicit source and target object store IDs for each quota impact, making the information more precise for the user interface. It enhances the accuracy and reviewability of storage operation previews.
Clarifies user understanding of the implications when datasets move from private to shareable storage. The updated message is more informative about the potential for sharing after the operation, making the warning less ambiguous.
Creates a dedicated Vue component to display the storage operation preview. This component centralizes the complex rendering logic, improving maintainability and allowing for a clearer separation of concerns within the storage operation wizard.
Simplifies the storage operation wizard by offloading preview rendering to a dedicated component. This reduces the modal's complexity, removes duplicated logic, and improves the overall user experience with clearer instructions and titles.
Prevents dataset transfer attempts when the source data is unavailable by detecting missing files beforehand and recording a specific failure. Improves error reporting and robustness against purged or absent datasets.
Co-authored-by: Copilot <copilot@github.com>
Introduces new helper methods to support preview, execute, and monitor bulk storage operations within test populators.
Introduces comprehensive integration tests covering preview, execution, and run status for bulk storage operations across distributed object store backends. Validates eligibility, error handling, warnings, run lifecycle, collection support, and edge cases including quota transfers and privacy warnings.
Collaborator
Needs a rebase.
Two new tables backing user-self-registered Pulsar compute resources
("Bring Your Own Compute"):
* pulsar_byoc_resource — one row per registered BYOC, holding the relay
endpoint, manager_name (= relay user sub claim), and lifecycle status
(pending/active/disabled/deleted). Refresh token lives in the Galaxy
vault under pulsar_byoc/<id>/relay_refresh_token, never on the row.
* pulsar_byoc_bootstrap_token — short-TTL single-use ticket authorising
the POST /api/pulsar_byoc/bootstrap callback. Synthetic int id PK +
unique-indexed token column (Galaxy convention).
Migration fe9dd3972993 chains from f5a73c8b9d12 and uses the
galaxy.model.migrations.util.create_table / drop_table helpers.
* enable_pulsar_byoc (bool, default false): gates the /api/pulsar_byoc
endpoints and the multi-tenant pulsar_byoc job runner.
* pulsar_byoc_relay_url (str, optional): operator-configured relay URL
surfaced in the registration response and embedded in the one-liner the
user pastes into ``pulsar-config register-with-galaxy``.
PulsarByocManager (lib/galaxy/managers/pulsar_byoc.py) owns the BYOC
domain logic — bootstrap-token lifecycle, JWT sub-claim verification,
DB state machine, vault writes, rate limiting (5 mints/hour rolling
window). Wired onto UniverseApplication as ``app.byoc_manager`` so TPV
rules can call ``app.byoc_manager.get_active_for(user)`` at job
dispatch time.
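
The "5 mints/hour rolling window" limit can be illustrated with a minimal sketch. This is not the manager's actual code: it assumes an in-memory store and an injectable clock, and ``RollingWindowLimiter`` and ``check_and_record`` are hypothetical names. Only the exception name ``RegistrationRateLimited`` comes from the commit message.

```python
import time
from collections import defaultdict, deque
from collections.abc import Callable


class RegistrationRateLimited(Exception):
    """Raised when a user mints too many bootstrap tokens inside the window."""


class RollingWindowLimiter:
    """Hypothetical helper: allow at most `limit` events per rolling window."""

    def __init__(
        self,
        limit: int = 5,
        window_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        # user id -> timestamps of that user's recent mints, oldest first
        self._events: dict[int, deque[float]] = defaultdict(deque)

    def check_and_record(self, user_id: int) -> None:
        now = self._clock()
        events = self._events[user_id]
        # Drop timestamps that have aged out of the rolling window.
        while events and now - events[0] >= self._window:
            events.popleft()
        if len(events) >= self._limit:
            raise RegistrationRateLimited(
                f"user {user_id} exceeded {self._limit} mints per window"
            )
        events.append(now)
```

Injecting the clock keeps the window testable without sleeping; a database-backed variant would count recent token rows instead of keeping timestamps in memory.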
The HttpRelayClient adapter (lib/galaxy/managers/pulsar_byoc_relay.py)
isolates all relay HTTP traffic — refresh-token exchange, /auth/me, and
the three-topic ACL-pinning dance with create-or-verify-ownership on
conflict. Defined as a RelayClient Protocol + a default factory so the
manager can be unit-tested against an in-memory fake without monkey-
patching ``requests``.
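
A rough sketch of the Protocol-plus-fake arrangement described above, under stated assumptions: the method names ``whoami`` and the ``pin_topics`` helper are hypothetical (``create_or_verify_topic`` is named in a later commit message, but its real signature is not shown here), and the topic names are placeholders.

```python
from typing import Protocol


class TopicOwnedByAnotherUser(Exception):
    """Raised when a topic already exists under a different owner."""


class RelayClient(Protocol):
    """Hypothetical shape of the relay interface the manager depends on."""

    def whoami(self, access_token: str) -> str: ...

    def create_or_verify_topic(self, topic: str, owner: str) -> None: ...


class InMemoryRelayClient:
    """In-memory fake satisfying RelayClient structurally; no HTTP involved."""

    def __init__(self, subjects: dict[str, str]) -> None:
        self._subjects = subjects  # access token -> sub claim
        self.topic_owners: dict[str, str] = {}

    def whoami(self, access_token: str) -> str:
        return self._subjects[access_token]

    def create_or_verify_topic(self, topic: str, owner: str) -> None:
        # Create on first sight; on conflict, verify the existing owner.
        existing = self.topic_owners.setdefault(topic, owner)
        if existing != owner:
            raise TopicOwnedByAnotherUser(topic)


def pin_topics(client: RelayClient, manager_name: str, topics: list[str]) -> None:
    """Pin every topic to the manager; idempotent for the same owner."""
    for topic in topics:
        client.create_or_verify_topic(topic, manager_name)
```

Because the fake only has to match the Protocol's shape, tests can assert on ``topic_owners`` directly instead of inspecting mocked HTTP calls.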
Manager methods raise domain exceptions (BootstrapTokenInvalid,
BootstrapTokenExpired, RelayVerificationFailed, RegistrationRateLimited,
ResourceHasRunningJobs) translated to HTTP at the service boundary in a
follow-up commit.
Topic-pinning tests are contract tests of HttpRelayClient against
``responses``-stubbed relay endpoints — replacing the previous
suite that monkey-patched ``requests.{get,post}`` on a manager instance
built via ``object.__new__``.
One statically-registered runner serves every BYOC resource. At startup it holds no relay credentials and no client manager; those are materialised lazily, keyed by ``(relay_url, manager_name)`` and guarded by an RLock, when a job's destination params point at a user's BYOC resource. The lazy path reads the per-user relay refresh token from the vault, builds a Pulsar ClientManager via an injected factory (defaults to ``pulsar.client.build_client_manager``), and wires a rotation callback so refreshed tokens persist back into the vault.
Recovery fails any in-flight job cleanly if its BYOC row has been deleted while the job was running.
Renames the base class's ``__async_update`` to ``_async_update`` so the subclass can reference it without name-mangling; no behaviour change for existing runners.
Tests inject a recording FakeClientManagerFactory and verify state on the resulting fakes (status-callback wiring, shutdown count) instead of asserting on ``MagicMock.assert_called_once`` interactions.
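
The lazy, lock-guarded materialisation and the state-based fake can be sketched as follows. This is an illustration only: ``LazyClientManagerRegistry``, ``FakeClientManager`` and the keyword arguments passed to the factory are assumptions, not the runner's real interface.

```python
import threading
from collections.abc import Callable
from typing import Any


class LazyClientManagerRegistry:
    """Hypothetical registry: one client manager per (relay_url, manager_name)."""

    def __init__(self, factory: Callable[..., Any]) -> None:
        self._factory = factory
        self._lock = threading.RLock()
        self._managers: dict[tuple[str, str], Any] = {}

    def get(self, relay_url: str, manager_name: str, **factory_kwargs: Any) -> Any:
        key = (relay_url, manager_name)
        with self._lock:
            manager = self._managers.get(key)
            if manager is None:
                # First job for this resource: build the manager now.
                manager = self._factory(
                    relay_url=relay_url, manager_name=manager_name, **factory_kwargs
                )
                self._managers[key] = manager
            return manager

    def shutdown(self) -> None:
        with self._lock:
            managers, self._managers = list(self._managers.values()), {}
        for manager in managers:
            manager.shutdown()


class FakeClientManager:
    """Records how it was built and how often it was shut down."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs
        self.shutdown_count = 0

    def shutdown(self) -> None:
        self.shutdown_count += 1


class FakeClientManagerFactory:
    """Recording factory: keeps every manager it creates for later inspection."""

    def __init__(self) -> None:
        self.created: list[FakeClientManager] = []

    def __call__(self, **kwargs: Any) -> FakeClientManager:
        manager = FakeClientManager(**kwargs)
        self.created.append(manager)
        return manager
```

A test can then check ``factory.created`` and each fake's ``shutdown_count`` after ``registry.shutdown()``, which is the state-over-interaction style the commit message describes.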
Six endpoints under /api/pulsar_byoc, all gated behind a
``Depends(_require_enabled)`` dependency that 404s when
``enable_pulsar_byoc`` is off:
POST /api/pulsar_byoc/registration — mint bootstrap ticket
POST /api/pulsar_byoc/bootstrap — host-side callback
GET /api/pulsar_byoc — list user's resources
GET /api/pulsar_byoc/{id} — resource detail
DELETE /api/pulsar_byoc/{id} — soft delete (disable)
POST /api/pulsar_byoc/{id}/purge — hard delete (drain + vault)
PulsarByocService owns the API-shaped glue: model→Pydantic mapping, the
``one_liner`` command-string formatting, config reads, and domain-
exception → HTTPException translation. The controller is pure
delegation. Pydantic schemas in lib/galaxy/schema/pulsar_byoc.py.
Routes are auto-registered via ``include_all_package_routers`` in
fast_app.py — no buildapp.py wiring needed.
A single admin-shipped TPV destination ``pulsar_byoc`` whose ``params`` are filled in at job dispatch time by a rule consulting ``app.byoc_manager.get_active_for(user)``. No per-BYOC YAML regeneration is required — the rule's f-string expressions resolve against the live manager when a job is mapped. The unit test exercises the full TPV→manager→rule path with a stub manager: a user with an active resource routes to ``pulsar_byoc`` with the correct params injected; users without one fall back to the default destination.
Brings up Keycloak (compose) + pulsar-relay (subprocess from local checkout) + Pulsar (subprocess from local checkout) + Galaxy (in-process via IntegrationTestCase). Skips cleanly when Docker isn't reachable or the pulsar-relay / pulsar source trees can't be located; override the search path with ``PULSAR_RELAY_REPO`` / ``PULSAR_REPO``.
Two test files under the e2e marker:
* test_byoc_e2e.py drives the full device-flow with pair=true against Keycloak, exercises HttpRelayClient.pin_topics_for_manager against the live relay, and verifies the two refresh tokens rotate on independent chains. Includes a cross-user defence test confirming the bootstrap admin can't seize a BYOC user's topics.
* test_byoc_tool_execution.py is the heavyweight sibling: spawns a real Pulsar daemon against the relay using the device-flow primary refresh token, then submits a framework tool through Galaxy via TPV → pulsar_byoc → relay → Pulsar and asserts the tool completes ok with destination_params carrying the right resource id + manager name.
RUNBOOK.md documents how a Docker-enabled session can run the full suite, what passes look like, and how to triage failures. README.md covers the simpler "what's in here" question.
The ``e2e`` pytest marker is added to pytest.ini.
Adds RelayCapabilitiesCache + PulsarByocManager.capabilities_for, and hooks _apply_capability_downgrades into PulsarMQBYOCJobRunner so that when Galaxy builds a job for a BYOC pulsar destination it fetches the remote's capability snapshot from the relay (cached in-memory, 60s TTL) and adjusts the destination params to match what the remote can actually do:
- Auto-fills jobs_directory from the snapshot's staging_directory when unset (or set to the destination_default sentinel). With BYOC there is no shared FS between Galaxy and pulsar, so these MUST agree; warns loudly on operator-supplied mismatches.
- Clears docker_enabled / singularity_enabled / apptainer_enabled individually if the binary isn't on the remote's PATH.
- Clears remote_container_handling if no container runtime is present.
- Demotes dependency_resolution=remote to 'none' (NOT 'local') if the remote reports no conda; 'local' is meaningless without a shared FS.
Downgrades are clear-only for boolean features; jobs_directory is the sole auto-fill (it's pure data-supply, not a downgrade).
Pulsar publishes the snapshot once on startup to a per-pulsar relay topic (pulsar-side commit cae2241+006958a+2db26b7+c822898 on master); this PR consumes it through HttpRelayClient.fetch_messages added in pulsar-relay-client 0.2.2.
Blocked on pulsar-relay-client 0.2.2 release (galaxyproject/pulsar-relay#10) for the SDK pin bump.
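
The downgrade rules listed above can be sketched as a pure function over the destination params. The snapshot schema is not shown in this PR, so the field names ``container_runtimes`` and ``has_conda``, the ``UNSET_SENTINEL`` placeholder, and the function name are assumptions made for illustration; only ``staging_directory`` and the param names come from the commit message.

```python
import logging
from typing import Any, Optional

log = logging.getLogger(__name__)

# Hypothetical placeholder for whatever "unset" sentinel the destination uses.
UNSET_SENTINEL = "__unset__"

# Destination flag -> binary expected on the remote's PATH.
CONTAINER_FLAGS = {
    "docker_enabled": "docker",
    "singularity_enabled": "singularity",
    "apptainer_enabled": "apptainer",
}


def apply_capability_downgrades(
    params: dict[str, Any], snapshot: Optional[dict[str, Any]]
) -> None:
    """Mutate `params` in place so it matches what the remote reports."""
    if snapshot is None:
        return  # capabilities are advisory: leave operator params untouched

    # jobs_directory is the only value that is ever filled in.
    staging = snapshot.get("staging_directory")
    current = params.get("jobs_directory")
    if staging:
        if current is None or current == UNSET_SENTINEL:
            params["jobs_directory"] = staging
        elif current != staging:
            log.warning(
                "jobs_directory %r does not match remote staging_directory %r",
                current,
                staging,
            )

    # Boolean features are clear-only: never enable what was not requested.
    runtimes = set(snapshot.get("container_runtimes", ()))
    for flag, binary in CONTAINER_FLAGS.items():
        if params.get(flag) and binary not in runtimes:
            params.pop(flag)
    if not runtimes:
        params.pop("remote_container_handling", None)

    # Demote to 'none', not 'local': there is no shared filesystem.
    if params.get("dependency_resolution") == "remote" and not snapshot.get(
        "has_conda", False
    ):
        params["dependency_resolution"] = "none"
```

Keeping the logic free of runner state makes each rule checkable with plain dict fixtures.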
The previous pin matched no published version (PyPI has 0.2.x, not 1.x), so every Galaxy CI run on this branch failed at install time with ModuleNotFoundError on pulsar_relay_client. Replaces with a pin that resolves to a real release — 0.2.2 because that's the version adding HttpRelayClient.fetch_messages, which this PR depends on (galaxyproject/pulsar-relay#10). CI on this branch will stay red until 0.2.2 publishes to PyPI; after that it should go green without further changes here.
CI was still failing with ModuleNotFoundError on pulsar_relay_client because Galaxy's CI installs from pinned-requirements.txt (the lockfile), not from pyproject.toml directly — and the BYOC base branch never added the pin to the lock. Add pulsar-relay-client==0.2.2 (now published) right next to pulsar-galaxy-lib so it gets installed in the test env. Also fixes ruff UP035: ``Callable`` should come from collections.abc, not typing — Galaxy's ruff config flags this even on 3.10+ where the typing version is still valid.
Runs the three Galaxy CI gates locally and addresses each:
- ``tox -e mypy``:
* test_pulsar_capabilities_cache: loosen ``_msg`` payload to ``Any``
(the test deliberately passes a non-dict to verify the validator
rejects it); guard ``out["v"]`` access with a None check.
* test_pulsar_byoc_runner: explicitly type the ``params`` dicts and
the ``**kwargs`` snapshot-builder unpacks as ``dict[str, Any]`` so
mypy can satisfy the heterogeneous value types.
* runners/pulsar.py: ``# type: ignore[attr-defined]`` on
``self.app.byoc_manager`` — the runner is typed against the
broader ``GalaxyManagerApplication`` but the BYOC manager is set
on ``UniverseApplication``. The runner only ever dispatches
inside a UniverseApplication so the attribute is present.
* Drive-by: rename a shadowed ``client`` variable in the existing
``test_byoc_e2e.py`` to ``http_client`` so the with-block doesn't
rebind the outer ``HttpRelayClient`` name to ``httpx.Client``.
- ``tox -e lint``: already clean after the earlier ruff Callable fix.
- ``make update-client-api-schema``: the base BYOC PR added
``/api/pulsar_byoc/...`` endpoints to FastAPI but never regenerated
``client/src/api/schema/schema.ts``. Regenerated via
``openapi-typescript`` + Galaxy's prettier config; the diff is +453
lines, all of which describe the BYOC endpoints.
Galaxy's CI runs format as a separate gate from lint. Apply isort and black to bring the new BYOC capability files (and a few pre-existing BYOC files isort had been quietly unhappy with) up to the project's formatting standard.
The packages/app pinned-requirements.txt is a symlink back to lib/galaxy/dependencies/, so the lib-side pin already covers it. But when packages are tested in isolation (Test Galaxy packages), pip resolves deps from setup.cfg's install_requires — and that listed only pulsar-galaxy-lib, not pulsar-relay-client. Add it next to the existing pulsar dep so the isolated package install pulls in the relay client.
The Test Galaxy packages CI shard installs galaxy-app in isolation (only the install_requires from setup.cfg). tpv is a dev/test dep, not a runtime dep, so the package-isolated install doesn't have it — and the existing test_pulsar_byoc_tpv_integration imports tpv at module level, which crashes pytest collection. importorskip skips the whole module when tpv is missing, restoring collection without changing behavior in environments where tpv IS installed (lib/dev install).
The runner block had no uncommented fields, so YAML parsed
runners.pulsar_byoc as None — failing schema validation
('Value None is not a dict at /runners/pulsar_byoc') and breaking
13 packages-isolation tests in test_job_configuration.py.
…sing
The 2 BYOC e2e setup errors weren't 'docker not running' — Docker is
available on CI. The actual error in the relay_against_keycloak fixture
was ModuleNotFoundError on pulsar_relay because the *server* package
(distinct from the pulsar-relay-client SDK already pinned) wasn't
installed. The fixture launches it under uvicorn:
python -m uvicorn pulsar_relay.main:app
Add pulsar-relay>=0.2.0 to the test dependency-group so CI installs
both halves of the relay (server + client) and the e2e suite can boot
the server in-process.
Also add a module-level pytest.skip in the BYOC conftest gated on
importlib.util.find_spec('pulsar_relay') — for local dev runs where
a tester might not have the server package installed, the suite now
skips cleanly instead of erroring at fixture setup.
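
As a rough illustration of the ``find_spec`` gate described above (a sketch, not the conftest's actual contents; ``module_available`` is a hypothetical helper name):

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` is importable.

    find_spec only locates the module; it does not execute it.
    """
    return importlib.util.find_spec(name) is not None


# Inside the suite's conftest.py the gate would look roughly like this
# (left as a comment so the sketch has no pytest dependency):
#
#     import pytest
#
#     if not module_available("pulsar_relay"):
#         pytest.skip(
#             "pulsar-relay server package not installed",
#             allow_module_level=True,
#         )
```

Checking with ``find_spec`` rather than a bare ``import`` avoids paying the server package's import cost just to decide whether to skip.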
…service / API / tests
Renames the user-facing surface from "Pulsar BYOC" (an implementation
detail and project codename) to "compute resources" — the industry-standard
term used by Cromwell, Nextflow, Kubernetes, AWS, etc. The Pulsar daemon
remains the under-the-hood backend; the surface presented to users and
operators is neutral about it.
Scope:
* DB tables: pulsar_byoc_resource → compute_resource, pulsar_byoc_bootstrap_token
→ compute_resource_registration. Migration filename + content updated;
feature has not shipped, so the original migration is just renamed in place.
* ORM models: PulsarByocResource → ComputeResource, PulsarByocBootstrapToken
→ ComputeResourceRegistration.
* Manager module + class: lib/galaxy/managers/pulsar_byoc.py →
lib/galaxy/managers/compute_resources.py;
PulsarByocManager → ComputeResourceManager. Exposed as
app.compute_resource_manager (was app.byoc_manager) and declared on
MinimalManagerApp. Exceptions renamed: BootstrapTokenInvalid →
RegistrationTokenInvalid, BootstrapTokenExpired →
RegistrationTokenExpired, PulsarByocError → ComputeResourceError.
* Service module + class: pulsar_byoc.py → compute_resources.py;
PulsarByocService → ComputeResourceService.
* API: lib/galaxy/webapps/galaxy/api/pulsar_byoc.py → compute_resources.py;
router tags=["compute_resources"]; routes:
POST /api/compute_resources/registrations
POST /api/compute_resources/registrations/complete
GET /api/compute_resources
GET /api/compute_resources/{id}
DELETE /api/compute_resources/{id}
POST /api/compute_resources/{id}/purge
* Schema module: pulsar_byoc.py → compute_resources.py;
PulsarByocResourceSummary → ComputeResourceSummary;
BootstrapPayload → RegistrationCompletionPayload.
* Config options: enable_pulsar_byoc → enable_compute_resources;
pulsar_byoc_relay_url → compute_resource_relay_url.
* TPV destination param: pulsar_byoc_resource_id → compute_resource_id.
Sample tpv file renamed: tpv/byoc.yml.sample → tpv/compute_resources.yml.sample.
* Job runner: PulsarMQBYOCJobRunner class name preserved (it IS a
Pulsar MQ runner using the BYOC pattern); registered runner id is
now `compute_resource`. BYOCClientManagerRegistry →
ComputeResourceClientManagerRegistry.
* Vault path: pulsar_byoc/{id}/relay_refresh_token →
compute_resource/{id}/relay_refresh_token.
* Integration test dir + filenames: test/integration/pulsar_byoc/ →
test/integration/compute_resources/; test_byoc_* → test_compute_resource_*;
template files dropped the byoc_ prefix.
* Client TS schema regenerated via openapi-typescript.
Verified: tox -e lint, -e format, -e mypy all green; 90 BYOC-related
unit tests pass.
pulsar-relay 0.2.0 transitively required starlette<1.0.0 via prometheus-fastapi-instrumentator (every available version of that package pins starlette<1.0.0). Galaxy pins starlette==1.0.0 in pinned-requirements.txt, so uv could not resolve the test dep group on any CI shard — every Python test job died at install time with ``× No solution found when resolving dependencies``. pulsar-relay 0.2.1 makes prometheus-fastapi-instrumentator an optional dependency under a new ``[metrics]`` extra and guards the ``Instrumentator().instrument().expose()`` call in main.py with a try/except ImportError. Galaxy doesn't need the auto-exposed ``/metrics`` route — the relay's prometheus_client-based counters (used by the API code paths) still record without it — so installing plain ``pulsar-relay>=0.2.1`` is enough and lets the resolver succeed against starlette==1.0.0.
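
The optional-dependency guard described above follows a common pattern, sketched here under assumptions: ``optional_import`` and ``install_metrics_route`` are hypothetical names, and the real guard in pulsar-relay's main.py may be structured differently.

```python
import importlib
from types import ModuleType
from typing import Any, Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Import `name` if it is installed, else return None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def install_metrics_route(
    app: Any, module_name: str = "prometheus_fastapi_instrumentator"
) -> bool:
    """Expose the metrics route only when the optional extra is present.

    Returns True if instrumentation was installed, False if it was skipped.
    """
    module = optional_import(module_name)
    if module is None:
        return False  # extra not installed: the app runs without /metrics
    module.Instrumentator().instrument(app).expose(app)
    return True
```

With the guard in place, a plain install without the ``[metrics]`` extra starts normally and simply lacks the auto-exposed route.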
…runner
Extended metadata has pulsar write the post-job model store on the remote host, which Galaxy then collects from the destination's staging directory. The compute-resource (multi-tenant pulsar) runner has no shared filesystem between Galaxy and the user's pulsar, so staged outputs can't be read back via path-based access regardless of operator config, and the remote pulsar typically doesn't ship Galaxy's metadata writer either. The job would die opaquely on the remote.
Refuse at submit time inside ``get_client_from_wrapper`` so the operator gets a clear, actionable error before any job-prep work happens. ``recover`` / ``stop`` aren't gated; those operate on already-submitted jobs that necessarily passed the check.
Three new unit tests in test_compute_resource_runner.py cover the positive case (extended → raise) and the negative parametric case (directory / legacy / None all pass through).
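
The submit-time refusal amounts to a small guard, sketched below. The exception class and function name are hypothetical; the commit message only states that ``extended`` raises while ``directory``, ``legacy`` and ``None`` pass through.

```python
from typing import Optional


class UnsupportedMetadataStrategy(Exception):
    """Hypothetical error type for a strategy the runner cannot support."""


def ensure_metadata_strategy_supported(metadata_strategy: Optional[str]) -> None:
    """Reject extended metadata before any job-prep work is done."""
    if metadata_strategy == "extended":
        raise UnsupportedMetadataStrategy(
            "metadata_strategy 'extended' is not supported on compute-resource "
            "destinations: Galaxy and the remote pulsar share no filesystem"
        )
```

A usage example mirroring the unit tests described above:

```python
for strategy in ("directory", "legacy", None):
    ensure_metadata_strategy_supported(strategy)  # passes through silently
```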
…ient device flow
Replaces two test-internal shortcuts with the production code paths they were standing in for:
* The heavy e2e test was manually calling ``model.ComputeResource(...)`` + ``UserVaultWrapper.write_secret(...)`` to wire up the resource. The Galaxy-side ``ComputeResourceManager.complete_registration`` (token-exchange → sub-claim validation → topic pinning → DB insert → vault write) was therefore entirely uncovered by integration tests. Drop the shortcut and instead drive Galaxy's real bootstrap endpoints (``POST /api/compute_resources/registrations`` and ``/registrations/complete``) from ``setUp``. That's the same path ``pulsar-config register-with-galaxy`` exercises on the host side.
* The RFC 8628 device-flow polling loop was reimplemented in ``_device_flow.py`` even though ``pulsar_relay_client`` ships ``RelayDeviceFlowAuthenticator`` for exactly this purpose, with an ``on_user_code`` hook designed for "tests / alternative UIs". Use the library helper directly; pass the Keycloak operator-login as the ``on_user_code`` callback. Removes ~130 lines of duplicated RFC-8628-spec implementation.
``_pre_create_topics`` stays in ``_prepare_galaxy`` to keep Pulsar's long-poll subscriptions from blocking before Galaxy boots, but ``complete_registration``'s own ``create_or_verify_topic`` loop is still exercised in ``setUp`` (the duplicate calls are idempotent under same-owner topic ownership).
The harness was rolling its own ports, docker-availability check, tempdir, compose orchestration, and lifecycle hooks. Each of those has a shared counterpart used elsewhere in the suite:
* ``_free_port`` → ``galaxy.util.sockets.unused_port``
* ``_docker_running`` + ``_compose_cmd`` → ``@integration_util.skip_unless_docker()`` class decorator (same gating ``test_auth_oidc.py`` uses)
* ``COMPUTE_RESOURCE_E2E_TMP`` env-var override + ad-hoc ``tempfile.mkdtemp`` → ``cls._test_driver.galaxy_test_tmp_dir`` (per-class, auto-cleaned by ``IntegrationTestCase.tearDownClass`` via ``cleanup_directory``)
* Pointless empty ``_configure_app`` override → drop
* Docker-compose-based Keycloak bring-up → ``docker run`` mirroring ``test/integration/oidc/test_auth_oidc.py:start_keycloak_docker``. The YAML+compose-shim machinery (``_compose_cmd`` variant detection, ``teardown_compose``, ``docker-compose.yml``, the ``compose_env`` field on ``KeycloakHandle``) is gone.
Net -73 lines across the three touched files.
OIDC and compute_resources both stood up Keycloak via ``docker run`` with near-identical boilerplate but had drifted on image (``keycloak/keycloak:26.2`` vs ``quay.io/keycloak/keycloak:26.0``), env-var style (``KC_BOOTSTRAP_ADMIN_*`` vs deprecated ``KEYCLOAK_ADMIN*``), and ready-probe (``/.well-known/...`` vs ``/realms/master``).
Lift the shared bits into a new module with two start flavours that share image + ready-probe + teardown:
* ``start_keycloak_https_with_realm``: production mode, TLS, realm imported from a mounted directory. Used by ``test_auth_oidc.py``.
* ``start_keycloak_http_dev``: dev mode, HTTP, no realm import; the consuming suite provisions the realm dynamically via the admin API. Used by the compute_resources harness.
Both call ``wait_till_keycloak_ready`` (``/realms/master``, optional cert verification). Teardown is one ``stop_keycloak_docker`` for both.
A bump of ``KEYCLOAK_IMAGE`` now touches both suites in one place; both move onto ``quay.io/keycloak/keycloak:26.2`` and the modern bootstrap env vars.
Lets CI exercise the pulsar-side compute_resources URL rename (PR galaxyproject#459) against this branch without cutting a pulsar release. Two pieces:
* requirements.txt (pinned-requirements.txt): pin pulsar-galaxy-lib at the PR branch via a git URL instead of ==0.15.14.
* common_startup.sh: export PULSAR_GALAXY_LIB=1 before the dependency install. pulsar's setup.py names the dist "pulsar-app" by default and only "pulsar-galaxy-lib" when this env var is set; without it the git source build produces a dist whose name doesn't match the requirement and the install fails ("Package metadata name `pulsar-app` does not match given name `pulsar-galaxy-lib`"). Harmless for the normal released-wheel path, since the env var only affects building from source.
REVERT before merge (restore the ==<version> pin once galaxyproject#459 is released).
Summary
When ``PulsarMQBYOCJobRunner`` builds a job for a BYOC pulsar destination, it now fetches the remote pulsar's capability snapshot from the relay (cached in-memory, 60s TTL) and adjusts the destination params to reflect what the remote can actually do:
- ``jobs_directory`` auto-fill from the snapshot's ``staging_directory`` when unset (or set to the ``PARAMETER_SPECIFICATION_REQUIRED`` sentinel). With BYOC there is no shared FS between Galaxy and pulsar, so these must agree for path rewrites to work; operator-supplied mismatches log a loud warning.
- Clears ``docker_enabled`` / ``singularity_enabled`` / ``apptainer_enabled`` individually if the binary isn't on the remote's PATH; clears ``remote_container_handling`` if no container runtime at all.
- Demotes ``dependency_resolution=remote`` to ``'none'`` (NOT ``'local'``) when the remote reports no conda. ``'local'`` would just shift the failure mode without fixing it because BYOC has no shared FS.
Downgrades are clear-only for boolean features (never auto-enables what TPV didn't ask for); ``jobs_directory`` is the sole auto-fill since it's pure data-supply.
Why
Pulsar publishes a static capability snapshot once on startup to a per-pulsar relay topic (pulsar-side commit shipped on master). Galaxy currently dispatches BYOC jobs blindly: if the operator's TPV rule sets ``docker_enabled=true`` but the remote pulsar has no docker installed, jobs fail at runtime with an opaque error. This PR turns that into a clean warning + graceful fallback.
Architecture
- ``RelayCapabilitiesCache`` (in-memory TTL, no DB) keyed by ``(relay_url, manager_name)``.
- ``PulsarByocManager.capabilities_for(resource, *, user)``: wraps the cache, doing a refresh-token exchange + ``HttpRelayClient.fetch_messages`` only on cache miss; rotated tokens are vaulted back to the same key the runner reads from.
- ``PulsarMQBYOCJobRunner._apply_capability_downgrades(params, user)``: called from ``get_client_from_wrapper`` before super, mutates ``destination_params`` in place. The static ``__remote_container_handling`` / ``__dependency_resolution`` helpers then read the downgraded values when ``__prepare_job`` runs.
- ``capabilities_for`` returning ``None`` (older pulsar version, network glitch, schema mismatch, missing vault token) is treated as "trust operator params verbatim"; capabilities are advisory.
Dependencies
Blocked on ``pulsar-relay-client`` 0.2.2 release (galaxyproject/pulsar-relay#10) which adds ``HttpRelayClient.fetch_messages``. The Galaxy ``pyproject.toml`` pin (``pulsar-relay-client>=1.0,<2``) is a pre-existing placeholder unrelated to this PR; it should be tightened to ``>=0.2.2`` whenever the BYOC branch fixes its pin to match reality.
Test plan
- ``test/unit/app/managers/test_pulsar_capabilities_cache.py`` (new, 19 tests): cache TTL semantics, payload validator, schema-version filter, topic-name convention.
- ``test/unit/app/managers/test_PulsarByocManager.py`` (extended, +7 tests): ``capabilities_for`` happy path, anonymous, no-vault, empty topic, unknown schema, TTL caching, refresh-token rotation persistence.
- ``test/unit/app/jobs/test_pulsar_byoc_runner.py`` (extended, +18 tests): ``_apply_capability_downgrades`` covering every container-runtime path, dependency-resolution paths, ``jobs_directory`` auto-fill / mismatch / sentinel cases, no-op on absent resource id / null snapshot.
- ``pulsar-relay-client>=0.2.2`` is published.
Optional follow-up (not in this PR): extend ``test/integration/pulsar_byoc/test_byoc_tool_execution.py`` to exercise the path-rewrite-via-``jobs_directory`` end-to-end against a real pulsar.
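
The in-memory TTL cache described under Architecture can be sketched as below. This is not the real ``RelayCapabilitiesCache``: the class name, the ``get_or_fetch`` method and the choice not to cache ``None`` results are assumptions made for illustration. Only the 60s TTL and the ``(relay_url, manager_name)`` key come from the description.

```python
import time
from collections.abc import Callable
from typing import Any, Optional


class CapabilitiesTTLCache:
    """Hypothetical in-memory TTL cache keyed by (relay_url, manager_name)."""

    def __init__(
        self,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (time stored, snapshot)
        self._entries: dict[tuple[str, str], tuple[float, dict[str, Any]]] = {}

    def get_or_fetch(
        self,
        relay_url: str,
        manager_name: str,
        fetch: Callable[[], Optional[dict[str, Any]]],
    ) -> Optional[dict[str, Any]]:
        key = (relay_url, manager_name)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no relay round-trip
        snapshot = fetch()
        # Assumption: failed fetches (None) are not cached, so the next
        # job retries instead of waiting out the TTL.
        if snapshot is not None:
            self._entries[key] = (now, snapshot)
        return snapshot
```

Passing the fetch step as a callable keeps the token exchange and relay call out of the cache itself, so the TTL semantics can be tested with a fake clock.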