Regression: Pod volume backup hosting pods don't inherit arbitrary labels from node-agent (v1.17.0) #9435

@testsabirweb

What steps did you take and what happened:

  1. Deployed Velero v1.17.0 in a Kubernetes cluster that requires a proxy to reach external services (AWS S3)
  2. Configured node-agent DaemonSet with label spectrocloud.com/connection: proxy (required by a mutating webhook that injects proxy environment variables)
  3. Created a backup that includes pod volumes
  4. Pod volume backup hosting pods are created without the spectrocloud.com/connection: proxy label
  5. Mutating webhook doesn't inject proxy environment variables because the label is missing
  6. Backup operations fail with dial tcp [IP]:443: i/o timeout because hosting pods cannot reach S3 through the proxy
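
The failure in steps 4 and 5 comes down to label-selector matching. The following is a minimal sketch, assuming the mutating webhook selects pods with a simple equality match on the spectrocloud.com/connection label. The webhook's real configuration is not shown in this report, and matchesSelector and the example.com/role label are hypothetical names used only for illustration.

```go
package main

import "fmt"

// matchesSelector reports whether every key/value pair in selector is also
// present in podLabels (equality-based matching, as with matchLabels).
// Hypothetical helper, not part of Velero or the webhook.
func matchesSelector(selector, podLabels map[string]string) bool {
	for key, want := range selector {
		if got, ok := podLabels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// Assumed selector used by the proxy-injecting webhook.
	selector := map[string]string{"spectrocloud.com/connection": "proxy"}

	// Node-agent pods carry the label configured on the DaemonSet.
	nodeAgentPod := map[string]string{
		"spectrocloud.com/connection": "proxy",
	}
	// Hosting pods in v1.17.0 do not carry it (placeholder label only).
	hostingPod := map[string]string{
		"example.com/role": "hosting-pod",
	}

	fmt.Println("node-agent pod selected:", matchesSelector(selector, nodeAgentPod))
	fmt.Println("hosting pod selected:", matchesSelector(selector, hostingPod))
}
```

Under this assumption the node-agent pod is selected and gets the proxy environment variables, while the hosting pod is skipped and cannot reach S3.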

What did you expect to happen:

Hosting pods created for pod volume backups should inherit all labels from the node-agent DaemonSet (excluding Velero internal labels like velero.io/*), similar to how backups worked in v1.16.2 when they ran directly in node-agent pods.
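
The expected behavior could be sketched roughly as follows. This is an illustration only, not a patch against Velero: inheritLabels, hasAnyPrefix, and the velero.io/example-internal label are hypothetical names, and the exclusion list is an assumption based on the velero.io/* prefix mentioned above.

```go
package main

import (
	"fmt"
	"strings"
)

// hasAnyPrefix reports whether key starts with one of the given prefixes.
func hasAnyPrefix(key string, prefixes []string) bool {
	for _, prefix := range prefixes {
		if strings.HasPrefix(key, prefix) {
			return true
		}
	}
	return false
}

// inheritLabels returns a copy of source without any label whose key starts
// with one of excludedPrefixes. Hypothetical helper, not Velero code.
func inheritLabels(source map[string]string, excludedPrefixes []string) map[string]string {
	inherited := make(map[string]string, len(source))
	for key, value := range source {
		if hasAnyPrefix(key, excludedPrefixes) {
			continue
		}
		inherited[key] = value
	}
	return inherited
}

func main() {
	// Labels assumed to be present on the node-agent pod.
	nodeAgentLabels := map[string]string{
		"spectrocloud.com/connection": "proxy",
		"velero.io/example-internal":  "value", // hypothetical internal label
	}

	hostingPodLabels := inheritLabels(nodeAgentLabels, []string{"velero.io/"})
	fmt.Println(hostingPodLabels)
}
```

With this approach the hosting pod would keep spectrocloud.com/connection and drop the velero.io/ prefixed label, so label-based cluster policies would apply to it as they do to the node-agent pod.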

Root Cause:

In v1.17.0, Velero introduced a micro-service architecture where pod volume backups run in separate hosting pods instead of directly in node-agent pods. The code in pkg/controller/pod_volume_backup_controller.go:798-809 only copies labels from a hardcoded whitelist (util.ThirdPartyLabels), which currently only includes azure.workload.identity/use. Arbitrary labels needed for cluster policies are not inherited.
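
To illustrate why the custom label is lost, here is a simplified sketch of whitelist-based copying as described above. It is not the actual code from pod_volume_backup_controller.go: thirdPartyLabels stands in for util.ThirdPartyLabels, and copyWhitelistedLabels is a hypothetical name.

```go
package main

import "fmt"

// thirdPartyLabels stands in for util.ThirdPartyLabels, which per this report
// currently contains only azure.workload.identity/use.
var thirdPartyLabels = []string{"azure.workload.identity/use"}

// copyWhitelistedLabels copies only the whitelisted keys from source.
// Hypothetical simplification of the behavior described in this issue.
func copyWhitelistedLabels(source map[string]string, whitelist []string) map[string]string {
	copied := make(map[string]string)
	for _, key := range whitelist {
		if value, ok := source[key]; ok {
			copied[key] = value
		}
	}
	return copied
}

func main() {
	nodeAgentLabels := map[string]string{
		"azure.workload.identity/use": "true",
		"spectrocloud.com/connection": "proxy",
	}

	hostingPodLabels := copyWhitelistedLabels(nodeAgentLabels, thirdPartyLabels)

	// Only the whitelisted key survives; the proxy label is dropped.
	_, hasProxyLabel := hostingPodLabels["spectrocloud.com/connection"]
	fmt.Println("copied labels:", len(hostingPodLabels))
	fmt.Println("proxy label present:", hasProxyLabel)
}
```

Any label that is not on the whitelist, including spectrocloud.com/connection, never reaches the hosting pod in this model.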

The same limitation exists in:

  • pkg/controller/pod_volume_restore_controller.go:865-874
  • pkg/controller/data_download_controller.go:862-871
  • pkg/controller/data_upload_controller.go:939+

Code References:

  • Label copying logic: pkg/controller/pod_volume_backup_controller.go:798-809
  • Hardcoded whitelist: pkg/util/third_party.go:19-21

Environment:

  • Velero version: v1.17.0
  • Kubernetes version: v1.32.5
  • Cloud provider: AWS S3 (behind proxy)
  • Issue: Regression from v1.16.2 (worked when backups ran in node-agent pods)

Impact:

This affects any cluster using:

  • Mutating webhooks that match pods by labels
  • Admission controllers that require specific labels
  • Network policies that select pods by labels
  • Any cluster policy that depends on labels from node-agent DaemonSet

Vote on this issue!

This is an invitation to the Velero community to vote on issues. You can see the project's top-voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"
