[Bug]: Helm: gpu-feature-discovery DaemonSet missing volumeMounts for nvidiaDriverRoot + custom env needed #1677

@plevart

Description:
In a pure CDI setup (deviceListStrategy: cdi-cri) that does not use the nvidia-container-runtime wrapper, infrastructure pods such as gpu-feature-discovery (GFD) cannot see the NVIDIA driver libraries, because they do not trigger CDI injection.

The Problem:
The Helm chart provides the nvidiaDriverRoot value to define the host path and creates the matching volumes entry, but it adds the volumeMounts only to the nvidia-device-plugin container. The gpu-feature-discovery container is missing these mounts.

Without the mount, and an LD_PRELOAD that points to the library under the mount path, GFD fails to load libnvidia-ml.so.1 on any system that does not use the legacy runtime wrapper.

Proposed Fix:
Sync the GFD template with the Device Plugin template to include:

volumeMounts: Mount nvidiaDriverRoot to a neutral path (e.g., /driver-root).

env: Add an optional custom env list so that users can set LD_PRELOAD to the needed library (or libraries) inside that mount.
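
For illustration, the GFD pod spec could look roughly like the fragment below once both changes are in place. This is only a sketch of rendered output, not the chart's template source: the volume name `driver-root`, the `hostPath` value, and the library location under the mount are assumptions and will differ between driver installations.

```yaml
# Sketch of the gpu-feature-discovery DaemonSet pod spec (rendered form).
# Assumed: the volume name, the hostPath value, and the library path under the mount.
spec:
  template:
    spec:
      containers:
        - name: gpu-feature-discovery
          env:
            # Would be supplied through the proposed optional env list.
            # The path is an assumption; it depends on the distribution
            # and on how the driver was installed.
            - name: LD_PRELOAD
              value: /driver-root/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
          volumeMounts:
            - name: driver-root        # assumed volume name
              mountPath: /driver-root  # neutral path suggested above
              readOnly: true
      volumes:
        - name: driver-root
          hostPath:
            path: /                    # stands in for the nvidiaDriverRoot value
```

Keeping the env list optional means setups that still use the runtime wrapper are unaffected, while CDI-only setups can opt in by pointing LD_PRELOAD at the mounted library.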


Labels: bug, needs-triage
