Feat/gpu augmentations #15

Open
etienne87 wants to merge 4 commits into MIC-DKFZ:master from etienne87:feat/gpu_augmentations
Conversation

@etienne87

In an effort toward GPU augmentation (see issue), I fixed a few transforms to take the device of the image argument into account.
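The dispatch idea can be sketched roughly as follows. This is a hypothetical illustration, not the actual batchgenerators API: `mirror_transform` is an invented name, numpy arrays keep the existing CPU path, and torch tensors are processed on whatever device they already live on.

```python
import numpy as np


def mirror_transform(img, axis=-1):
    """Mirror an image along one axis, staying on the input's device.

    Hypothetical sketch: numpy input takes the unchanged CPU/numpy path;
    a torch.Tensor is flipped via torch.flip, which runs on img.device,
    so a CUDA tensor is augmented on the GPU without a host round-trip.
    """
    if isinstance(img, np.ndarray):
        return np.flip(img, axis=axis)  # existing CPU path, untouched
    import torch  # only needed for the tensor branch
    return torch.flip(img, dims=(axis,))
```

The key point is that nothing is hard-coded to CPU: the torch branch never calls `.cpu()` or `.numpy()`, so the transform is device-agnostic by construction.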

@FabianIsensee
Member

Thanks for the PR. I am wondering a bit when you might want to use GPU augmentation. We have so far not found it useful and prefer using the GPU for the actual training workload.
Changing just a couple of transforms to be GPU compatible might be of limited usefulness. Are you planning to extend this work to cover all transforms? And how would you then treat transforms that have dedicated numpy/scipy code paths? Those turned out to be faster in my CPU-focused testing.

@etienne87
Author

etienne87 commented Mar 9, 2026

Hello Fabian, thanks for the feedback!

GPU augmentation can be useful when server CPU resources are limited and volumes are large — the throughput gain depends heavily on the hardware setup. In my own testing, when the server is lightly loaded, CPU augmentation is equivalent or faster. The benefit appears when the server is under load — GPU augmentation remains fast while CPU throughput degrades due to contention.
Some users also want to do data augmentation on the fly once the batch is already loaded on the GPU, for instance when training for equivariance.

I focused initially on the transforms used in the classic 3d_fullres configuration as a test case. Extending to all transforms is a natural next step.

Regarding the numpy/scipy paths — could you point me to specific cases where those outperformed the GPU? For transforms like grid_sample, the CPU PyTorch path is serial across voxels (see GridSample.cpp).
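To make the grid_sample point concrete, here is a minimal sketch (not code from the PR) of a resampling step built from `affine_grid` + `grid_sample`. The sampling grid is created on `img.device`, so the same function runs entirely on the GPU when given a CUDA tensor; spatial transforms like rotation or scaling would modify `theta`.

```python
import torch
import torch.nn.functional as F


def resample_identity(img):
    """Resample a (N, C, H, W) batch through an identity sampling grid.

    Sketch only: the grid is allocated on img.device, so no CPU<->GPU
    transfer happens. A real spatial augmentation would perturb theta
    (rotation, scaling, shearing) instead of using the identity.
    """
    n, c, h, w = img.shape
    # identity affine matrix, one per batch element, on the input's device
    theta = torch.eye(2, 3, device=img.device, dtype=img.dtype).expand(n, 2, 3)
    grid = F.affine_grid(theta, size=(n, c, h, w), align_corners=False)
    return F.grid_sample(img, grid, align_corners=False)
```

With the identity grid and matching `align_corners` settings, bilinear sampling lands exactly on pixel centers, so the output reproduces the input; the same call parallelizes across voxels on the GPU, whereas the CPU kernel loops over them.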

@FabianIsensee
Member

Hey etienne, when I mentioned that scipy/numpy paths are faster, that was in the context of CPU data augmentation, not GPU. I agree that GPU augmentation is always faster, but I have yet to see a convincing application of it. In my understanding, it is better to configure servers properly and make adjustments to the data augmentation pipeline (such as switching to nearest-neighbor resampling for segmentations) where needed, rather than spending precious GPU time on data augmentation. CPU is cheap in comparison.
If we allow for GPU augmentation in batchgenerators (and I don't see why we shouldn't make this possible), then I would like to make sure we are not changing any of the CPU compute paths and keep everything backwards compatible. The current PR also has a bug, see my comments above.
