Conversation
|
Thanks for the PR. I am wondering a bit when you might want to use GPU augmentation. We have so far not found it useful and prefer using the GPU for the actual training workload. |
|
Hello Fabian, thanks for the feedback! GPU augmentation can be useful when server CPU resources are limited and volumes are large; the throughput gain depends heavily on the hardware setup. In my own testing, when the server is lightly loaded, CPU augmentation is equivalent or faster. The benefit appears when the server is under load: GPU augmentation remains fast, while CPU throughput degrades due to contention. I focused initially on the transforms used in the classic 3d_fullres configuration as a test case; extending to all transforms is a natural next step. Regarding the numpy/scipy paths, could you point me to specific cases where those outperformed the GPU? For transforms like grid_sample, the CPU PyTorch path is serial across voxels (see GridSample.cpp). |
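For illustration, a minimal sketch of the kind of call discussed above. The tensor names and shapes are assumptions, not taken from the PR; the point is only that the same `grid_sample` call dispatches to the CPU kernel or the parallel CUDA kernel depending on where its inputs live:

```python
import torch
import torch.nn.functional as F

# Toy 3D volume in NCDHW layout, as used by 3d_fullres-style pipelines
# (shape chosen arbitrarily for illustration).
image = torch.rand(1, 1, 8, 8, 8)

# Identity affine sampling grid, built on the same device as the image.
theta = torch.eye(3, 4, device=image.device).unsqueeze(0)
grid = F.affine_grid(theta, image.shape, align_corners=False)

# On CPU tensors this dispatches to the kernel in GridSample.cpp;
# with CUDA tensors the same call runs the parallel GPU kernel.
out = F.grid_sample(image, grid, align_corners=False)
if torch.cuda.is_available():
    out = F.grid_sample(image.cuda(), grid.cuda(), align_corners=False)
```

With an identity grid the call resamples the volume onto itself, so it isolates the cost of the sampling kernel rather than any particular augmentation.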
|
Hey Etienne, when I mentioned that the scipy/numpy paths are faster, that was in the context of CPU data augmentation, not GPU. I agree that GPU augmentation is always faster, but I have yet to see a convincing application for it. In my understanding, it is better to configure servers properly and to make adjustments to the data augmentation pipeline (such as switching to nearest-neighbor resampling for segmentation) where needed, rather than spending precious GPU time on data augmentation. CPU is cheap in comparison. |
In an effort toward GPU augmentation (see issue), I fixed a few transforms to take the device of the `image` argument into account.
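The device-aware pattern described above can be sketched as follows. This is a minimal illustration, not the actual transforms touched by the PR; `spatial_transform` and its identity-grid body are hypothetical:

```python
import torch
import torch.nn.functional as F

def spatial_transform(image: torch.Tensor) -> torch.Tensor:
    """Hypothetical transform that creates all auxiliary tensors on
    image.device instead of defaulting to CPU, so the whole pipeline
    stays on the GPU when the input is a CUDA tensor."""
    n = image.shape[0]
    # Identity affine matrix, built directly on the input's device.
    theta = torch.eye(3, 4, device=image.device).unsqueeze(0).repeat(n, 1, 1)
    grid = F.affine_grid(theta, image.shape, align_corners=False)
    return F.grid_sample(image, grid, align_corners=False)
```

The fix amounts to threading `device=image.device` (or `.to(image.device)`) through every tensor the transform allocates, so no hidden CPU round-trip occurs mid-pipeline.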