I was building a Docker image for obsidian and noticed that the image is quite large. Most of the size comes from this directory:
root@50f3333171b2:/usr/local/lib/python3.10/site-packages/nvidia# du -d 1 -h
61M ./cuda_nvrtc
253M ./cusparse
95M ./curand
1.2G ./cudnn
595M ./cublas
416K ./nvtx
186M ./cusolver
222M ./nccl
186M ./cufft
4.1M ./cuda_runtime
92M ./nvjitlink
44M ./cuda_cupti
2.8G .
As you can see, the entire NVIDIA stack was pulled in, 2.8G in total. The reason is that on Linux the default PyTorch build is the CUDA 12.6 one, while on Mac and Windows the CPU-only version is installed.
Here are a few quick questions.
- Is GPU acceleration important at all for our use case? I guess not.
- How many people use it on Linux? And how many of these Linux machines have a GPU?
Anyway, if we actually deploy the service in the future, we will likely need to be more explicit about which PyTorch build gets installed.
For now, I just force Poetry to use the CPU-only build inside the Docker image.
A more rigorous solution might be to make the CPU-only Torch the default and offer the CUDA build as an optional extra.
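To check which build actually ended up in an image, something like the following could be run inside the container. This is only a minimal sketch: the helper name `describe_torch_build` is made up for illustration, and it relies on `torch.version.cuda` being `None` for builds compiled without CUDA.

```python
import importlib.util


def describe_torch_build():
    """Report whether the installed PyTorch was built with CUDA support.

    Hypothetical helper for illustration; not part of the project.
    """
    # Avoid an ImportError if torch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "torch_version": torch.__version__,
        # None when the build was compiled without CUDA (e.g. CPU-only).
        "cuda_build": torch.version.cuda,
        # True only if a CUDA build finds a usable GPU at runtime.
        "gpu_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

If `cuda_build` comes back as `None`, the image should not need the `nvidia` packages listed above.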