
Running a load using the official NVIDIA PyTorch image boosts performance by 50%


3 months ago by riomus


Note that the NVIDIA container uses CUDA+cuBLAS 13.0.2, which cites "Improved performance on NVIDIA DGX Spark for FP16/BF16 and FP8 GEMMs", which seems to be your use case. In general, I would suspect that it mostly comes down to the versions of the libraries.

Interestingly, there is a cuBLAS 13.1 wheel on PyPI; I'm not sure what that changes.

3 months ago by t-vi
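
One way to test the library-version explanation is to list the relevant packages in both environments and compare the output. The snippet below is only a rough sketch using the Python standard library: the helper name `installed_versions` is hypothetical, and the assumption that the CUDA libraries show up as pip distributions whose names start with `nvidia-` may not hold inside the NVIDIA container, where they can be installed system-wide instead.

```python
from importlib import metadata


def installed_versions(prefixes=("torch", "nvidia-")):
    """Return {distribution name: version} for installed packages whose
    name starts with one of the given prefixes (case-insensitive).

    `prefixes` must be a tuple of lowercase strings. The default prefixes
    are an assumption about how the PyTorch and CUDA wheels are named.
    """
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        # Skip distributions with missing or broken metadata.
        if name and name.lower().startswith(prefixes):
            found[name] = dist.version
    return dict(sorted(found.items()))


if __name__ == "__main__":
    # Run this in both environments and diff the two outputs.
    for name, version in installed_versions().items():
        print(f"{name}=={version}")
```

If the two lists differ mainly in the cuBLAS entry, that would be consistent with the speedup coming from the newer library rather than from the container itself, though it would not prove it.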