
Can you share what your experience with implementing IPoIB on used gear was? I'm asking mainly because I recently got interested in such setups, but I got rather discouraged by the driver support.

As an example, here is the driver page for Mellanox (now owned by Nvidia), since they are a major InfiniBand equipment supplier: https://network.nvidia.com/products/infiniband-drivers/linux...

It seems that decent support only exists for the more recent generations. The older ones, like ConnectX-3 or earlier, which typically show up on eBay, are either no longer supported or only supported on older kernel versions that will soon be EOL.

So do I understand correctly that to use such adapters one actually has to downgrade to an older kernel version?

Or is there still some basic support for older generations in the latest Linux kernels?



Yes, if you want to use the officially supported driver for ConnectX-3 (the v4.9 LTS release, available from Nvidia's page), you need to go with something like Ubuntu 20.04 LTS (which should be good until at least the end of 2025). However, the mlx4_xxx kernel modules that ship with current mainline Linux kernels work just fine with the ConnectX-3, at least for basic functionality; the newer mlx5_xxx modules only cover ConnectX-4 and later.
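A quick way to see which of these driver modules a given kernel has actually loaded is to look at /proc/modules. The following is only a minimal sketch: the helper name `loaded_ib_modules` and the prefix list are my own choices for illustration, not part of any Mellanox or OFED tooling.

```python
from pathlib import Path

# Module name prefixes to look for (an illustrative choice):
# mlx4_* and mlx5_* are the Mellanox driver families mentioned above,
# ib_ipoib is the kernel's IP-over-InfiniBand module.
PREFIXES = ("mlx4_", "mlx5_", "ib_ipoib")


def loaded_ib_modules(proc_modules_text: str) -> list[str]:
    """Return the sorted names of loaded modules that match PREFIXES.

    Takes the text of /proc/modules, where the first whitespace-separated
    field on each line is the module name.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith(PREFIXES):
            names.append(fields[0])
    return sorted(names)


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():  # only present on Linux
        print(loaded_ib_modules(proc_modules.read_text()))
```

An empty list means none of these modules is loaded, which is a reasonable first thing to rule out before debugging anything higher up the stack.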

I've not actually used IPoIB on such gear myself, but we have been working quite a bit on reusing old/ancient HPC clusters with IB adapters, and you can generally make things work if you spend enough time on trial and error and you are not afraid of compiling code with complicated dependencies. As long as you can get the IB stuff talking, and the driver is using OFED, the IPoIB part should Just Work.
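To check whether the IPoIB part has come up, one option is to look for network interfaces whose link type is InfiniBand. The sketch below assumes a Linux system with sysfs mounted at /sys; the function name `ipoib_interfaces` is hypothetical, and it relies on the kernel reporting ARP hardware type 32 (ARPHRD_INFINIBAND) in /sys/class/net/<iface>/type.

```python
from pathlib import Path

ARPHRD_INFINIBAND = 32  # ARP hardware type Linux reports for IPoIB links


def ipoib_interfaces(sysfs_net: Path = Path("/sys/class/net")) -> list[str]:
    """Return the names of network interfaces whose link type is InfiniBand."""
    found = []
    if not sysfs_net.is_dir():
        return found  # not Linux, or sysfs not mounted
    for iface in sorted(sysfs_net.iterdir()):
        try:
            link_type = int((iface / "type").read_text().strip())
        except (OSError, ValueError):
            continue  # entry without a readable numeric type file
        if link_type == ARPHRD_INFINIBAND:
            found.append(iface.name)
    return found


if __name__ == "__main__":
    print(ipoib_interfaces())
```

If this lists something like ib0, the interface exists and the rest is ordinary IP configuration; if it lists nothing, the problem is further down, in the driver or the IB fabric itself.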

It is always going to be an adventure working with used gear. But HPC has such a high decommissioning tempo and low resale value that there will always be quite a few other enthusiasts toying about.



