We are a bare metal compute offering. The machines can be used for whatever people want to use them for (within legal limits, of course). Interest is much higher now that we've started working with teams publishing benchmarks which show that H100s have a serious competitor [0].
I'll admit, it is still early days. We just finished up another two-week free compute [1] stint with a benchmarking team. One thing we discovered is that saving checkpoints is slow AF; I'm guessing it's an issue with ROCm. Hopefully we'll get that resolved soon. Now we are in the process of onboarding the next team.
> We would love to offer hourly on-demand rates for individual GPUs, but we can't do so at this time due to a limitation in the ROCm/AMD drivers. This limitation prevents PCIe pass-through to a virtual machine, making multi-tenancy impossible. AMD is aware of this issue and has committed to resolving it.
One idea to help you: are you sure you need a virtual machine? Couldn't you boot the machines via PXE to solve the imaging problem?
Essentially, you have a TFTP server that serves a Linux image, and the machines boot from it directly.
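For anyone unfamiliar with that flow, here's a minimal sketch of the server side, assuming dnsmasq as the combined proxy-DHCP/TFTP server (the paths, subnet, and loader filenames are placeholders, not anyone's actual setup):

```shell
# /etc/dnsmasq.conf -- minimal PXE boot sketch (hypothetical values)
# Serve boot files over TFTP from this directory
enable-tftp
tftp-root=/srv/tftp

# Proxy-DHCP: leave address assignment to the existing DHCP server,
# only supply PXE boot info to clients on this subnet
dhcp-range=192.168.1.0,proxy

# Legacy BIOS clients get pxelinux; UEFI clients get a GRUB netboot image
dhcp-boot=pxelinux.0
dhcp-match=set:efi64,option:client-arch,7
dhcp-boot=tag:efi64,grubnetx64.efi
```

The boot loader then pulls a kernel and initramfs from the same TFTP root, so reimaging a node is just swapping files on the server.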
We want to be able to break that chassis up into individual GPUs and allocate 1 GPU to 1 "machine". I previously PXE booted 20,000 individual diskless PlayStation 5 blades, and I'm not sure how PXE would solve this.
The only alternative right now is to do what runpod (and AMD's aac) are doing and hand out docker containers. But that comes with the docker-in-docker limitation, so people end up having to repackage everything. You also can't easily run different ROCm versions, since the driver stack comes from the host, and if you have 8 people on a single chassis... it becomes a nightmare to manage.
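To make the "ROCm comes from the host" point concrete, the container approach works roughly like this (a sketch of the usual ROCm docker invocation, not our actual setup; the image name and render node number are illustrative):

```shell
# ROCm containers don't virtualize the GPU: they pass through the host's
# kernel device nodes, so every tenant is pinned to the host driver version.
# /dev/kfd is the shared ROCm compute interface for the whole chassis;
# /dev/dri/renderD128 is one GPU's render node (a different renderD<N>
# per container is how you'd hand one GPU to one tenant).
docker run --rm -it \
  --device=/dev/kfd \
  --device=/dev/dri/renderD128 \
  --security-opt seccomp=unconfined \
  rocm/pytorch
```

Since `/dev/kfd` is a single shared interface backed by the host kernel driver, you can't give tenant A ROCm 5.x and tenant B ROCm 6.x on the same box; that's exactly the multi-tenancy wall PCIe pass-through to VMs would remove.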
We're just patiently waiting for AMD to fix the problem.