We are a bare metal compute offering. The machines can be used for whatever people want to use them for (within legal limits, of course). Interest is much higher now that we've started working with teams publishing benchmarks which show that H100s have a serious competitor [0].
I'll admit, it is still early days. We just finished up another two-week free compute [1] stint with a benchmarking team. One thing we discovered is that saving checkpoints is slow AF; I'm guessing it's an issue with ROCm. Hopefully we'll get that resolved soon. Now we are in the process of onboarding the next team.
> We would love to offer hourly on-demand rates for individual GPUs, but we can't do so at this time due to a limitation in the ROCm/AMD drivers. This limitation prevents PCIe pass-through to a virtual machine, making multi-tenancy impossible. AMD is aware of this issue and has committed to resolving it.
One idea to help you: are you sure you need a virtual machine? Couldn't you boot the machines via PXE to solve the imaging problem?
Essentially, you have a TFTP server that serves a Linux image, and the machines boot from it directly.
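For anyone unfamiliar with that flow, here's a minimal sketch of the server side, assuming dnsmasq as the combined proxy-DHCP/TFTP server (the paths, subnet, and loader filenames are placeholders, not anyone's actual setup):

```shell
# /etc/dnsmasq.conf -- minimal PXE boot sketch (hypothetical values)
# Serve boot files over TFTP from this directory
enable-tftp
tftp-root=/srv/tftp

# Proxy-DHCP: leave address assignment to the existing DHCP server,
# only supply PXE boot info to clients on this subnet
dhcp-range=192.168.1.0,proxy

# Legacy BIOS clients get pxelinux; UEFI clients get a GRUB netboot image
dhcp-boot=pxelinux.0
dhcp-match=set:efi64,option:client-arch,7
dhcp-boot=tag:efi64,grubnetx64.efi
```

The boot loader then pulls a kernel and initramfs from the same TFTP root, so reimaging a node is just swapping files on the server.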
We want to be able to break that chassis up into individual GPUs and allocate 1 GPU to 1 "machine". I previously PXE booted 20,000 individual diskless PlayStation 5 blades, and I'm not sure how PXE would solve this.
The only alternative right now is to do what runpod (and AMD's aac) are doing and hand out docker containers. But that comes with the docker-in-docker limitation, so people end up having to repackage everything. You also can't easily run different ROCm versions, since the driver stack comes from the host, and if you have 8 people on a single chassis... it becomes a nightmare to manage.
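To make the "ROCm comes from the host" point concrete, the container approach works roughly like this (a sketch of the usual ROCm docker invocation, not our actual setup; the image name and render node number are illustrative):

```shell
# ROCm containers don't virtualize the GPU: they pass through the host's
# kernel device nodes, so every tenant is pinned to the host driver version.
# /dev/kfd is the shared ROCm compute interface for the whole chassis;
# /dev/dri/renderD128 is one GPU's render node (a different renderD<N>
# per container is how you'd hand one GPU to one tenant).
docker run --rm -it \
  --device=/dev/kfd \
  --device=/dev/dri/renderD128 \
  --security-opt seccomp=unconfined \
  rocm/pytorch
```

Since `/dev/kfd` is a single shared interface backed by the host kernel driver, you can't give tenant A ROCm 5.x and tenant B ROCm 6.x on the same box; that's exactly the multi-tenancy wall PCIe pass-through to VMs would remove.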
We're just patiently waiting for AMD to fix the problem.