
both llama.cpp and vLLM support inference off Nvidia hardware: llama.cpp has ROCm (HIP) and Vulkan backends, and vLLM ships a ROCm build for AMD GPUs.
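
as a minimal sketch of what that looks like in practice, here's inference through llama-cpp-python (the Python binding for llama.cpp); the GPU backend is chosen when the wheel is compiled, not in the client code, so the script itself is identical for ROCm and Vulkan. the model path is a placeholder, and the exact CMake flag names have shifted across llama.cpp versions:

    # backend is selected at install time, e.g.:
    #   CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python   # Vulkan
    #   CMAKE_ARGS="-DGGML_HIP=on"    pip install llama-cpp-python   # ROCm
    # (flag names vary by llama.cpp version; check the repo's build docs.)
    from llama_cpp import Llama

    # placeholder path; any GGUF-format model works
    llm = Llama(
        model_path="models/llama-3-8b-instruct.Q4_K_M.gguf",
        n_gpu_layers=-1,  # offload all layers to the ROCm/Vulkan device
    )

    out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
    print(out["choices"][0]["text"])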

inference is the easiest part of the stack to decouple from Nvidia.
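
one reason it's easy: both vLLM and llama.cpp's llama-server expose an OpenAI-compatible HTTP API, so client code never touches CUDA at all. a sketch with the openai client, assuming a server is already running locally (the port and model name are placeholders):

    from openai import OpenAI

    # works unchanged whether the server behind this URL is vLLM on ROCm,
    # llama-server on Vulkan, or anything else speaking the OpenAI API
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

    resp = client.chat.completions.create(
        model="llama-3-8b-instruct",  # must match what the server loaded
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )
    print(resp.choices[0].message.content)

swapping the GPU vendor is then a server-side deployment change, invisible to everything downstream.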


