Are there any good open-source solutions that manage a cluster of ollama serve nodes and distribute 'chat' requests to nodes that are up and not currently processing a 'chat' request?
An HTTP reverse proxy seems up to the job, perhaps with Consul or Redis for the queue and service discovery. I would prefer something written in Go or Rust.