That's true, and one can bias logits in llama.cpp and friends too, but those biases are global: they're applied identically at every decoding step rather than being specified per token position. Uploading a grammar or a wasm binary to the inference engine does seem more expressive.
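To make the contrast concrete, here's a minimal sketch against llama.cpp's HTTP server (assuming one is running on localhost:8080; the token id is illustrative, not verified against any particular vocab). `logit_bias` takes `[token_id, bias]` pairs that hold for the whole completion, whereas a GBNF `grammar` masks the legal token set step by step:

```python
import requests

# Global logit bias: the chosen token is down-weighted at EVERY
# decoding step, not at one particular position.
requests.post("http://localhost:8080/completion", json={
    "prompt": "Say hi:",
    "logit_bias": [[15043, -5.0]],  # [token_id, bias], applied globally
})

# A GBNF grammar instead constrains which tokens are legal at each
# step of generation, so the allowed set changes per position.
grammar = r'''
root ::= "yes" | "no"
'''
requests.post("http://localhost:8080/completion", json={
    "prompt": "Is the sky blue? Answer:",
    "grammar": grammar,
})
```

With the grammar, the sampler rejects any token that would take the output outside the grammar at that point, which is exactly the per-position expressiveness the global bias knob lacks.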