Hacker News

That's true, and one can bias logits in llama.cpp and friends too, but those are global biases that affect the entire output rather than being specified per-token. Uploading a grammar or a wasm binary to the inference engine does seem more expressive.
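To illustrate the distinction in plain terms: a minimal toy sketch (not llama.cpp's actual code or API) contrasting a global logit bias, which applies the same adjustment at every decoding step, with a per-step constraint like a grammar, which narrows the allowed-token set based on what has been generated so far. The vocabulary, logits, and "grammar" here are all invented for illustration.

```python
# Toy greedy decoder over a tiny made-up vocabulary. No real model involved.
VOCAB = ["{", "}", '"key"', ":", '"val"', "hello"]

def greedy_step(logits):
    """Pick the highest-scoring token index."""
    return max(range(len(logits)), key=lambda i: logits[i])

def decode(base_logits_fn, steps, global_bias=None, allowed_fn=None):
    out = []
    for step in range(steps):
        logits = base_logits_fn(step)
        if global_bias:
            # Global bias: the same per-token offsets at EVERY step,
            # analogous to a flat logit-bias setting.
            for tok, bias in global_bias.items():
                logits[tok] += bias
        if allowed_fn:
            # Per-step constraint: the allowed set depends on the prefix
            # generated so far, analogous to a grammar.
            allowed = allowed_fn(out)
            logits = [l if i in allowed else float("-inf")
                      for i, l in enumerate(logits)]
        out.append(greedy_step(logits))
    return [VOCAB[i] for i in out]

# The base model always prefers "hello" (index 5).
base = lambda step: [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]

# A global bias can only suppress "hello" everywhere; it cannot say
# "emit '{' first, then '\"key\"', then ':'".
print(decode(base, 3, global_bias={5: -100.0}))

# A per-step "grammar" forces a specific JSON-ish token sequence.
seq = [0, 2, 3, 4, 1]  # { "key" : "val" }
print(decode(base, 5, allowed_fn=lambda prefix: {seq[len(prefix)]}))
```

The global bias suppresses `"hello"` at every position but cannot express position-dependent structure; the per-step mask can, which is why uploading a grammar (or a wasm hook evaluated per token) is strictly more expressive than a single bias table.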
