
My use case has been trying to remove the damn "apologies for this" and extraneous language that just wastes tokens for no reason. GPT has always always always been so quick to waffle.

And removing the chat interface as much as possible. Many benchmarks are better with text completion models, but they keep insisting on this horrible interface for their models.

Fine tuning is there to ensure you get the output format you want without the extra garbage. I swear they have tuned their models to waste tokens.



The jargon to google here is "length bias"

It turns out if you generate two LLM responses and ask a judge to choose which is better, many judges have a bias in favour of long answers full of waffle.
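The setup described above can be sketched in a few lines. The `judge` below is a hypothetical stand-in for a real LLM judge, hard-coded to prefer the longer answer so the bias is visible; presenting each pair in both orders is a common control for position bias:

```python
def judge(answer_a: str, answer_b: str) -> str:
    """Hypothetical stand-in for an LLM judge. Real judges often
    exhibit length bias; here that bias is modeled directly by
    preferring the longer answer regardless of content."""
    return "A" if len(answer_a) >= len(answer_b) else "B"

def win_rate(answers_1, answers_2):
    """Pairwise comparison with order swapping (a common control
    for position bias). Returns the fraction of judgments won by
    the first set of answers."""
    wins = total = 0
    for a1, a2 in zip(answers_1, answers_2):
        # Present each pair in both orders so position doesn't
        # confound the length effect.
        if judge(a1, a2) == "A":
            wins += 1
        if judge(a2, a1) == "B":
            wins += 1
        total += 2
    return wins / total

concise = ["Paris.", "42."]
verbose = ["Apologies for the delay! The answer, after careful "
           "consideration, is Paris.",
           "Great question. Working through it step by step, "
           "the answer comes out to 42."]

print(win_rate(verbose, concise))  # → 1.0: padding wins every comparison
```

With a length-biased judge, the padded answers achieve a perfect win rate even though they carry no extra information, which is exactly why raw win rates against such judges are unreliable.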


Thanks for that pointer.

The abstract of this paper seems interesting: https://arxiv.org/html/2407.01085v3

> use of [LLMs] as judges [..] reveals a notable bias towards longer responses, undermining the reliability of such evaluations. To better understand such bias, we propose to decompose the preference evaluation metric, specifically the win rate, into two key components: desirability and information mass [..]

(If you're interested, give it a click. I tried to pare this down to avoid quoting a wall of text.)


> I swear they have tuned their models to waste tokens.

Which seems a bit weird, because the customers of the chat interface (i.e. non-API customers) don't pay per token.


I've heard the theory a few times lately that AI businesses will increasingly move toward usage models over subscription models. So while the verbosity is probably accidental, it could also be a longer-term strategy to normalize excessive token usage.


I don't know whether the major AI companies will move to usage models. But let's assume that they do.

However: I would expect chat interfaces to be charged per query, not per token. End users don't understand tokens, and don't want to have to understand tokens.

If you charge per query, you don't gain anything from extra wordy responses.
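The incentive difference is simple arithmetic. A toy comparison, with made-up prices (not any real provider's rates), assuming the same query volume under both billing models:

```python
# Illustrative numbers only, not real pricing.
PRICE_PER_TOKEN = 0.00002   # hypothetical per-token rate
PRICE_PER_QUERY = 0.01      # hypothetical flat per-query rate

def revenue(queries: int, tokens_per_response: int, per_token: bool) -> float:
    """Provider revenue under the two billing models."""
    if per_token:
        return queries * tokens_per_response * PRICE_PER_TOKEN
    return queries * PRICE_PER_QUERY

# Doubling response length doubles per-token revenue...
print(revenue(1000, 200, per_token=True))   # → 4.0
print(revenue(1000, 400, per_token=True))   # → 8.0

# ...but leaves per-query revenue unchanged.
print(revenue(1000, 200, per_token=False))  # → 10.0
print(revenue(1000, 400, per_token=False))  # → 10.0
```

Under per-query billing, extra verbosity only adds inference cost with no matching revenue, so the incentive to pad disappears.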




