
Information about the way we interact with the data (as in RLHF) can be used to refine agent behaviour.

Even if it isn't used for LLM training specifically, it can involve aggregating insights from customer behaviour.
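As a rough sketch of what that aggregation could look like (names and structure are hypothetical, not any particular vendor's pipeline): accepted and rejected answers get grouped into preference pairs, which is the kind of dataset an RLHF/DPO-style fine-tune would later consume.

  # Hypothetical sketch: turn accept/reject feedback into preference pairs
  # that a later RLHF/DPO-style fine-tune could consume.
  from dataclasses import dataclass
  import json

  @dataclass
  class Interaction:
      prompt: str
      response: str
      accepted: bool  # did the user accept the AI answer?

  def to_preference_pairs(interactions: list[Interaction]) -> list[dict]:
      # Group responses by prompt, then pair each accepted response
      # against each rejected one for the same prompt.
      by_prompt: dict[str, dict[str, list[str]]] = {}
      for it in interactions:
          bucket = by_prompt.setdefault(it.prompt, {"accepted": [], "rejected": []})
          bucket["accepted" if it.accepted else "rejected"].append(it.response)
      pairs = []
      for prompt, bucket in by_prompt.items():
          for good in bucket["accepted"]:
              for bad in bucket["rejected"]:
                  pairs.append({"prompt": prompt, "chosen": good, "rejected": bad})
      return pairs

  if __name__ == "__main__":
      log = [
          Interaction("reset my password", "Go to Settings > Security > Reset.", True),
          Interaction("reset my password", "Try turning it off and on again.", False),
      ]
      print(json.dumps(to_preference_pairs(log), indent=2))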



That’s a training step. It requires explicitly collecting the data and using it in the training process.

Merely using an LLM for inference does not train it on the prompts and data, contrary to what many assume. There is a surprising lack of understanding of this separation, even on technical forums like HN.


That's definitely a fair point.

However, let's say I record human interactions with my app; for example, when a user accepts or rejects an AI-synthesised answer.

I can use this data to influence the behaviour of an LLM via RAG, or to alter application behaviour.

It's not going to change the model's weights, but it would influence its behaviour.
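Something like this, roughly (hypothetical helper names, with an exact-match lookup standing in for a real vector store): accept/reject events are logged, and previously accepted answers are retrieved into the prompt as context, so behaviour shifts without ever touching the weights.

  # Hypothetical sketch: feedback-weighted retrieval. Accept/reject events are
  # logged, and previously accepted answers are pulled into the prompt as
  # context (RAG). The model's weights never change; only its input does.
  from collections import defaultdict

  class FeedbackStore:
      def __init__(self):
          # question -> list of (answer, score); +1 accepted, -1 rejected
          self.answers = defaultdict(list)

      def record(self, question: str, answer: str, accepted: bool) -> None:
          self.answers[question].append((answer, 1 if accepted else -1))

      def best_answers(self, question: str, k: int = 3) -> list[str]:
          # Exact-match lookup here; a real system would use embedding search.
          scored = sorted(self.answers.get(question, []), key=lambda x: -x[1])
          return [answer for answer, score in scored[:k] if score > 0]

  def build_prompt(store: FeedbackStore, question: str) -> str:
      # Previously accepted answers become retrieved context for the LLM call.
      context = "\n".join(store.best_answers(question))
      return (
          "Context (answers users previously accepted):\n"
          f"{context}\n\n"
          f"Question: {question}"
      )

  store = FeedbackStore()
  store.record("How do I export a report?", "Use File > Export > PDF.", accepted=True)
  store.record("How do I export a report?", "Reinstall the app.", accepted=False)
  print(build_prompt(store, "How do I export a report?"))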



