I always find myself baffled by “prompt optimization” frameworks. Do people really find themselves needing random perturbations of a fixed prompt to improve accuracy? In my experience, the challenging part of writing a prompt is figuring out what task you actually want done, and understanding which data you need to pass to the model to make that task achievable. None of that can be achieved by “optimizing” the prompt; the hard part sits a layer of abstraction upward.
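For concreteness, the loops these frameworks run look roughly like the sketch below. Everything here is illustrative: the perturbation pool, the dev set, and `run_model` (a stub standing in for a real LLM call) are all made up for the example.

```python
import random

# Tiny labeled dev set the optimizer scores candidate prompts against.
DEV_SET = [("2+2", "4"), ("3+3", "6")]
ANSWERS = {"2+2": "4", "3+3": "6"}


def run_model(prompt: str, query: str) -> str:
    # Stub: a real implementation would call an LLM API here. This
    # stand-in only "answers" when the prompt mentions math, so the
    # optimizer has something to discover.
    if "math" in prompt.lower():
        return ANSWERS[query]
    return "I don't know"


def score_prompt(prompt: str) -> float:
    # Accuracy of the prompt on the dev set.
    hits = sum(run_model(prompt, q) == a for q, a in DEV_SET)
    return hits / len(DEV_SET)


def perturb(prompt: str, rng: random.Random) -> str:
    # "Random perturbation": append one phrase from a fixed pool.
    extras = ["You are a math tutor.", "Be concise.", "Think step by step."]
    return prompt + " " + rng.choice(extras)


def optimize(base: str, steps: int = 20, seed: int = 0) -> str:
    # Greedy hill climbing: keep a perturbed prompt only if it scores
    # strictly better than the current best.
    rng = random.Random(seed)
    best, best_score = base, score_prompt(base)
    for _ in range(steps):
        cand = perturb(best, rng)
        s = score_prompt(cand)
        if s > best_score:
            best, best_score = cand, s
    return best
```

Note the loop can only ever improve on the dev set, which is exactly the worry: it tells you nothing about inputs outside that set.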
It's useful in enterprise scenarios where you need a reliable outcome from some kind of programmatic task and are dealing with throughput in the thousands to hundreds of thousands of jobs.
Depends what you're doing. If you're using ChatGPT via the UI for a one-off question, sure. If you're prompting an LLM that performs a critical task in production millions of times, minor improvements can have significant benefits.
I have done the latter much more than the former. My experience has been that the issues come from inputs you don't foresee, not from reliability on in-distribution uses (which would be your “training” data for prompt optimization). And the worry is that this kind of optimization leads to substantive revisions of the guidelines set out in the prompt, which could further compromise performance out of distribution.
To the extent that you need to eke out reliability on the margins, you are vastly better served by actual fine-tuning, which is available for both open-source models and most major proprietary models.
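The data prep for that is not much work either. Below is a rough sketch of turning labeled examples into the chat-style JSONL format that several fine-tuning APIs (e.g. OpenAI's) accept; the exact schema varies by provider, and the classification task and field contents here are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical task: route support tickets. The system prompt and the
# labeled examples are made up for illustration.
SYSTEM_PROMPT = "Classify the ticket as 'billing' or 'technical'."

EXAMPLES = [
    ("My card was charged twice", "billing"),
    ("The app crashes on launch", "technical"),
]


def to_jsonl(rows, path):
    # One JSON object per line, each holding a full chat transcript:
    # system instructions, user input, and the desired assistant output.
    with open(path, "w") as f:
        for user_text, label in rows:
            record = {
                "messages": [
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": user_text},
                    {"role": "assistant", "content": label},
                ]
            }
            f.write(json.dumps(record) + "\n")


path = os.path.join(tempfile.gettempdir(), "train.jsonl")
to_jsonl(EXAMPLES, path)
```

You'd then upload the file and kick off a fine-tuning job with your provider's API or, for open-source models, feed it to your training harness.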