Yessss! This is what I want. If there is a natural set of filters that can be applied, let me speak it in natural language; the LLM translates that as well as possible, and then I review it. E.g. searching for photos between dates X and Y, containing human Z, at location W. These are all filters that can be presented as separate UI elements, so I can confirm the LLM interpreted me correctly and adjust the dates or what have you without having to repeat the whole sentence again.
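Concretely, the LLM's only job could be to fill in a small structured object, which the UI then renders as separate editable fields. A sketch in Python, with made-up field names:

    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class PhotoQuery:
        # One field per editable UI element the user gets to review.
        date_from: date | None = None   # "between dates X and Y"
        date_to: date | None = None
        person: str | None = None       # "containing human Z"
        location: str | None = None     # "at location W"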
Also, any additional LLM magic would be a separate layer with its own context, safely abstracted beneath the filter/search language. Not a post-processing step by some kind of LLM-shell.
For example, "Find me all pictures since Tuesday with pets" might become something like this (made-up filter syntax):
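    date >= <last Tuesday> AND fuzzy-content("pets")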
Then the implementation of "fuzzy-content" would generate a text description of the photo, and some other LLM-thingy would do the hidden document-building, something like:
    Description: "black dog catching a frisbee"
    Does that match "with pets"?
    Answer Yes or No.

Yes.
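Wiring that up, a minimal sketch of the fuzzy-content layer in Python (describe_photo and llm_answers_yes are made-up stand-ins for whatever caption model and LLM call you'd actually use):

    def describe_photo(photo) -> str:
        # Stand-in: run a captioning model once per photo and cache it,
        # e.g. returning "black dog catching a frisbee".
        ...

    def llm_answers_yes(prompt: str) -> bool:
        # Stand-in: a single LLM call with its own tiny context;
        # True iff the reply is "Yes".
        ...

    def fuzzy_content(photos, phrase: str):
        # Keep only photos whose description the LLM says matches `phrase`.
        # This is the hidden document-building from above.
        for photo in photos:
            prompt = (
                f'Description: "{describe_photo(photo)}"\n'
                f'Does that match "{phrase}"?\n'
                f'Answer Yes or No.'
            )
            if llm_answers_yes(prompt):
                yield photo

The nice part: this layer only ever sees one description and one question at a time, so the structured filters never leak into its context.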