One thing that stands out playing with the sorting is that Google's Gemini claims to have a context window more than 10x that of most of its competition. Has anyone experimented with this to see if its useful context window is actually anything close to that?
In my own experiments with the chat models, they seem to lose the plot after about 10 replies unless constantly "refreshed", which is a tiny fraction of the supposed 128,000-token input length that 4o has. Does Gemini actually do something dramatically different, or is their 3 million token context window pure marketing nonsense?
When they released it, they specifically focused on accurate recall across the context window. There are a bunch of demos of things like giving it a whole movie as input (a frame every N seconds plus the script, or something similar) and asking for highly specific facts.
Anecdotally, I use NotebookLM a bit, and while that’s probably RAG plus large contexts (to be clear, this is a guess not based on inside knowledge), it seems very accurate.
I tend to use a sentence along these lines:
"Give me a straightforward summary of what we discussed so far, someone who didn't read the above should understand the details. Don't be too verbose."
Then i just continue from there or simply use this as a seed in another fresh chat.
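For what it's worth, the refresh trick above can be sketched in a few lines. This is a minimal sketch, not tied to any particular API: `call_model` is a hypothetical stand-in for whatever chat-completion function you use, taking a list of role/content messages and returning the model's reply as a string.

```python
# Sketch of the "summarize and re-seed" refresh trick.
# `call_model` is a hypothetical stand-in for your chat API:
# it takes a list of {"role": ..., "content": ...} messages
# and returns the assistant's reply as a string.

SUMMARY_PROMPT = (
    "Give me a straightforward summary of what we discussed so far, "
    "such that someone who didn't read the above would understand the "
    "details. Don't be too verbose."
)

def refresh_conversation(history, call_model):
    """Collapse a long chat history into one summary message and
    return a fresh history seeded with that summary."""
    summary = call_model(
        history + [{"role": "user", "content": SUMMARY_PROMPT}]
    )
    # The new conversation's only context is the summary,
    # so the model starts from a short, dense prompt.
    return [{"role": "user",
             "content": "Context from a previous chat: " + summary}]
```

You can either continue in the same session after appending the summary, or paste the returned seed message into a brand-new chat, which is what the trick amounts to.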