
Cool tool. I tried a few different things to get it to work with google/gemini-2.5-pro, but couldn't figure it out.


    uv add google-genai
    uv run scripts/run_benchmarks.py --models google/gemini-2.5-pro --formats markdown_kv --limit 100
And add GOOGLE_API_KEY=<your-key-here> to a file called .env in the repo root.
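
If it helps, the .env step can be done from the shell too. This is only a rough sketch: it assumes you are in the repo root and that the script picks up GOOGLE_API_KEY from .env as described above; the value below is a placeholder, not a real key.

```shell
# Run from the repo root. Swap the placeholder for your actual key.
# '>>' appends, so an existing .env is not overwritten.
printf 'GOOGLE_API_KEY=%s\n' '<your-key-here>' >> .env

# Check that the variable is present without echoing the key itself.
# Prints the number of matching lines.
grep -c '^GOOGLE_API_KEY=' .env
```

If the count is greater than 1, the key was already in the file and you probably want to delete the duplicate line by hand.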

Unfortunately I started getting "quota exceeded" almost immediately, but it did give 6/6 correct answers before it crapped out.


Thanks! That worked perfectly.

100 samples:

- gemini-2.5-pro: 100%

- gemini-2.5-flash: 97%



