Which is the go-to leaderboard for determining which AI model is best for ans... | Hacker News


lfmunoz4 on June 20, 2024 | on: Claude 3.5 Sonnet

Which is the go-to leaderboard for determining which AI model is best at answering DevOps and computer science questions and generating code? Wondering where Claude falls on this. I recently canceled my OpenAI subscription because of too much lag and too many crashes, and switched to Gemini because its web interface is faster and rock solid. Makes me think the OpenAI backend and frontend engineers don't know what they are doing compared to the Google engineers.

hackerlight on June 20, 2024

chat.lmsys.org --> "Leaderboard" tab --> "Coding" drop-down selection

Or the scale.ai private benchmarks

espadrine on June 24, 2024

One extensive benchmark I like is https://bigcode-bench.github.io/

It places Claude 3.5 Sonnet in third position.
