scbenet's comments

scbenet · 2025-05-06T17:06:50 1746551210

It's very good at Go, which makes sense because I'm assuming it's trained on a lot of Google's code

simianwords · 2025-05-06T18:14:18 1746555258

How would they train it on google code without revealing internal IP?

scbenet · 2025-05-07T15:13:11 1746630791

Google has 2.8k public repositories on just their main github account (github.com/google).

Even if they're not training on their closed source internal codebases (which they most certainly are, but fair point that they're probably not releasing the models trained on that data), it definitely seems like they have a lot of Go code in the training data.

simianwords · 2025-05-08T05:06:17 1746680777

But so do their competitors?

scbenet · 2025-04-03T01:35:21 1743644121

Big fan of pico.sh. I've been hosting a few small sites on there for a while now, and there's no faster way to get something up and running.

scbenet · 2025-04-02T15:08:06 1743606486

Is anyone else unable to access Stack Overflow? I'm getting a black page and an HTTP 418 "I'm a teapot" response code.

scbenet · on Dec 3, 2024

Technical report is available here https://www.amazon.science/publications/the-amazon-nova-fami...

kajecounterhack · on Dec 3, 2024

TL;DR comparison of models vs frontier models on public benchmarks here https://imgur.com/a/CKMIhmm

SparkyMcUnicorn · on Dec 3, 2024

This doesn't include all the benchmarks.

The one that really stands out is GroundUI-1K, where it beats the competition by 46%.

Nova Pro looks like it could be a SOTA-comparable model at a lower price point.

maeil · on Dec 3, 2024

Just means it's better at one specific task than the others, which has always been the case. For each of Sonnet, GPT and Gemini I can readily name a task they are individually the best at. At the same time, the consensus that Sonnet 3.5 is currently the strongest model overall remains correct, and that's what most people care about. Additionally, most people do tasks that all of the models perform similarly at, or they can't be bothered to optimize every task by using the best model for that one task, which makes sense since not a single cloud provider has all three of them. Now this one will likely be AWS-exclusive too.

int_19h · on Dec 4, 2024

Benchmarks are way too easy to game. There's no shortage of models that "beat GPT-4" according to some benchmark or another, that are obviously nowhere even close when you try them on novel tasks.

attentive · on Dec 4, 2024

On https://aider.chat/docs/leaderboards/ Nova Pro is on par with Yi Coder 9B Chat, which is not very inspiring.

retinaros · on Dec 3, 2024

On the Berkeley function-calling benchmark it is similar to GPT-4o for multi-turn while being way faster.

oblio · on Dec 3, 2024

SOTA?

camel_Snake · on Dec 3, 2024

"State of the Art", if that's what you were asking.

brokensegue · on Dec 3, 2024

So it looks like they are trying to win on speed over raw metric performance.

azinman2 · on Dec 3, 2024

Either that, or that’s just where they landed.

scbenet · on Nov 20, 2024

Can't read the article due to paywall, but potentially ambergris? It's a form of whale excrement that washes up on the shore and can sell for ~$10k USD/pound

https://en.wikipedia.org/wiki/Ambergris

scbenet · on Aug 29, 2024

May be resolved, or just spotty? I was able to load the page itself, but tweets and any trending information were all out. I'm seeing a lot of similar reports.