Hacker News

Pretty interesting that they only used about $15k worth of resources (retail price) to achieve this. The technique wouldn't have been out of reach for other organizations on compute cost alone.


That’s only the cost of the final model. To find it, they’d have needed to run something like 1,000 experiments: trying many high-level approaches, many architectures for each component, hyperparameter searches, and multiple seeds. Large machine learning projects need on the order of $10M in capital.
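For what it's worth, the two figures in this sub-thread are roughly consistent. A minimal back-of-envelope sketch, where the 2/3 average-cost fraction is my own assumption (the average experiment being cheaper than the final run), not something stated in the thread:

```python
# Reconciling the $15k final-run figure with a ~$10M project budget.
final_run_cost = 15_000  # retail compute for the final model (quoted above)
n_experiments = 1_000    # rough experiment count (from the comment above)

# Assumption: the average experiment costs ~2/3 of the final run
# (smaller models, early stopping, failed runs).
total = final_run_cost * n_experiments * 2 // 3
print(f"${total:,}")  # → $10,000,000
```

So 1,000 experiments at even a discounted per-run cost lands right around the $10M figure.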


I bet it's still a lot less than they spent training AlphaStar.


The tech might not be out of reach but the talent pool is.

Whether it's good PR or not is up for debate, but it seems the talent at DeepMind can simply accomplish things others can't.


Based on the going rate of a 32-core TPUv3 slice ($32/hr USD) running "for a few weeks", isn't this closer to $65k USD?


One could buy 200 GPUs for less; I think that's where the other comment's price estimate came from.


It says $1,752/mo for a v3-8, so I just multiplied it by 8.


Fair enough; that calculation is still a bit off if they used 128 cores (16x a v3-8 instead of 8x). Not that it really matters...
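The $65k figure can be sanity-checked against the quoted hourly rate. A quick sketch, assuming round-the-clock usage of a single v3-32 at the $32/hr rate quoted above:

```python
# How long does $65k buy at the quoted on-demand rate?
rate_per_hour = 32                      # $/hr for a v3-32, quoted above
cost_per_week = rate_per_hour * 24 * 7  # $5,376 per week of continuous use

weeks_for_65k = 65_000 / cost_per_week
print(f"{weeks_for_65k:.1f} weeks")  # → 12.1 weeks
```

So $65k corresponds to roughly twelve weeks of a single v3-32 running continuously, somewhat longer than "a few weeks"; the gap between the estimates likely comes down to how many slices ran in parallel and for how long.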


I'm pretty sure that this took more than 1 junior engineer-month.


How much would the labor cost, though?



