Pretty interesting that they only used about $15k worth of resources (retail price) to achieve this. It's not a technique that would have been out of reach for other organizations based only on not being able to afford the compute.
That’s only for the final model. To find it, they’d need to run 1,000 experiments, trying many high-level approaches, many architectures for each component, hyperparameter search, and multiple seeds. Large machine learning projects need $10M in capital.