In order to receive any of the Financial Credits described above, Customer must notify Google technical support within thirty days from the time Customer becomes eligible to receive a Financial Credit. Customer must also provide Google with server log files showing loss of external connectivity errors and the date and time those errors occurred. If Customer does not comply with these requirements, Customer will forfeit its right to receive a Financial Credit. If a dispute arises with respect to this SLA, Google will make a determination in good faith based on its system logs, monitoring reports, configuration records, and other available information, which Google will make available for auditing by Customer at Customer’s request."
I would pay a premium for a cloud provider happy to give 100 percent discount for the month for 10 minutes downtime, and 100 percent discount for the year for an hour's downtime.
Any cloud provider offering those terms would go out of business VERY quickly.
Outages happen. All providers are incentivized to minimize the frequency and severity of disruptions, not just because of the financial hit of breaching an SLA (which for something like this will be significant), but because of the reputational damage, which can be even more impactful.
How often does Amazon or Google go down for ten minutes?
But let's work backwards from the goal instead.
If you charge twice as much, and then 20-30% of months are refunded by the SLA, you make more money and you have a much stronger motivation to spend some of that cash on luxurious safety margins and double-extra redundancy.
So what thresholds would get us to that level of refunding?
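To make the arithmetic behind that concrete, here is a rough sketch. It assumes a refunded month is refunded in full and ignores whatever the extra redundancy costs to run; the function names are made up for illustration.

```python
def relative_revenue(price_multiplier: float, refunded_fraction: float) -> float:
    """Revenue relative to the baseline price when a fraction of months is fully refunded."""
    return price_multiplier * (1.0 - refunded_fraction)


def break_even_refunded_fraction(price_multiplier: float) -> float:
    """Fraction of months that can be refunded before revenue falls to the baseline."""
    return 1.0 - 1.0 / price_multiplier


if __name__ == "__main__":
    # Assumption from the comment above: price is doubled, 20-30% of months refunded.
    for refunded in (0.20, 0.30):
        revenue = relative_revenue(2.0, refunded)
        print(f"{refunded:.0%} of months refunded: {revenue:.2f}x baseline revenue")
    print(f"break-even refund rate at 2x price: {break_even_refunded_fraction(2.0):.0%}")
```

Under those assumptions, doubling the price leaves room to refund up to half of all months before revenue drops below what the normal price would have brought in.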
I think you're proving the parent comment's point. The number of businesses willing to pay a 500x markup is exceedingly small (potentially fewer than one), and at that point the cost is high enough that it's probably cheaper to just build the redundancy yourself using multiple cloud providers (and, to emphasize, that option tends to be horribly expensive).
And all cloud providers will emphasize how you yourself should design your software and architect your infrastructure to be available in multiple regions to achieve the highest availability.
Just take the premium that you'd be willing to pay and put it in the bank -- the premium would be priced such that its expected payout is less than or equal to what you'd be paying.
Besides, a provider credit is the least of most companies' concerns after an extended outage; it's a small fraction of their remediation costs and loss of customer goodwill.
> Just take the premium that you'd be willing to pay and put it in the bank
In my country, when companies are hired to do overnight rail maintenance, they face very stiff fines if they over-run and delay trains the next morning.
The fines are large enough that (for example) companies will have a heavy plant mechanic on site who does nothing on the vast majority of jobs - they're just standing by, to mitigate the risk of a breakdown leading to such a fine. Some business analyst with a spreadsheet has worked out the heavy plant breakdown rate, the typical resulting delays, the expected fines, and the cost of having the mechanic on standby... and they've worked out it's a good business decision.
The purpose of having an SLA isn't to get yourself money when your provider fails. The purpose is to make costly risk mitigation a rational investment for your suppliers.
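The spreadsheet calculation described above might look roughly like this. Every number here is hypothetical (breakdown rate, overrun probabilities, fine, standby cost), and the function names are invented for the sketch; the point is only how the size of the fine flips the decision.

```python
def expected_fine(breakdown_rate: float, p_overrun: float, fine: float) -> float:
    """Expected fine per job: chance of a breakdown, times chance it causes an overrun, times the fine."""
    return breakdown_rate * p_overrun * fine


def standby_is_worthwhile(
    breakdown_rate: float,
    p_overrun_without_mechanic: float,
    p_overrun_with_mechanic: float,
    fine: float,
    standby_cost: float,
) -> bool:
    """True if the expected fine avoided by the standby mechanic exceeds the mechanic's cost per job."""
    avoided = expected_fine(breakdown_rate, p_overrun_without_mechanic, fine) - expected_fine(
        breakdown_rate, p_overrun_with_mechanic, fine
    )
    return avoided > standby_cost


if __name__ == "__main__":
    # Hypothetical inputs: 2% of jobs see a plant breakdown; a breakdown causes an
    # overrun 80% of the time without a mechanic on site and 10% of the time with one;
    # the mechanic costs 1,500 per job.
    for fine in (50_000, 500_000):
        decision = standby_is_worthwhile(0.02, 0.8, 0.1, fine, 1_500)
        print(f"fine={fine}: keep a mechanic on standby? {decision}")
```

With these made-up inputs the smaller fine doesn't justify the mechanic and the larger one does, which is the mechanism the comment is describing.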
> I would pay a premium for a cloud provider happy to give 100 percent discount for the month for 10 minutes downtime, and 100 percent discount for the year for an hour's downtime.
It takes a lot of effort (exponentially more) to reliably build something (i.e., designed to keep working when components fail) that is guaranteed to have this level of uptime at these penalties.
So I'm sure that I can build something that works like this, but would you pay me $100 per GB of storage per month? $100 per wall-time hour of CPU usage? $100 per GB of RAM used per hour? Because these are the premium prices for your specs.
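For a sense of scale, the downtime thresholds in the quoted request can be converted into availability figures. This is a back-of-the-envelope sketch that assumes a 30-day month and a 365-day year; the helper names are made up.

```python
import math

MINUTES_PER_MONTH = 30 * 24 * 60   # assumption: 30-day month
MINUTES_PER_YEAR = 365 * 24 * 60   # assumption: 365-day year


def availability(allowed_downtime_minutes: float, period_minutes: float) -> float:
    """Fraction of the period the service must be up to stay within the downtime budget."""
    return 1.0 - allowed_downtime_minutes / period_minutes


def nines(avail: float) -> float:
    """Availability expressed as a count of 'nines' (0.999 is about 3.0)."""
    return -math.log10(1.0 - avail)


if __name__ == "__main__":
    monthly = availability(10, MINUTES_PER_MONTH)   # 10 minutes per month
    yearly = availability(60, MINUTES_PER_YEAR)     # 1 hour per year
    print(f"10 min/month: {monthly:.5%} availability, {nines(monthly):.2f} nines")
    print(f"1 hour/year:  {yearly:.5%} availability, {nines(yearly):.2f} nines")
```

Both thresholds land between three and four nines, with the full bill forfeited the moment the budget is exceeded.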