Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Yea agree.. This is the same discussion point that came up last time they had an incident.

I really don’t buy this requirement to always deploy state changes 100% globally immediately. Why can’t they just roll out to 1%, scaling to 100% over 5 minutes (configurable), with automated health checks and pauses? That will go along way towards reducing the impact of these regressions.

Then if they really think something is so critical that it goes everywhere immediately, then sure set the rollout to start at 100%.

Point is, design the rollout system to give you that flexibility. Routine/non-critical state changes should go through slower ramping rollouts.



Can't get hacked when you are down.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: