> For example, let's say in an eCommerce application that the shipping calculator is getting hit a lot. You'd like to be able to scale this independently as a service, so you can handle all the requests without also having to replicate all of the other resources, such as the cart persistence, user sessions, etc. that are a lot more memory intensive.
That assumes you allocate different resources for each service, though. If you're using the same instance type for all your microservices, you aren't getting this benefit. In fact, you're paying for resources you aren't using.
You might even be paying more: you allocate high-memory instances to the high-memory services and regular instances to the low-memory services, whereas you might have gotten by with only regular instances if you co-located your services in a monolithic fashion, since the monolith averages out the memory use across everything it hosts.
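To make that concrete, here's a back-of-the-envelope sketch. The prices and instance counts are made up purely for illustration, not real cloud pricing:

```python
# Hypothetical hourly prices -- illustrative only, not real cloud pricing.
REGULAR = 0.10    # $/hr, regular instance
HIGH_MEM = 0.25   # $/hr, high-memory instance

# Split by service: dedicated high-memory pool for the memory-hungry
# services, regular instances for the rest.
split_cost = 4 * HIGH_MEM + 6 * REGULAR

# Monolithic: everything co-located on regular instances; a couple more
# of them, since each instance carries every service, but no high-memory
# SKU needed because the memory use averages out.
mono_cost = 8 * REGULAR

print(f"split: ${split_cost:.2f}/hr, monolith: ${mono_cost:.2f}/hr")
# split: $1.60/hr, monolith: $0.80/hr
```

Obviously the numbers cut either way depending on your actual workload; the point is just that splitting by SKU isn't automatically cheaper.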
In my experience, most small teams aren't that deliberate with their resources. Unless something is obviously memory-hungry, like a cache, they tend to just use default instance types.
I worked on a service where we saved some money by splitting it up and specializing the VM SKUs like that. But it's far from as trivial as the GP implies. You have to plan out the interfaces, shake out any shared state (in-memory caches etc.), design the services not to be too chatty, and, as with everything perf-related, do lots of testing. So it's not like the GGGP's solution of automatically spinning up microservices could "just work".
This was a pretty high-volume API at Azure, and the savings after all was said and done were sadly only around $15K/mo, so it'll take a while to pay for itself. For the average website, I'd say this shouldn't even factor into consideration.
(The other thing I forgot to mention was having a good consistent-hashing strategy in the load balancer, so that instances of each service consistently talk to the same instances of the other service, even as machines are spun up and down, while still maintaining an even spread of load. This helps greatly because each instance then only has to cache the data for the requests that target it. With a round-robin load balancer you see much more variation in incoming requests, so you need either more memory or have to accept more cache misses. That was probably the hardest part of the project, since the built-in load balancers don't have the specific functionality we needed.)