We found Aporia to be really useful for production observability and monitoring. It fits easily with our Vertex AI infrastructure, but I think it can be deployed pretty much anywhere.
I'd use the same tooling we use for everything else we run in the cloud; there's nothing special about an AI model here. That probably holds for input/output monitoring too: the model is going to be called through an API, and there are plenty of tools for monitoring APIs as well.
Ideally your cloud provider has an easy-to-use solution. If not, there are paid monitoring services as well. Running the open-source alternatives takes some expertise.
Help me understand this: I'm not sure AWS, GCP, or Azure offer analytics around models degrading over time, though? How can I track a drop in accuracy, or overfitting problems, in prod?
That's more at the API-testing level; it's an application-logic thing. In theory, you can output the values of interest into the logs and use log analysis. Or you could expose them as metrics alongside your other values with https://openmetrics.io/
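To make that concrete, here's a minimal sketch of both options in Python using prometheus_client, which serves the OpenMetrics exposition format. Everything model-specific is made up for illustration: evaluate_batch, the metric names, and the 0.94 training accuracy are hypothetical stand-ins for however you actually score predictions against delayed ground-truth labels.

```python
import json
import logging
import random
import time

from prometheus_client import Gauge, start_http_server

logging.basicConfig(level=logging.INFO)

# Gauges scraped by Prometheus or any OpenMetrics-compatible collector.
ACCURACY = Gauge("model_accuracy", "Rolling accuracy on labeled production samples")
TRAIN_SERVE_GAP = Gauge(
    "model_train_serve_gap",
    "Training accuracy minus production accuracy (an overfitting signal)",
)

TRAINING_ACCURACY = 0.94  # hypothetical: accuracy measured at training time


def evaluate_batch() -> float:
    """Hypothetical hook: score recent predictions against ground-truth
    labels as they trickle in (user feedback, delayed labels, etc.).
    Simulated here so the sketch runs as-is."""
    return random.uniform(0.85, 0.95)


if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        acc = evaluate_batch()
        ACCURACY.set(acc)
        TRAIN_SERVE_GAP.set(TRAINING_ACCURACY - acc)
        # The log-analysis option from above: emit the same value as a
        # structured log line and alert on it from your log pipeline.
        logging.info(json.dumps({"metric": "model_accuracy", "value": acc}))
        time.sleep(300)  # re-evaluate every five minutes
```

Point Prometheus (or your cloud provider's managed collector) at that endpoint and alert when model_train_serve_gap trends upward; that gives you the degradation-over-time view the stock cloud dashboards don't provide out of the box.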