We found Aporia to be really useful for production observability and monitoring. It fits easily with our Vertex AI infrastructure, but I think it can be deployed pretty much anywhere.
I'd use the same tooling we use for everything else we run in the cloud; there's nothing special about an AI model here. That probably holds for input/output monitoring too: the model is going to be called through an API, and there are plenty of tools for monitoring APIs as well.
Ideally your cloud provider has an easy-to-use solution. If not, there are paid monitoring services as well. Running the open-source alternatives takes some expertise.
Help me understand this: I'm not sure AWS, GCP, or Azure offer analytics around models degrading over time, though? How can I track a drop in accuracy, or overfitting problems, in prod?
That's more at the API-testing level; it's an application-logic thing. In theory, you can output the values of interest into the logs and use log analysis. Or you could expose them as metrics alongside your other values with https://openmetrics.io/
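To make that concrete, here's a minimal sketch of both options in Python using prometheus_client, which serves the OpenMetrics exposition format. Everything model-specific is made up for illustration: evaluate_batch, the metric names, and the 0.94 training accuracy are hypothetical stand-ins for however you actually score predictions against delayed ground-truth labels.

```python
import json
import logging
import random
import time

from prometheus_client import Gauge, start_http_server

logging.basicConfig(level=logging.INFO)

# Gauges scraped by Prometheus or any OpenMetrics-compatible collector.
ACCURACY = Gauge("model_accuracy", "Rolling accuracy on labeled production samples")
TRAIN_SERVE_GAP = Gauge(
    "model_train_serve_gap",
    "Training accuracy minus production accuracy (an overfitting signal)",
)

TRAINING_ACCURACY = 0.94  # hypothetical: accuracy measured at training time


def evaluate_batch() -> float:
    """Hypothetical hook: score recent predictions against ground-truth
    labels as they trickle in (user feedback, delayed labels, etc.).
    Simulated here so the sketch runs as-is."""
    return random.uniform(0.85, 0.95)


if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        acc = evaluate_batch()
        ACCURACY.set(acc)
        TRAIN_SERVE_GAP.set(TRAINING_ACCURACY - acc)
        # The log-analysis option from above: emit the same value as a
        # structured log line and alert on it from your log pipeline.
        logging.info(json.dumps({"metric": "model_accuracy", "value": acc}))
        time.sleep(300)  # re-evaluate every five minutes
```

Point Prometheus (or your cloud provider's managed collector) at that endpoint and alert when model_train_serve_gap trends upward; that gives you the degradation-over-time view the stock cloud dashboards don't provide out of the box.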