Assume it's per invocation, and each invocation generates a few dozen line function. How many such functions do you write when you get in the groove? If you multiply it out, you'll probably end up expecting a few bugs a day at 95%, similar to most humans might write.
Except you're pretty used to the sorts of bugs you write, and the AI isn't you. So these bugs will be harder to find.
So why is this better than writing by hand? Most of the hard work of programming is figuring out specs and debugging, not banging out well understood and specced implementations.
Really? What exactly is the bar then? I'd say most professionals I know hover somewhere between 95% and 99%.