This is completely true but completely in conflict with how many very large companies advertise it. I’m a paid GitHub Copilot user and recently started using their chat tool. It lies constantly and convincingly, so often that I’m starting to wonder if it wastes more time than it saves. It’s simply not capable of reliably doing its job. This is on a “Tesla autopilot” level of misrepresenting a product but on a larger scale. I hope it continues being little more than a benign or embarrassing time-waster.
One of the few pieces of text on ChatGPT's own website, shown every time you chat with it, is "ChatGPT may produce inaccurate information about people, places, or facts."
“Context aware conversations with your copilot. If you're stuck solving a problem, ask GitHub Copilot to explain a piece of code. Bump into an error? Have GitHub Copilot fix it. It’ll even generate unit tests so you can get back to building what’s next.”
This is almost a Homer Simpson running for garbage commissioner level of over-promising. I think Copilot is an incredible tool. What’s possible right now is amazing, and it can save time and offer value. But the degree to which it doesn’t just fail but completely misdirects is at serious odds with the breathless marketing.
0) It calculates on data YOU SUPPLY. If the data is incomplete or incorrect, it tries its best to fill in the blanks with plausible, but fabricated, data. You MAY NOT ask it an open-ended or non-hypothetical question that requires grounding beyond what is included in the input.
e.g. “given following sentence, respond with the best summarization:, <string>” is okay; “what is a sponge cake” is not.
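To make the "okay" case concrete, here is a rough Python sketch of a prompt that carries its own source material. The function name `build_grounded_prompt` and the `<<<`/`>>>` markers are made up for illustration; nothing here calls any real model API, it only builds the string you would send.

```python
def build_grounded_prompt(task: str, source_text: str) -> str:
    """Build a prompt that includes the data the model should work from.

    Hypothetical helper for illustration only; not part of any library.
    """
    if not source_text.strip():
        # With no data supplied, the model would have to fill in the blanks itself.
        raise ValueError("no source text supplied")
    return (
        f"{task}\n\n"
        "Use only the text between the markers below.\n"
        "<<<\n"
        f"{source_text.strip()}\n"
        ">>>"
    )


# Grounded request: the text to summarize travels with the question.
prompt = build_grounded_prompt(
    "Respond with the best one-sentence summarization of the following text.",
    "Sponge cake is a light cake made with eggs, flour, and sugar.",
)
```

An open-ended question like "what is a sponge cake" has no `source_text` to pass in, which is exactly the case the rule above warns against.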
By that measure of intelligence even most humans, at some times, fail. Our brains misremember constantly, filling in details where information is lacking. Classic examples are things like accidents and disasters. Accounts conflict between people, memories present events in an order that does not match another person’s memories, or are outright fabrications. Dig up research on saccades and how our visual system does this on a constant basis, and can often be fooled as a result.
If knowing which blanks to fill in is a necessary condition of intelligence then all of humanity fails to measure up.
My point here is that very little is simple and straightforward. The concepts we use defy easy definitions. Our application of those concepts to artificial systems will inevitably do the same as a result.
I don't think it makes sense to say ChatGPT is hallucinating when it returns wrong facts. Hallucination implies that the subject can otherwise distinguish reality from something hallucinated. ChatGPT cannot distinguish facts from fiction.
1) ChatGPT is not a research tool
2) It sort of resembles one and will absolutely act like one if you ask it to, and it may even produce useful results! But…
3) You have to independently verify any factual statement it makes and also
4) In my experience the longer the chat session, the more likely it is to hallucinate, reiterate, and double down on previous output