>The key point here is that knowing that the thing in front of the robot is a coffee cup and a coffee machine and identifying how those things fit together and in what context that is required are all things that LLMs can do.

I'm skeptical that any LLM "knows" any such thing. It's a Chinese Room. It has a probability map that connects the lexemes (to us) 'coffee machine' and 'coffee cup', conditioned on other inputs that we do not and cannot access, and it spits out sentences or images that (often) look right, but that does not amount to any understanding of what it is doing.

As I was writing this, I took ChatGPT-4 for a spin. When I ask it cold about an obscure but once-popular fantasy character from the '70s, it admits it doesn't know. But if I ask about that same character after first asking about some obscure fantasy RPG characters, it cheerfully confabulates an authoritative and wrong answer. As always, if it does this on topics where I am a domain expert, I consider it absolutely untrustworthy for any topics on which I am not a domain expert. That anyone treats it otherwise seems like a baffling new form of Gell-Mann amnesia.

And for the record, when I asked ChatGPT-4, cold, "What is Gell-Mann amnesia?" it gave a multi-paragraph, broadly accurate description, with the following first paragraph:

"The Gell-Mann amnesia effect is a term coined by physicist Murray Gell-Mann. It refers to the phenomenon where people, particularly those who are knowledgeable in a specific field, read or encounter inaccurate information in the media, but then forget or dismiss it when it pertains to other topics outside their area of expertise. The term highlights the paradox where readers recognize the flaws in reporting when it’s something they are familiar with, yet trust the same source on topics outside their knowledge, even though similar inaccuracies may be present."

Those who are familiar with the term have likely already spotted the problem: "a term coined by physicist Murray Gell-Mann". The term was coined by author Michael Crichton.[1] To paraphrase H.L. Mencken, for every moderately complex question, there is an LLM answer that is clear, simple, and wrong.

1. https://en.wikipedia.org/wiki/Michael_Crichton#Gell-Mann_amn...



Do we know how human understanding works? It could be just statistical mapping, as you have framed it. You can't say LLMs don't understand when you don't have a measurable definition of understanding.

Also, humans hallucinate/confabulate all the time. LLMs even forget in the same way humans do (strong recall at the start and end of the text, but weaker in the middle).


Hallucinations are a well-known problem, and there are mitigations that work pretty well. Mostly, with enough context and prompt engineering, LLMs can be pretty reliable. And obscure popular-fiction trivia is maybe not that relevant for every use case, which in this case would be robotics, not the finer points of Michael Crichton-related trivia.
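
To make the "enough context and prompt engineering" point concrete, here is a minimal sketch, under my own assumptions (the model name, prompt wording, and helper function are illustrative, not anyone's published recipe), of grounding an answer in retrieved sources so the model can decline instead of confabulating:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def grounded_answer(question: str, sources: list[str]) -> str:
        # Paste retrieved source text into the prompt and restrict the model to it.
        context = "\n\n".join(sources)
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model; any chat model should do
            messages=[
                {"role": "system",
                 "content": "Answer using only the provided sources. "
                            "If the sources do not contain the answer, say you don't know."},
                {"role": "user",
                 "content": f"Sources:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return response.choices[0].message.content

The point is that the model is asked to extract from text you hand it, not to recall the fact from its weights, which is where the confident nonsense tends to come from.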

You were testing its knowledge, not its ability to reason or to classify things it sees. I asked the same question of perplexity.ai. The free version uses less advanced LLMs, but it compensates with prompt engineering and by making the model do a search, which produced this answer:

> The Gell-Mann Amnesia effect is a psychological phenomenon that describes people's tendency to trust media reports on unfamiliar topics despite recognizing inaccuracies in articles about subjects they know well. This effect, coined by novelist Michael Crichton, highlights a cognitive bias in how we consume news and information.

Sounds good to me. It also gave me a nice reference to something called the Portal wiki, another to the same Wikipedia article you cited, and a few more, and it goes on a bit to explain how the effect works. I take your finer point that I shouldn't believe everything I read; luckily, my supervisor worked hard to train that out of me when I was doing a Ph.D. back in the day. But fair point, and well made.

Anyway, this is a good example of how to mitigate hallucination for this specific question (and similar ones). It's the kind of use case perplexity.ai was made to solve, and I use it a lot. In my experience it does a great job of finding the right references and extracting information from them, and it can even address some fairly detailed questions. But especially on the freemium plan, you will run into limitations when it has to reason with what it extracts (you can pay them to use better models). It also helps to click on the links it provides to double-check.

For things that involve reasoning (like coding), I use different tools. Different topic, so I won't bore you with that.

But what figure.ai is doing falls well within the scope of several things OpenAI does very well and that you can use via their API. It's not going to be perfect for everything, but there is probably a lot that it nails without too much effort. I've done some things with their APIs that worked fairly well, at least.
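
For what it's worth, here is a rough sketch of the kind of call I mean: send a camera frame to a vision-capable chat model and ask it to identify the objects and how they fit together. The model name, prompt, and file path are my own assumptions, and this is obviously not figure.ai's actual pipeline:

    import base64
    from openai import OpenAI

    client = OpenAI()

    def describe_scene(image_path: str) -> str:
        # Encode a camera frame and ask a vision-capable model what it sees.
        with open(image_path, "rb") as f:
            image_b64 = base64.b64encode(f.read()).decode()
        response = client.chat.completions.create(
            model="gpt-4o",  # assumed vision-capable model
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "List the objects in this image and describe how they fit "
                             "together, e.g. a cup placed under a coffee machine's spout."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
                ],
            }],
        )
        return response.choices[0].message.content

In practice you would probably ask for structured output a planner can consume rather than free text, but this is the shape of it.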



