oo0shiny's comments

> My former colleague Rebecca Parsons, has been saying for a long time that hallucinations aren’t a bug of LLMs, they are a feature. Indeed they are the feature. All an LLM does is produce hallucinations, it’s just that we find some of them useful.

What a great way of framing it. I've been trying to explain this to people, but this is a succinct version of what I was stumbling to convey.


I have been explaining this to friends and family by comparing LLMs to actors. They deliver a performance in-character, and are only factual if it happens to make the performance better.

https://jstrieb.github.io/posts/llm-thespians/


The analogy goes down the drain when a criterion for good performance is being objectively right, as with Reinforcement Learning from Verifiable Rewards (RLVR).


RLVR can also encourage hallucinations quite easily. Think of the SAT: a random guess is right 20% of the time, while answering "I don't know" is right 0% of the time. If you only reward test score, you encourage guesswork. So good RL reward design is as important as ever.
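To make the incentive concrete, here's a back-of-the-envelope sketch in TypeScript (the 5-choice setup and reward values are illustrative assumptions, not any particular lab's reward scheme):

    // Expected reward for guessing vs. abstaining on a 5-choice question,
    // under two hypothetical reward schemes.
    const choices = 5;
    const pCorrect = 1 / choices; // 0.2 chance a blind guess is right

    // Scheme A: accuracy-only reward (+1 correct, 0 otherwise).
    const guessA = pCorrect * 1;  // 0.2
    const abstainA = 0;           // "I don't know" never scores
    console.log("accuracy only:", { guessA, abstainA }); // guessing dominates

    // Scheme B: penalize wrong answers (+1 correct, -0.5 wrong, 0 abstain).
    const guessB = pCorrect * 1 + (1 - pCorrect) * -0.5; // -0.2
    const abstainB = 0;
    console.log("with penalty:", { guessB, abstainB });  // abstaining dominates

Under the accuracy-only scheme the model is strictly better off guessing; adding a penalty for wrong answers flips the incentive, which is exactly the kind of reward-design choice being described.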

That being said, there are methods to train LLMs against hallucinations, and they do improve hallucination-avoidance. But anti-hallucination capabilities are fragile and do not fully generalize. There's no (known) way to train full awareness of its own capabilities into an LLM.


I think what you say is true, and I think that this is exactly true for humans as well. There is no known way to completely eliminate unintentional bullshit coming from a human’s mouth. We have many techniques for reducing it, including critical thinking, but we are all susceptible to it and I imagine we do it many times a day without too much concern.

We need to make these models much, much better, but it's going to be quite difficult to reduce their BS to even human levels. And the BS will always be there with us. I suppose BS is the natural side effect of any complex system, artificial or biological, that tries to navigate the problem space of reality and speak on it. These systems, sometimes called "minds", are going to produce things that sound right but just are not true.


It's a feeling I can't escape: that by trying to build thinking machines, we glimpse more and more of how the human mind works, and why it works the way it does - imperfections and all.

"Critical thinking" and "scientific method" feel quite similar to the "let's think step by step" prompt for the early LLMs. More elaborate directions, compensating for the more subtle flaws of a more capable mind.


Nobody that I'd be using this analogy with is currently using LLMs for tasks that are covered by RLVR. They're asking models for factual information about the real world (Google replacement), or to generate text (write a cover letter), not the type of output that is verifiable within a formal system, which is by definition the type of output RLVR is intended to improve. The actor analogy is still helpful for providing intuition to non-technical people who don't know how to think about LLMs, but do use them.

Also, unless I am mistaken, RLVR changes the training to make LLMs less likely to hallucinate, but in no way does it make hallucination impossible. Under the hood, the models still work the same way (after training), and the analogy still applies, no?


> Under the hood, the models still work the same way (after training), and the analogy still applies, no?

Under the hood we have billions of parameters that defy any simple analogies.

Operations of a network are shaped by human data. But the structure of the network is not like the human brain. So we have something that is human-like in some ways, but that deviates from humans in ways unlikely to resemble anything we can observe in humans (and use as a basis for analogy).


But being "objectively right" is not the goal of an actor.

Which is why it's a good metaphor for the behavior of LLMs.


A better analogy is an overconfident 5-year-old kid, who never says they don't know the answer and always has an "answer" for everything.


I'll steal that.


This is also related to the philosophical definition of bullshit[1]: speech intended to persuade or influence, without regard for whether it is true or false.

[1] https://en.wikipedia.org/wiki/On_Bullshit


"All models are wrong, but some are useful" - an adage from 1976, or 1933, or earlier.


Right, all models are inherently wrong. It's up to the user to know about their limits and uncertainty.

But I think this 'being wrong' is kind of confusing when talking about LLMs (in contrast to systems/scientific modelling). In what they model (language), current LLMs are really good and accurate, except for, say, the occasional Chinese character in the middle of a sentence.

But what we mean by LLMs 'being wrong' most of the time is being factually wrong in answering a question, that is expressed as language. That's a layer on top of what the model is designed to model.

EDITS:

So saying 'the model is wrong' when it's factually wrong above the language level isn't fair.

I guess this is essentially the same thought as 'all they do is hallucinate'.


Generally attributed to George Box


Intelligence, in a way, is the ability to filter out useless information, be it thoughts or sensory information.


Yes, I can't remember who said it, but LLMs always hallucinate; it's just that they're right ninety-something percent of the time.


If I was to drop acid and hallucinate an alien invasion, and then suddenly a xenomorph runs loose around the city while I’m tripping balls, does being right in that one instance mean the rest of my reality is also a hallucination?

Because the point being made, multiple times it seems, is that a perceptual error isn't the key component of hallucinating; the whole thing is instead just a convincing illusion that could theoretically apply to all perception, not just the psychoactively augmented kind.


Which totally depends on your domain and subdomain.

E.g. Programming in JS or Python: good enough

Programming in Rust: I can scrap over 50% of the code because it will

a) not compile at all (I can see this while the "AI" is typing)

b) not meet the requirements at all


This is the general idea of the Tangle newsletter [1]. They pick a topic from the news and provide "What the Right is saying" and "What the Left is saying" about the topic.

[1] https://www.readtangle.com/


Right and Left are a false dichotomy of strawmen. But such is US politics.


And somehow the "left" opinion is always actually a DNC centrist, while the "right" opinion is a member of the John Birch Society.


A group of coworker friends and I used to find the most creative ways to mess with each other's computers if they were left unattended. Some of the ones I can remember off the top of my head:

1. A Chrome extension that slowly rotated every <div> on the page (think 1 degree every 10 minutes). Would be hard to notice unless a page had been left open for a while (rough sketch below).
2. Another Chrome extension that would redirect to a full-screen gif of John Cena that would only load after a random number of page loads.
3. A horn sound that would play after every successful git commit, regardless of computer volume.
4. A Slack bot that would scrape another coworker's insufferable Reddit comments, store them in a database, and then use ML to interject the comments into our conversations based on some basic NLP. This one was my favorite.
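For the curious, a hypothetical TypeScript sketch of what the first prank might look like as a content script (the rate and structure here are my assumptions, not the original extension):

    // Hypothetical content script: slowly rotate every <div> on the page,
    // 1 degree every 10 minutes, so it only shows on long-lived tabs.
    const DEGREES_PER_TICK = 1;
    const TICK_MS = 10 * 60 * 1000;

    let angle = 0;
    setInterval(() => {
      angle += DEGREES_PER_TICK;
      document.querySelectorAll<HTMLDivElement>("div").forEach((div) => {
        div.style.transform = `rotate(${angle}deg)`;
      });
    }, TICK_MS);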


Just listened to the Search Engine podcast episode [1] where the author talked about this story. It's wild. The author (Joseph Cox) is also a founder of 404 Media[2], which is a great tech blog.

[1] https://www.searchengine.show/listen/search-engine-1/what-s-... [2] https://www.404media.co/


Oh this is fascinating. I'll have to keep an eye on the Tom Brady data since I was already doing something similar by hand for tracking all of his TDs [1]. I've wanted more detailed data for the data visualization piece of it [2] but it's a lot of work.

[1] https://tombradytds.com [2] https://tombradytds.com/viz.php


42


How is the government stealing your income or destroying the value of your property? Seems a bit hyperbolic.

And if the free market solves this, why are we in this situation in the first place? Shouldn't the free market have solved this already? Instead we have piles of empty houses/buildings and more homeless than ever before.


Because there is no free market in housing whatsoever.

Owning land doesn't give you the right to build anything. You need planning permission - which means permission from the local council, local homeowners, consultation, etc. - which is what gives the NIMBY attitude so much power.

There aren't piles of empty houses. There aren't enough houses at all.


The free market doesn't work when there is extreme supply inelasticity, as is the case with land in desirable areas.


> And if the free market solves this, why are we in this situation in the first place? Shouldn't the free market have solved this already? Instead we have piles of empty houses/buildings and more homeless than ever before.

There is no 'situation'. Rational participants in the free market mostly have housing. The issue is that there is a widely available drug (fentanyl and meth too) that makes people behave irrationally, and thus the free market principles stop applying, since they presume a basic level of participant rationality. The fix from a government perspective is to remove the agency of those who are so drug addled that they cannot make good decisions.


"In 2018, Yiannopoulos told at least two news organisations who had requested comments that he wanted vigilantes to shoot journalists. He wrote in a text message 'I can't wait for vigilante squads to start gunning journalists down on sight'."


This is how I do it too. I build websites for other people for a living, so if I have a side project I just want to build something I like. And if I don't worry about making money with it, I don't feel the pressure to build for others and can do it for the joy of creating something. Which is why I got into this profession in the first place.


I just got finished listening to the most recent episode of Darknet Diaries this morning on the way into the office! It was about similar companies to the NSO group: https://darknetdiaries.com/episode/137/


I listened to the first half this morning. Was thinking about going back and watching the NSO group episode he mentioned again. Then I get to work, and the first thing I see is this link.

