
Engaging with why we might actually want inflation of text:

1) For pedagogical or explanatory purposes. For example, if I were to write:

> ∀x∈R,x^2≥0

I've used 10 characters to say

> For every real number x, its square is greater than or equal to zero

For a mathematician, the first is sufficient. For someone learning, the second might be better (perhaps with an expansion of 'real number', or a note that 'square' means 'multiplying it by itself').

2) To make sure everything is stated and explicit. "He finally did x" implies that something has been anticipated or worked on for a while, but "after a period of anticipation, he did x" makes it clearer. This also raises the question of who was anticipating, which could be made explicit too.

As someone who spends a lot of time converting specifications to code (and explaining technical problems to non-technical people), I find unstated assumptions are very prevalent. And sometimes people have different conceptions of the unstated assumption (e.g. some people might think that nobody was anticipating and it just took longer than you'd expect otherwise).

So longer text might seem like a simple expansion, but it ends up adding detail.

I definitely agree with the author's point; I just want to argue that a text-expander tool isn't quite as useless as 'generate garbage for me'.



Can a generator do things like (2) if the information wasn't in the input text?


Ambiguity is resolved via provided context, but just as with conversations, this context may be severely underspecified.
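To make that concrete, here is a minimal sketch in Python. The build_prompt() helper is my own illustrative stand-in (no real LLM API is called); it just shows that the same terse sentence would be expanded against very different inputs depending on what context is supplied:

    # Minimal sketch: build the prompt an LLM would see; no API is called.
    def build_prompt(sentence, context=""):
        return (
            f"Context: {context}\n"
            f"Rewrite the following sentence, making all implied "
            f"details explicit: {sentence}"
        )

    # Without context, the model can only guess who was anticipating:
    print(build_prompt("He finally did X."))

    # With context, the ambiguity is resolved by the input, not the model:
    print(build_prompt("He finally did X.",
                       context="The team had been pushing for X for months."))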


Yes, because generators generate at the token level, in units typically smaller than an individual word. They can easily generate novel sentences, and transfer learning, for example, lets them apply knowledge gained from other training data to new domains.
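As a quick illustration of sub-word tokens, here is a small sketch using OpenAI's tiktoken library (exact splits vary by encoding, so treat the output as illustrative):

    import tiktoken

    # cl100k_base is one of the encodings used by recent OpenAI models.
    enc = tiktoken.get_encoding("cl100k_base")

    ids = enc.encode("anticipation")
    pieces = [enc.decode([i]) for i in ids]

    # A single English word often maps to more than one token.
    print(ids, pieces)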

The idea that generators are some sort of parrot is very outdated. The 2021 paper that coined the term "stochastic parrot" was already wrong when it was published.


> Yes, because generators generate at the token level, in units typically smaller than an individual word. They can easily generate novel sentences, and transfer learning, for example, lets them apply knowledge gained from other training data to new domains.

Sure. But can they read the original author's mind, and therefore generate the right unique sentence that expresses the actual intent?


Sure, but in the case of "he finally did X" with no context passed in, how does the LLM determine whether to expand it as "this was a highly anticipated change" or as the author being frustrated at how long it took, if the nuanced meaning isn't there in the input context?

Obviously it can generate a longer message, but is it going to go look up what the sentence refers to and infer the extra meaning automatically?


If I need it expanded, I can put it into my LLM myself.



