Edit: reading the paper, I'm no longer sure about my statement below. The algorithm they introduce claims to do this: "We now show how this property can be used in practice to reconstruct the exact input prompt given hidden states at some layer" [emphasis mine]. It's not clear to me from the paper whether this layer can also be the final output layer, or whether it must be a hidden layer.
They claim that they can reverse the LLM (get the prompt from the LLM response) by only knowing the output layer values; the intermediate layers remain hidden. So their claim is that, indeed, you shouldn't be able to do that (note that this claim applies to the numerical model outputs, not necessarily to the output a chat interface would show you, which goes through some randomization).
> If I am understanding this paper correctly, they are claiming that the model weights can be inverted in order to produce the original input text.
No, that is not the claim at all. They are instead claiming that given an LLM output that is a summary of chapter 18 of Mary Shelley's Frankenstein, you can tell that the input prompt that led to this output was "give me a summary of chapter 18 of Mary Shelley's Frankenstein". Of course, this relies on the exact wording: for this to hold, asking "give me a summary of chapter 18 of Frankenstein by Mary Shelley" would necessarily have to produce a (slightly) different result.
Importantly, this needs to be understood as a claim about an LLM run with temperature = 0. Obviously, if the infra introduces randomness, this result no longer perfectly holds (but there may still be a way to recover it by running a more complex statistical analysis of the results, of course).
Edit: after reading the paper, their claim may be something more complex. I'm not sure whether their result applies to the final output, or whether it's restricted to knowing the internal state at some pre-output layer.
> after reading the paper, their claim may be something more complex. I'm not sure whether their result applies to the final output, or whether it's restricted to knowing the internal state at some pre-output layer.
It's the internal state; that's what they mean by "hidden activations".
If the claim were just about the output it'd be easy to falsify. For example, the prompts "What color is the sky? Answer in one word." and "What color is the "B" in "ROYGBIV"? Answer in one word." should both result in the same output ("Blue") from any reasonable LLM.
Even that is not necessarily true. The output of the LLM is not "Blue". It is something like "the probability of 'Blue' is 0.98131". And it may well be 0.98132 for the other question. In any case, they only talk about the internal state at one layer of the LLM; they don't need the values from the entire model.
The point I'm trying to make is this: the LLM output is a set of activations. Those are not "hidden" in any way: that is the plain result of running the LLM. Displaying the word "Blue" based on the LLM output is a separate step, one that the inference server performs, completely outside the scope of the LLM.
However, what's unclear to me from the paper is whether it's enough to get these activations from the final output layer, or whether you actually need internal activations from a hidden layer deeper in the LLM, one that does require analyzing the internal state of the model.
The LLM proper will never answer "yes" or "no". It will answer something like "Yes - 99.75%; No - 0.0007%; Blue - 0.0000007%; This - 0.000031%", etc., for all possible tokens. It is this complete response that is apparently unique.
With regular LLM interactions, the inference server then takes this output and actually picks one of these responses using the probabilities. Obviously, that is a lossy and non-injective process.
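To make that concrete, here's a minimal sketch with a toy vocabulary and made-up logits (nothing here is from the paper; it only illustrates that the model's raw output is a full distribution over the vocabulary, and that picking a display word is a separate, many-to-one step):

```python
import math

# Toy vocabulary and made-up final-layer logits for two different prompts.
vocab = ["Blue", "Yes", "No", "This"]
logits_prompt_a = [9.10, 1.2, 0.3, -2.0]   # "What color is the sky? ..."
logits_prompt_b = [9.11, 1.1, 0.4, -2.1]   # "What color is the 'B' in ROYGBIV? ..."

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

dist_a = softmax(logits_prompt_a)   # Blue gets ~0.999...
dist_b = softmax(logits_prompt_b)   # Blue gets a slightly different ~0.999...

# The model's "answer" is the whole distribution, and it differs between prompts.
print(dist_a != dist_b)                    # True

# Displaying a single word is a separate, lossy step done by the serving layer.
print(vocab[dist_a.index(max(dist_a))])    # "Blue"
print(vocab[dist_b.index(max(dist_b))])    # "Blue" (same word, different distributions)
```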
If the authors are correct (I'm not equipped to judge) then there must be additional output which is thrown away before the user is presented with their yes/no, which can be used to recover the prompt.
It would be pretty cool if this were true. One could annotate results with this metadata as a way of citing sources.
Why do people not believe that LLMs are invertible when we had GPT-2 acting as a lossless text compressor for a demo? That's based on exploiting the invertibility of a model...
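For context, those demos rely on both sides running the same deterministic predictor: each token can be encoded as its rank under the model's prediction and decoded back exactly. Here's a toy sketch of that round trip with a character-level bigram "model" standing in for GPT-2 (the real demos use arithmetic coding over the model's probabilities; everything below is illustrative, not from any paper):

```python
from collections import Counter, defaultdict

def build_model(corpus):
    # For each character, produce a deterministic ranking of possible next characters,
    # most frequent first, then the rest of the alphabet in a fixed order.
    follows = defaultdict(Counter)
    for a, b in zip(corpus, corpus[1:]):
        follows[a][b] += 1
    alphabet = sorted(set(corpus))
    ranking = {}
    for a in alphabet:
        seen = [c for c, _ in follows[a].most_common()]
        rest = [c for c in alphabet if c not in seen]
        ranking[a] = seen + rest
    return ranking

def encode(text, ranking):
    # Keep the first character literally; replace every other character by its rank.
    return text[0], [ranking[a].index(b) for a, b in zip(text, text[1:])]

def decode(first, ranks, ranking):
    out = [first]
    for r in ranks:
        out.append(ranking[out[-1]][r])
    return "".join(out)

corpus = "the theory of the thing"
ranking = build_model(corpus)
first, ranks = encode(corpus, ranking)
assert decode(first, ranks, ranking) == corpus   # lossless round trip
print(ranks)  # mostly 0s and 1s: a good predictor makes the ranks highly compressible
```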
I was under the impression that without also forcing the exact seed (which is randomly chosen and usually obfuscated), even providing the same exact prompt is unlikely to provide the same exact summary. In other words, under normal circumstances you can't even prove that a prompt and response are linked.
I'm under the impression that the seed only affects anything if temperature > 0. Or, more specifically, that the LLM, given a sequence of input tokens, deterministically outputs the probability of each possible next token, and the only source of randomness is in the procedure for selecting which of those next tokens to use. And that temperature = 0 means the procedure is "select the most likely one", with no randomness at all.
The seed and the actual randomness is a property of the inferencing infrastructure, not the LLM. The LLM outputs probabilities, essentially.
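A minimal sketch of that split, with a made-up distribution (purely illustrative, not any particular inference stack): at temperature 0 the pick is an argmax and the seed is irrelevant; the seed only enters once the serving layer samples at temperature > 0.

```python
import random

# Made-up next-token probabilities coming "out of the model".
probs = {"Blue": 0.97, "blue": 0.02, "Azure": 0.01}

def pick_token(probs, temperature, seed=None):
    # temperature == 0: greedy, fully deterministic, seed is irrelevant.
    if temperature == 0:
        return max(probs, key=probs.get)
    # temperature > 0: reshape the distribution and sample; this is where the
    # seed (a property of the serving infrastructure, not the model) matters.
    rng = random.Random(seed)
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return rng.choices(list(probs), weights=weights, k=1)[0]

print(pick_token(probs, temperature=0, seed=1))    # Blue
print(pick_token(probs, temperature=0, seed=999))  # Blue (seed has no effect)
print(pick_token(probs, temperature=1.5, seed=1))  # deterministic for a given seed,
                                                   # but changing the seed can change the pick
```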
The paper is not claiming that you can take a dump of ChatGPT responses over the network and figure out what prompts were given. It's much more about a property of the LLM internally.
I think their claims are limited to the "theoretical" LLM, not to the way we typically use one.
The LLM itself has a fixed-size input and a fixed-size, deterministic output. The input is the initial value of each neuron in the input layer; the LLM output is the vector of final outputs of each neuron in the output layer. For most normal interactions, these vectors are almost entirely 0s.
Of course, when we say LLM, we typically mean infrastructure that abstracts these things for us. In particular, we typically use infra that treats the LLM outputs as probabilities, and thus typically produces different results even for the exact same input - but that's just a choice in how to interpret these values; the values themselves are identical. Similarly, on the input side, the max input is typically called a "context window". You can feed more input into the LLM infra than the context window, but that's not actual input to the model itself - the infra will simply pick a part of your input and feed that part into the model weights.
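A rough sketch of that input-side abstraction (illustrative only; the context size and the keep-the-most-recent policy are assumptions, and real serving stacks have more elaborate truncation and chunking strategies):

```python
def prepare_model_input(token_ids, context_window=4096):
    # The model itself only ever sees at most `context_window` tokens.
    # If the caller supplies more, the serving layer has to drop something;
    # keeping the most recent tokens is one common, simple policy.
    return token_ids[-context_window:]

tokens = list(range(10_000))             # pretend this is a very long prompt
model_input = prepare_model_input(tokens)
print(len(model_input))                  # 4096 - only this much reaches the weights
```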
> You are just saying "well ackshually". I dare you to build a cabinet using the Hamiltonian. I double-dog-dare you.
The Hamiltonian (and Lagrangian) formulations are much more amenable to actual physical calculations, at least on a computer, than the Newtonian formulation of classical mechanics - but otherwise they are perfectly equivalent mathematically. I'm not sure where you'd need any kind of dynamical laws in the building of a cabinet, on the other hand. Are you trying to arrange for a system of inclined planes and pulleys to slot the pieces into place?
> Perhaps more accurately, our senses mostly believe pre-Newtonian approximations, which is why it took until Newton to realize how inaccurate they were.
This is a bit of a misnomer. Our senses and intuitions are in fact remarkably accurate for a certain range of values, and quite equivalent to what Newton's laws of motion say about them. To some extent, Newton "only" found a simple formalism to represent our existing intuitions. Our intuitions of course break down in other regimes, such as at very high speeds or at very high altitudes, where relativistic corrections start to become significant.
QM however is a paradigm shift in how the world is described, and it is completely non-intuitive, even in regimes where its predictions are fully aligned with our intuitions and senses. You can use QM to compute the collision of two ideal balls on an ideal plane, and the results will exactly match your intuitions. But the computation and even the representation of the system will not, in any way.
Most of these things are misunderstandings of quantum mechanics, as we know it today.
The main thing that is at the root of all of them is the word "things". In QM, the ground truth of the world is the wavefunction of the system. The wavefunction assigns some amplitude (potentially 0) to every possible state of the system that it describes. It then evolves purely deterministically from past to future, according to Schrödinger's equation (or Dirac's equation, if you want to discuss speeds close to that of light). The only kink is interaction with a measurement device (what constitutes a measurement device is one of the big mysteries that we don't yet have an answer for). After a measurement, the wavefunction collapses non-deterministically to one of the states that the measurement device was set up to detect, with a probability proportional to the squared magnitude of that state's amplitude.
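For reference, these are the two rules being contrasted, in their standard textbook form (nothing interpretation-specific): deterministic unitary evolution between measurements, and the probabilistic Born rule at measurement.

```latex
% Unitary evolution: the state evolves deterministically between measurements
i\hbar \,\frac{\partial}{\partial t}\,\lvert\psi(t)\rangle = \hat{H}\,\lvert\psi(t)\rangle

% Measurement: the probability of obtaining outcome a (with eigenstate |a>)
% is the squared magnitude of that state's amplitude (the Born rule)
P(a) = \bigl\lvert \langle a \vert \psi \rangle \bigr\rvert^{2}
```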
Now, this is the "ground truth" of QM. Everything else, such as particles and space-time and so on, is just stories we tell to make sense of the wavefunction and its behavior. Sometimes those descriptions break down, and they start invoking weird, fanciful ideas such as retrocausality etc. - but that just proves the stories are wrong, that they are misinterpreting the math of the wavefunction.
I'd also note that the main "time is weird" factoid you encounter related to QM experiments, the delayed-choice quantum eraser, is mostly a misunderstanding / sensationalization of the actual physics and the experiment. It again only proves that certain interpretations of what the wavefunction and its collapse represent are not temporally consistent, but the direct conclusion from this should be that the interpretations are wrong, not that "cause and effect goes for a toss, as behavior of time is different".
If we can't even define a "thing" - identifying its inside and outside, what it is and what it is not, where it is and where it is not - that is itself a big contrast with Newtonian (human-scale) mechanics. Everything one can say about quantum mechanics is completely alien to the human-perceived world. That should justify the distinction by scale.
Sold in, not sold to. The GP meant: if you consider it legitimate to sell your product in Myanmar, you should obey the laws of Myanmar. If you consider the government illegitimate, don't do business there.
Starlink has the precise terminal location and gets paid for the subscription for that terminal. They know where it is and who pays for it. From the article they say that they were selling a service there and stopped in order to comply with local laws:
> “SpaceX proactively identified and disabled over 2,500 Starlink Kits in the vicinity of suspected ‘scam centers.’”
I think the point (which you seem to have missed) is: how do you distinguish a terminal under the control of a scam center from one used by, say, a journalist who has traveled to the vicinity of the call center to interview people and make a report? (The Economist recently had an excellent series of articles about these call centers.)
Neither terminal was bought in Myanmar. Both have been transported to and used in the vicinity of the scam center. The difference is purely the intent of the person controlling the terminal. But you can't infer that intent from only the location where it was purchased and the precise location where it is being used.
> > “SpaceX proactively identified and disabled over 2,500 Starlink Kits in the vicinity of suspected ‘scam centers.’”
Sure, because it's currently in the news and it's an easy way to say "we fixed the problem". Maybe some Economist journalist just lost internet access. Oh well. Guess they'll have to find their way out of Myanmar without internet. Sucks to be them, right?
> How do you distinguish a terminal under the control of a scam center from one used by, say, a journalist who has traveled to the vicinity of the call center to interview people and make a report?
You are told by the local law enforcement and legal system? Starlink's obligation is only to assist local authorities as per their law. Maybe the local authorities are corrupt but that doesn't give Starlink a free pass from obeying their law.
> Neither terminal was bought in Myanmar.
Does it matter? Starlink does business there, in Myanmar. They offer an internet service. They were asked by the authorities to disable some terminals, and because they want to keep offering the service to other paying customers, they complied. There's no legal grey area here, not even a moral conundrum for Musk. He follows the law of the land, gets to still do business and make more money.
Point being, as long as Starlink wants to keep offering a service and make money in Myanmar the company has to obey local laws. The statement below [0] that started the thread was a kneejerk reaction, keyboard warrior style. Musk "didn't give the time of day" to Brazilian authorities and he was squeezed into compliance. Why fight when there's an easy way to keep making money?
> But the US (who has jurisdiction over Starlink) isn't bound by Myanmar laws, and (IMHO) shouldn't give the time of day to the requests of a junta
What if a "legitimate" government is committing genocide, as Myanmar's is? Should international companies respect its sovereign laws?
This thread baffles me, that people are somehow capable of ignoring the elephant in the room of the massacring of civilians, to tunnel-vision instead on some trivial and insignificant technicalities about satellite law.
> What if a "legitimate" government is committing genocide, as Myanmar's is? Should international companies respect its sovereign laws?
Yes. The answer is not to act lawlessly, but instead to not be in that country at all or be there and apply pressure for change. But breaking the laws in ad hoc ways is not the way.
Several international companies have divested or exited due to political risk, sanctions, or human rights concerns.
> people are somehow capable of ignoring the elephant in the room of the massacring of civilians
To consider, the following countries, amongst others, retain embassies in Myanmar: Australia, Brazil, China, Egypt, France, Germany, India, Israel, Japan, Nepal, Singapore, UK, USA.
> The answer is not to act lawlessly, but instead to not be in that country at all or be there and apply pressure for change.
Oh, that is a novel idea: for the people being genocided to not be there, and for those who are against genocide to let themselves be killed as the first step.
> But breaking the laws in ad hoc ways is not the way.
Breaking the laws is frequently necessary in the genocide situation, because the laws were designed to create and facilitate the genocide. Genocides do not just happen out of nothing.
>> The answer is not to act lawlessly, but instead to not be in that country at all or be there and apply pressure for change.
> Oh, that is a novel idea: for the people being genocided to not be there, and for those who are against genocide to let themselves be killed as the first step.
>> But breaking the laws in ad hoc ways is not the way.
>Breaking the laws is frequently necessary in the genocide situation, because the laws were designed to create and facilitate the genocide. Genocides do not just happen out of nothing.
My response was to this question: "Should international companies respect its sovereign laws?"
Nothing about the people of Myanmar.
My answer is different if you're a Myanmar person. But you still face the moral question of which laws you should disregard vs. which to follow.
Agreed. I think I have an explanation (a partial one, at best). The tech world is so adept at abstraction that we have made it one of our primary tools in the box. Everything gets abstracted away until we have a nice, clean, uniform representation of the underlying item. Whether that item is people, vehicles, road accident data or private communications doesn't really matter any more once it is abstracted. Then it's just another record.
Ethics and other moral angles no longer apply; after all, how could those apply to bits? That's for 'real' engineers. It's also at the core of the HN "no politics, please" tenet.
I see a similar deficiency in the legal profession, they too tend to just focus on the words and the letters and don't actually care all that much about the people.
> What if a "legitimate" government is committing genocide
That's an interesting question, I'll say. I can't say yes or no but I can say that the answer should be consistent. You either support genocidal regimes, or you don't.
So you have Starlink operating in Israel and in Myanmar.
> that people are somehow capable of ignoring the elephant in the room of the massacring of civilians, to tunnel-vision instead on some trivial and insignificant technicalities about satellite law.
Imagine the bafflement when some people stick to their tunnel vision while writing about other people's tunnel vision on the same exact topic.
Which would be very relevant if anyone were trying to sue them for this - which no one is.
The license establishes the limits of legal requirements and responsibilities. It doesn't shield you from criticisms and people being annoyed with you.
You realize that if relations really broke down to that level, then most likely NATO would simply exclude the USA, rather than all of the other NATO members leaving it, right? Also, if it gets to the level where the EU / Europe feel they no longer want a military alliance with the USA, then it's likely that they wouldn't want to counter just Russian and Chinese influence, but US influence as well - there's nothing inherently better about US influence than about the other two (Chinese influence has so far been the least bad of the three outside of China itself, by far - though I have no delusion that this will continue as China's power grows).
The biggest problem with Starlink's proposed solution would be that it would have been B2C - people in Greenland would talk to other people in Greenland through Starlink's satellites. That would put communication inside Greenland at the whims of another foreign power, which is a whole different level of loss of sovereignty than getting communication with the rest of the world cut off.