I always wonder who's watching the watchers with encryption. I know there are standard, recommended libraries available. But how are those vetted if mere mortals like myself don't even know what known vulnerabilities to look for? Is there like a high council of crypto wizards that keep checks and balances on each other?
Cryptography engineers and vulnerability researchers who specialize in cryptography look for vulnerabilities in the popular libraries. Cryptographic vulnerabilities are high-status (if you find a pattern of them, you can even get published in the Big 4 academic venues). So there's a lot of incentive to find stuff, at least in popular software.
It's a much bigger problem for bespoke crypto people roll themselves, because the expertise required to find crypto vulnerabilities is uncommon, and there's little incentive to find a vulnerability in something nobody uses.
> Is there like a high council of crypto wizards that keep checks and balances on each other?
Not exactly, but certainly there are some crypto wizards that would love to get famous for figuring out how to break the so-called trusted algorithms and libraries.
I think this "high council of crypto wizards" is academia. At the end of the day we are talking about advanced math, and I think there are many experts in that field eager to prove and correct each other.
Whenever I'm roleplaying with my tinfoil hat, I say that the NSA is behind the "never implement your own crypto" advice.
If you're really worried, roll your own encryption, and then also run the encrypted message through standard encryption implementations. After all, if the standard implementations work, it doesn't matter if you mess up your own crypto implementation, because the standard implementation is unbreakable.
> After all, if the standard implementations work, it doesn't matter if you mess up your own crypto implementation, because the standard implementation is unbreakable.
This sounds like obviously good advice, but it actually isn't: if your implementation is vulnerable to timing attacks and is used as the first layer, it can potentially reveal (parts of) your plaintext. Vulnerabilities like this are exactly why "don't roll your own crypto" is good advice.
> if your implementation... is used as the first layer...
Don't you have this backward? The inner-most (first) layer would be the unverified roll-your-own, the outer layer would be standard. Outer layers are the first to be penetrated to get to the plaintext, not the inner layers... I'm intuiting as a non-crypto guy so please correct me if wrong.
The point of a timing attack is that some operations take different amounts of time depending on the plaintext and/or key. Depending on the attack in question and your access to the system, this potentially lets you draw conclusions about what was encrypted by observing how quickly the system responds to various requests.
If the custom algorithm is an outer layer, it only processes data that has already been encrypted by another, presumably strong, algorithm. Even if there's a timing attack, breaking the outer layer can't help you unless you can also break the well-studied inner layer. If the custom algorithm sees the actual plaintext directly, the timing attack can lead you straight to the original message, no matter how strong any of your outer layers are.
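Here's a contrived Python sketch of that ordering argument. Everything in it is a hypothetical stand-in: toy_encrypt plays the flawed home-made cipher whose running time depends on the bytes it touches, and strong_encrypt plays the vetted cipher (neither is real crypto). The only point is that when the flawed layer processes the plaintext first, total time leaks information about the plaintext no matter how good the outer layer is:

    import hashlib
    import os
    import time

    def toy_encrypt(plaintext: bytes) -> bytes:
        # Contrived flaw: the amount of work per byte depends on the byte's
        # value, so encryption time correlates with the plaintext content.
        out = bytearray()
        for b in plaintext:
            acc = 0
            for _ in range(b * 50):  # data-dependent work: the timing leak
                acc += 1
            out.append(b ^ 0x5A)
        return bytes(out)

    def strong_encrypt(data: bytes) -> bytes:
        # Stand-in for a vetted, constant-time cipher. Not real crypto.
        key = os.urandom(32)
        stream = hashlib.sha256(key).digest() * (len(data) // 32 + 1)
        return bytes(a ^ b for a, b in zip(data, stream))

    def timed_pipeline(plaintext: bytes) -> float:
        # The custom layer sees the plaintext first; the strong layer
        # only ever sees the custom layer's output.
        start = time.perf_counter()
        strong_encrypt(toy_encrypt(plaintext))
        return time.perf_counter() - start

    cheap = bytes([0x00]) * 1024   # low-valued bytes: fast to "encrypt"
    costly = bytes([0xFF]) * 1024  # high-valued bytes: slow to "encrypt"
    print(timed_pipeline(cheap), timed_pipeline(costly))  # clearly different

An observer who can only measure response times can tell the two plaintexts apart, and the strong outer layer does nothing to prevent it.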
Indeed, if the home-made RSA algorithm is the outer layer, and someone manages to break it, then we're back to the status quo. And the status quo is unbreakable encryption, right?
On the other hand, if the home-made RSA algorithm is part of the inner layer, how is it different than any of the other poorly designed code that we use in the user facing side of encryption? If an API call goes through 100,000 lines of business logic, and the result gets encrypted and sent as an API response, that's okay, it happens billions of times per day. In terms of security, what does it matter if a few of those 100,000 lines are a home-made RSA implementation?
For example, if the outer algorithm takes a fixed amount of time and the inner algorithm takes longer for certain messages, then by looking at the total time taken you can infer something about the content.
Whenever I'm roleplaying with my tinfoil hat, I say that the NSA is behind comments like yours.
You cannot assume that the security of a composite cryptosystem is max(system1, ..., systemN), i.e. that bolting a secure system to an insecure one is at least as secure as the most secure system. Sometimes it is; sometimes it is not, and the insecure system breaks the whole thing in a way that casual analysis can't spot. If I were the NSA trying to inject memes into the software ecosystem to make my job easier, I would definitely recommend that everybody start with vetted open-source cryptosystems and then bolt their hand-rolled crap to them.
> You cannot assume that the security of a composite cryptosystem is max(system1, ..., systemN), i.e. that bolting a secure system to an insecure one is at least as secure as the most secure system.
But the scenario described - the one you've responded to - isn't an unordered composition; it's a composite system with a specific order. In other words, the perceived weakest system always runs before the perceived strongest one.
And, if you are saying it still doesn't matter, and if payload1->weak_crypto->strong_crypto causes a vulnerability, what keeps the same vulnerability from occurring naturally with some payload2->strong_crypto that doesn't use weak_crypto?
In other words, why would you choose strong_crypto if it has vulnerabilities with certain payloads but not other payloads? If these vulnerabilities are truly known for certain payloads, then why not inject a decision to avoid strong_crypto if the payload is known to cause vulnerabilities?
Apologies for the first-grade level questions and fervor. Maybe this is why crypto is hard. Or maybe things that are this hard to explain and understand should just not be depended on. I've experienced both.
If it's true that modifying my plaintext before putting it through a standard encryption algorithm might affect the security, then it might also be possible that sending, say, an HTML payload is more secure than sending a JSON payload. That would be weird.
Can existing encryption algorithms encrypt any payload or can't they? It would be odd if we're worrying about timing attacks, etc, when we can't even securely encrypt any possible plaintext first.
Yes, a correct encryption algorithm can encrypt (essentially) any bit string. But it's quite easy to turn a correct encryption algorithm into an incorrect one by bolting on something seemingly innocuous.
Here's a concrete example. Let's say you decide you want to make AES encryption more efficient by defining a new standard, "lzAES", that is just: compress the message, then AES-encrypt the compressed result (and decrypt by AES-decrypting and then decompressing).
This "works", in the sense that you can correctly decrypt ciphertexts, and it certainly seems innocuous. But it is now an insecure cipher!
Here's why: the definition of a secure cipher is that ciphertexts resulting from the encryptions of any two messages of equal length are indistinguishable. In other words, for any two messages you can choose, if I encrypt both messages under a randomly generated key, you can't tell which ciphertext corresponds to which message. In contrast, lzAES as defined above does not have this property: you could choose one message that's easily compressible (say, a string of zeros) and one that's not (say, a uniformly random string), and then you'd be able to tell which ciphertext corresponds to which plaintext just by looking at the ciphertext lengths.
And this is not just a definitional issue! If you use lzAES to encrypt something, an attacker can guess your message and test whether compressing the guess gives the same length as your ciphertext. Guess-and-check isn't possible with a secure cipher, but it is with lzAES---in other words, it gives away information about your plaintext!
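Here's a minimal sketch of that guess-and-check, assuming "lzAES" means zlib-compress and then encrypt with a length-preserving mode (AES-CTR has that property; stand_in_aes below is a standard-library stand-in for it, not real AES):

    import hashlib
    import os
    import zlib

    def stand_in_aes(data: bytes, key: bytes) -> bytes:
        # Length-preserving XOR stream as a stand-in for AES-CTR. Not real
        # crypto; it only mimics the one property that matters here: the
        # ciphertext is exactly as long as the input.
        stream = b""
        counter = 0
        while len(stream) < len(data):
            stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return bytes(a ^ b for a, b in zip(data, stream))

    def lz_aes_encrypt(msg: bytes, key: bytes) -> bytes:
        # The flawed construction: compress, then encrypt.
        return stand_in_aes(zlib.compress(msg), key)

    key = os.urandom(32)
    m0 = b"\x00" * 1024    # highly compressible message
    m1 = os.urandom(1024)  # incompressible message, same length
    c0 = lz_aes_encrypt(m0, key)
    c1 = lz_aes_encrypt(m1, key)
    # Equal-length plaintexts, yet the ciphertext lengths give them away:
    print(len(c0), len(c1))

Run it a few times: the key changes every time and the plaintexts are the same length, but the compressible message is obvious from the ciphertext length alone.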
Thank you for taking the time to explain, your explanation is clear and gives me something to think about. But I have a question about:
> But it's quite easy to turn a correct encryption algorithm into an incorrect one by bolting on something seemingly innocuous.
Isn't every poorly designed web app essentially a giant "bolt on" to the encryption algorithm (HTTPS, etc) it is served through?
If there's a get_thread API, which zips the comment thread, includes it in some JSON as base64 along with other metadata (only the thread itself is zipped), and then sends that as the response over HTTPS, is that not secure? Nobody would bat an eye at this scenario, but it's essentially the same as your example because the plaintext is compressed before encrypting and sending. If it's okay to do this for a web app, why is it not okay to do it as part of a home-made RSA implementation?
(Of course, I'm not actually arguing for a second layer of encryption because it is unnecessary. But my understanding is that it wouldn't cause any harm and I'm trying to understand if that's correct or not.)
The example you give is similar to but not quite the same as "lzAES". The distinction is that in your example, the application is deciding whether to compress or not---the input/output behavior of the cipher doesn't include the compression step, so the cipher itself doesn't suffer from the problem I mentioned in my first note.
But it's still possible for an application to use a cipher incorrectly. In particular, an application-level decision about whether to compress some data before encrypting can have an effect on the application's security. In the case you mention it seems unlikely to be a problem (but that's an application-level question, so it could be).
As an example where it seems like the application-level decision to compress or not matters a lot, imagine an application that sends an encrypted password to a server. If the application compresses the password first, an attacker could learn which values are not my password via guess-and-check-length. (Of course, even without compression the attacker can learn something about the length of my password just by looking at the length of the ciphertext---so probably this is a case where the application should first pad the message to some fixed length before encrypting. But in any case it almost certainly shouldn't compress the password!)
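For illustration, here's a sketch of that pad-before-encrypt idea. PAD_TO, the two-byte length prefix, and the helper names are arbitrary choices for the example, not a standard padding scheme:

    # PAD_TO is an arbitrary illustrative upper bound, not a standard.
    PAD_TO = 128

    def pad_fixed(secret: bytes, size: int = PAD_TO) -> bytes:
        # Two-byte length prefix so unpadding is unambiguous, then zero-fill
        # to a constant size so the ciphertext length reveals nothing.
        if len(secret) > size - 2:
            raise ValueError("secret too long for the fixed padding size")
        return len(secret).to_bytes(2, "big") + secret + b"\x00" * (size - 2 - len(secret))

    def unpad_fixed(padded: bytes) -> bytes:
        n = int.from_bytes(padded[:2], "big")
        return padded[2 : 2 + n]

    assert unpad_fixed(pad_fixed(b"hunter2")) == b"hunter2"
    # Different secrets, identical padded lengths, so identical ciphertext
    # lengths under any length-preserving cipher:
    assert len(pad_fixed(b"hunter2")) == len(pad_fixed(b"correct horse battery"))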
You should look into the CRIME and BREACH attacks on TLS/SSL. They're related to using compression before encryption.
The TL;DR is that you generally should not compress secret data before encrypting, especially if part of the request might be reflected in the response.
If you look carefully at your browser's HTTPS traffic, you'll notice that dynamic data often isn't sent using HTTP compression, though static data (basically guaranteed not to contain anything secret) might still use it.
As a rule of thumb, pay attention to crypto parameters and cipher 'suites': use the highest SHA variant, use seven-word diceware passphrases for passwords, ensure the latest TLS version is used, use a reputable and robust RNG, etc.
If you don't know what you're doing, SHA-512/256 (note that's not a choice between two hashes; it's the name of a single SHA-2 family member) is probably the member of the SHA-2 family to choose.
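For what it's worth, you can usually reach it by name from Python's hashlib, assuming the interpreter is linked against an OpenSSL build that exposes the algorithm:

    import hashlib

    # SHA-512/256 is SHA-512 truncated to 256 bits with its own initial
    # values: one algorithm, despite the slash in the name. Whether the
    # name resolves depends on the OpenSSL your Python is built against.
    print(hashlib.new("sha512_256", b"hello").hexdigest())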
My feeling is that back in 2001 it would have been valuable to get people to switch to a non-extendable hash by default, because people were freelancing their own MACs; but sometime in the intervening two decades people switched fully over to HMAC, so if you're dealing with someone who is literally writing their own prefixed-key hash MAC, you've got bigger problems than Merkle-Damgard.
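For anyone wondering what that distinction looks like in practice, here's a short sketch using only Python's standard library. The key and message are placeholders; sha256(key + msg) is the "prefixed key hash MAC" pattern, which is length-extendable on a Merkle-Damgard hash, while HMAC is the standard construction that avoids it:

    import hashlib
    import hmac

    key, msg = b"placeholder-key", b"amount=100"

    # Hand-rolled prefix MAC: length-extendable on SHA-256. Don't do this.
    naive_mac = hashlib.sha256(key + msg).hexdigest()

    # HMAC: the standard fix, straight from the stdlib.
    good_mac = hmac.new(key, msg, hashlib.sha256).hexdigest()

    # Verify with a constant-time comparison rather than ==.
    assert hmac.compare_digest(
        good_mac, hmac.new(key, msg, hashlib.sha256).hexdigest()
    )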