Author of the linked CL here: we added this mostly so that we could abuse the memory initialization tracking to test the constant-time-ness of crypto code (similar to what BoringSSL does, proposed by agl around fifteen years ago: https://www.imperialviolet.org/2010/04/01/ctgrind.html), which is an annoyingly hard property to test.
We're hoping there will also be a bunch of other interesting side effects of enabling Valgrind support for Go, in particular using it to check how the runtime handles memory (hopefully correctly!)
edit: also strong disclaimer that this support is still somewhat experimental. I am not 100% confident we are properly instrumenting everything, and it's likely there are still some errant warnings that don't fully make sense.
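For anyone curious what the ctgrind trick looks like in practice, here is a rough Go sketch of the idea using Valgrind's client-request macros through cgo. This is only an illustration of the technique, not the mechanism the CL adds, and the comparison target is stand-in code: mark the secret as uninitialized, run the routine, and Memcheck will flag any branch or memory index that depends on it.

    // Illustrative only: assumes the Valgrind development headers are
    // installed and the binary is run under `valgrind --tool=memcheck`.
    package main

    /*
    #include <valgrind/memcheck.h>

    // cgo can't call macros directly, so wrap them in small functions.
    static void mark_undefined(void *p, unsigned long n) {
        VALGRIND_MAKE_MEM_UNDEFINED(p, n);
    }
    static void mark_defined(void *p, unsigned long n) {
        VALGRIND_MAKE_MEM_DEFINED(p, n);
    }
    */
    import "C"

    import (
        "crypto/subtle"
        "unsafe"
    )

    func main() {
        secret := make([]byte, 32)
        guess := make([]byte, 32)

        // Tell Memcheck the secret is "uninitialized": from here on, any
        // conditional jump or memory index derived from it gets reported.
        C.mark_undefined(unsafe.Pointer(&secret[0]), C.ulong(len(secret)))

        // A constant-time comparison produces no report...
        _ = subtle.ConstantTimeCompare(secret, guess)

        // ...whereas an early-exit comparison such as bytes.Equal would trigger
        // a "conditional jump depends on uninitialised value(s)" warning.

        // Mark it defined again before any ordinary, non-secret-dependent use.
        C.mark_defined(unsafe.Pointer(&secret[0]), C.ulong(len(secret)))
    }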
This is super cool. Hopefully it will flush out other issues in Go too.
But I wonder why it's not trivial to throw a bunch of different inputs at your cipher functions and measure that the execution times are all within an epsilon tolerance? I mean, if you want to show your crypto functions are constant time, why not just directly measure the time under lots of inputs (maybe with background garbage collection and OS noise thrown in) and see how constant they are?
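For what it's worth, the naive version of this check is easy to write; the pushback in the replies below is about whether the numbers it produces mean anything. A minimal sketch, where `decrypt` is just a stand-in for the routine under test:

    // Sketch of the naive timing check; no special tooling required.
    package main

    import (
        "crypto/rand"
        "fmt"
        "time"
    )

    // decrypt is a stand-in for the real routine under test.
    func decrypt(b []byte) []byte { return b }

    func main() {
        const trials = 1000
        fastest, slowest := time.Duration(1<<62), time.Duration(0)
        for i := 0; i < trials; i++ {
            input := make([]byte, 64)
            if _, err := rand.Read(input); err != nil {
                panic(err)
            }

            start := time.Now()
            decrypt(input)
            elapsed := time.Since(start)

            if elapsed < fastest {
                fastest = elapsed
            }
            if elapsed > slowest {
                slowest = elapsed
            }
        }
        // A wide spread is suggestive, but a narrow one proves very little.
        fmt.Printf("fastest %v, slowest %v, spread %v\n", fastest, slowest, slowest-fastest)
    }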
Also, some CPUs have a counter for conditional branches (which the rr debugger leverages); you could sample it before and after and make sure the number of conditional branches doesn't change between decrypts -- as that AGL post mentions, keeping the branching the same is important for constant time.
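On Linux you can get at such a counter from Go with perf_event_open via golang.org/x/sys/unix. Rough sketch below; note the generic hardware event counts all retired branch instructions rather than the precise "retired conditional branches" event rr relies on, the Go runtime and GC contribute branches of their own, and `decrypt` is again a stand-in:

    // Linux-only sketch using golang.org/x/sys/unix; counts are
    // indicative rather than exact.
    package main

    import (
        "encoding/binary"
        "fmt"
        "log"
        "unsafe"

        "golang.org/x/sys/unix"
    )

    // decrypt is a stand-in for the real routine under test.
    func decrypt(b []byte) []byte { return b }

    func openBranchCounter() int {
        attr := unix.PerfEventAttr{
            Type:   unix.PERF_TYPE_HARDWARE,
            Config: unix.PERF_COUNT_HW_BRANCH_INSTRUCTIONS,
            Size:   uint32(unsafe.Sizeof(unix.PerfEventAttr{})),
            Bits:   unix.PerfBitDisabled | unix.PerfBitExcludeKernel | unix.PerfBitExcludeHv,
        }
        // pid 0 = this process, cpu -1 = any CPU.
        fd, err := unix.PerfEventOpen(&attr, 0, -1, -1, unix.PERF_FLAG_FD_CLOEXEC)
        if err != nil {
            log.Fatalf("perf_event_open: %v", err)
        }
        return fd
    }

    func readCounter(fd int) uint64 {
        var buf [8]byte
        if _, err := unix.Read(fd, buf[:]); err != nil {
            log.Fatalf("read counter: %v", err)
        }
        return binary.LittleEndian.Uint64(buf[:]) // assumes a little-endian host
    }

    func main() {
        fd := openBranchCounter()
        defer unix.Close(fd)

        inputs := [][]byte{make([]byte, 32), {0xff, 0xff, 0xff, 0xff}}
        for _, input := range inputs {
            unix.IoctlSetInt(fd, unix.PERF_EVENT_IOC_RESET, 0)
            unix.IoctlSetInt(fd, unix.PERF_EVENT_IOC_ENABLE, 0)

            decrypt(input)

            unix.IoctlSetInt(fd, unix.PERF_EVENT_IOC_DISABLE, 0)
            fmt.Printf("branches for %d-byte input: %d\n", len(input), readCounter(fd))
        }
    }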
Finally, it would also seem straightforward to track the first 10 decrypts, take their maximum time plus a small tolerance of a few extra nanoseconds, and pad every subsequent decrypt with noops up to that bound to force constant time when it varies. You could also add an assert that crashes the program if anything exceeds that established upper bound, since that would violate the constant-time property. I suppose the real difficulty is if the OS deschedules your execution and throws off your timing check...
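Here is roughly what that padding idea looks like, for concreteness. The `budget` value and `decrypt` function are made up for the sketch, and, as noted, scheduler preemption plus the GC make the learned ceiling fragile; wall-clock padding hides the timing variation behind a noisy clock rather than removing it.

    // Illustrative only: pad each call out to a fixed wall-clock budget
    // and crash if the budget is ever exceeded.
    package main

    import (
        "log"
        "time"
    )

    const budget = 500 * time.Microsecond // hypothetical ceiling learned from the first decrypts

    // decrypt is a stand-in for the real routine under test.
    func decrypt(b []byte) []byte { return b }

    func paddedDecrypt(ciphertext []byte) []byte {
        start := time.Now()
        plaintext := decrypt(ciphertext)

        // Assert the established upper bound.
        if elapsed := time.Since(start); elapsed > budget {
            log.Fatalf("decrypt took %v, over the %v budget", elapsed, budget)
        }

        // Busy-wait (noops) until the budget has elapsed, so every call
        // takes roughly the same wall-clock time.
        for time.Since(start) < budget {
        }
        return plaintext
    }

    func main() {
        _ = paddedDecrypt(make([]byte, 32))
    }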
For the best crypto, you don’t want “within an epsilon”, you want “exactly the same number of CPU cycles”, because any difference can be measured with enough work.
Feeding random inputs to a crypto function is not guaranteed to exercise all the weird paths that an attacker providing intentionally malicious input could access. For example, a loop comparing against secret data in 32-bit chunks will take constant time 99.99999999% of the time, but it is still a security hole, because an attacker learns a lot from the one case where it returns faster. Crypto vulnerabilities often take the form of very specifically crafted inputs that exploit some mathematical property that's very unlikely to arise from random data.
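To make the chunked-comparison example concrete, here is what the leaky version and a constant-time version look like in Go (illustrative code, not taken from the CL):

    package main

    import (
        "crypto/subtle"
        "encoding/binary"
        "fmt"
    )

    // leakyEqual returns at the first mismatching 32-bit chunk, so its
    // running time reveals how much of the secret the input matches.
    func leakyEqual(secret, input []byte) bool {
        if len(secret) != len(input) {
            return false
        }
        for i := 0; i+4 <= len(secret); i += 4 {
            if binary.BigEndian.Uint32(secret[i:]) != binary.BigEndian.Uint32(input[i:]) {
                return false // early exit: timing depends on secret data
            }
        }
        return true
    }

    // constantTimeEqual touches every byte regardless of where the first
    // mismatch occurs.
    func constantTimeEqual(secret, input []byte) bool {
        return subtle.ConstantTimeCompare(secret, input) == 1
    }

    func main() {
        secret := []byte("0123456789abcdef")
        fmt.Println(leakyEqual(secret, []byte("0123XXXXXXXXXXXX")))        // false, but slower than an immediate mismatch
        fmt.Println(constantTimeEqual(secret, []byte("0123XXXXXXXXXXXX"))) // false, same work either way
    }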
> But I wonder why it's not trivial to throw a bunch of different inputs at your cipher functions and measure that the execution times are all within an epsilon tolerance?
My guess is that it's because the GC introduces pauses and therefore nondeterminism in measuring the time anything takes.
Because "constant time" here means algorithms that are O(1), rather than O(n). This isn't about wall-clock execution time, it's about avoiding the number of operations performed being based on attributes of the input.
I think it's more complicated than that. Certain things like branches, comparisons, and some other operations may not be constant time, especially if you consider interactions with prior code. It's clearly possible, just really difficult to get right.