
I just rewatched the end of the video to make sure I didn't miss anything. Deterministic execution and replay is very, very well known and understood. It is possible that your packaging and market fit is right on. There is a lot of cottage industry in DB testing and bug finding -- but it's not clear how this generalizes, or why something like Coyote [1] (to pick one) wouldn't work as well.

So, fuzzing has been applied to very stateful and very large industrial systems for some time. And yes, it is very cool, but I feel like I am seeing more "sizzle than steak," so to speak. Great engineering though; hypervisor work is very challenging.

[1] https://www.microsoft.com/en-us/research/blog/coyote-making-...



It is absolutely possible to write a large stateful system from the ground up so that autonomous testing techniques can be applied to it. FoundationDB and TigerBeetle are both examples of this, I think Resonate might be another one, and Toby Bell's talk at Strange Loop last year is a great guide on how to do so.

What's much harder is to take an arbitrary system, written in an arbitrary way, without these techniques in mind, and make it amenable to this sort of testing. From the start of our company, we believed that unless this was possible, the market would be too hard to crack, because most human beings are not very foresightful and not able to justify a bunch of extra work.

Hypervisor-based snapshot fuzzing like Nyx-Net and deterministic userspaces like Facebook's now-discontinued Hermit project are the other ways I know of accomplishing that goal. We believe that both of them have some pretty serious practical limitations which our approach does not share.

EDIT: Maybe the way to get to the crux of the disagreement is for me to turn the question around. Why do you believe that the vast majority of stateful and concurrent systems are not tested with fuzzing?


At the end of the day you have two problems: (1) how to make execution deterministic within some boundary, be it a process, hypervisor, or distributed system, and (2) how to handle non-determinism when data crosses this boundary. You can move the boundaries and the effort around, but the problems always exist. So, if you are claiming that you have found a sweet spot on this tradeoff, then I could certainly believe that; if you claim that you have eliminated this boundary issue, then I am highly skeptical.
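A minimal sketch of problem (2) in Python, assuming a record/replay wrapper at the boundary. The `Boundary` class and `run` function are invented here purely for illustration; real systems intercept syscalls or network traffic rather than wrapping function calls:

```python
import random

class Boundary:
    """Record/replay wrapper for non-deterministic calls that
    cross the deterministic boundary (problem 2 above)."""

    def __init__(self, mode, log=None):
        self.mode = mode                  # "record" or "replay"
        self.log = log if log is not None else []
        self.cursor = 0

    def call(self, fn, *args):
        if self.mode == "record":
            result = fn(*args)            # real, non-deterministic call
            self.log.append(result)
            return result
        result = self.log[self.cursor]    # replay the recorded value
        self.cursor += 1
        return result

# Inside the boundary, execution is a pure function of its inputs,
# so replaying the recorded log reproduces the run exactly.
def run(boundary):
    a = boundary.call(random.random)      # stands in for a clock read, RPC, etc.
    b = boundary.call(random.random)
    return a + b

recorder = Boundary("record")
first = run(recorder)
second = run(Boundary("replay", recorder.log))
assert first == second                    # deterministic replay
```

The divergence problem shows up exactly here: if the code under replay ever makes a call the recording didn't anticipate, the log runs dry and the replay breaks.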

I'll agree with you on indeterminate behaviors though, I suspect they will eventually be seen like the "billion dollar" mistake of null pointers.


Just saw the edit. I have 2 answers:

1) Fuzzing is under-utilized even for simple code. AFL is dead easy to use and, even so, most projects don't run it in CI. So, despite how much I like it, people in general do not seem to see value in this type of testing.
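To illustrate how little ceremony the simplest form of this takes, here is a toy random fuzzer in pure-stdlib Python. The coverage guidance that makes AFL effective is omitted, and `parse_version` and `fuzz` are invented examples, not real APIs:

```python
import random

def parse_version(s):
    # Toy target: "major.minor" -> (int, int).
    # Deliberate bug: crashes with IndexError when there is no ".".
    parts = s.split(".")
    return int(parts[0]), int(parts[1])

def fuzz(target, iterations=10_000, seed=0):
    rng = random.Random(seed)             # seeded: runs are reproducible
    alphabet = "0123456789.xyz"
    crashes = []
    for _ in range(iterations):
        s = "".join(rng.choice(alphabet) for _ in range(rng.randrange(0, 8)))
        try:
            target(s)
        except ValueError:
            pass                          # expected rejection of bad input
        except Exception as exc:          # anything else is a finding
            crashes.append((s, exc))
    return crashes

found = fuzz(parse_version)               # finds the IndexError bug
```

Even this crude loop, with no instrumentation at all, shakes the obvious bug out of the parser; the point is that the barrier to entry is low, yet most projects still don't bother.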

2) External state (say, a REST call to get stock ticker info) needs to be mocked -- which is deeply unpopular -- or handled by record/replay, which works OK-ish but eventually breaks down due to divergences. Outside of well-chosen domains these eventually pop up, adding a pain point that compounds item 1.
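A sketch of the mocking option for item 2, using Python's stdlib `unittest.mock`. The `get_price` function and its `fetch` parameter are hypothetical stand-ins for a real ticker client, not any actual API:

```python
import unittest.mock

def get_price(symbol, fetch):
    # `fetch` stands in for the real HTTP call to a ticker API;
    # injecting it is what makes the external state mockable.
    quote = fetch(symbol)
    return round(quote["price"], 2)

# Replace the non-deterministic external call with a canned response.
fake_fetch = unittest.mock.Mock(return_value={"price": 101.4567})
assert get_price("ACME", fake_fetch) == 101.46
```

The unpopularity is easy to see even at this scale: every external touchpoint needs a seam like `fetch`, and the canned responses drift out of sync with the real service over time.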


Deterministic execution might be well understood by its proponents, but it's a niche technique that practically no one uses. You have this jaded tone, as if this is something everyone is doing, and everyone knows that isn't true, so we're curious... why are you writing these things?



