Could you elaborate on that last point regarding Elixir and why type systems don't work for it? I couldn't find anything when I did a cursory search on Google.


I like to use a cellular metaphor for programs: you convert incoming data to either internal representations or errors as soon as possible, and on the way back out into the "unfiltered world" you delay converting to external representations as long as possible; inside the cell, things are considered "clean" and trusted. Type systems fit this point of view very naturally, because they can define that "clean" view of the world and enforce that you do the conversions early, because otherwise you won't have the target types. However, a "cell" has to be local, within a context where, for instance, it can count on having a stable view of the types and what they can do. (This being Erlang/Elixir, we can talk about upgrading running processes, but in that case you have to provide a conversion function as well, though the default and common one is the identity conversion. Still, there is a conversion step.)
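
A minimal sketch of that membrane in Elixir (the Order struct and from_external/1 are made-up names for illustration): the external map is converted to an internal struct, or rejected, right at the edge, and everything inside the cell only ever sees the struct.

    defmodule Order do
      @enforce_keys [:id, :amount]
      defstruct [:id, :amount]

      # Accept only the external shape we recognize; anything else is an error.
      def from_external(%{"id" => id, "amount" => amount})
          when is_integer(id) and is_integer(amount) and amount >= 0 do
        {:ok, %Order{id: id, amount: amount}}
      end

      def from_external(other), do: {:error, {:unrecognized_order, other}}
    end

Order.from_external(%{"id" => 1, "amount" => 500}) returns {:ok, %Order{id: 1, amount: 500}}, while a map missing "amount" comes back as an error tuple instead of leaking into the cell.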

Once you have a system with multiple contexts, the type system is less helpful. You can't help but have multiple contexts if you're crossing machines, and even just crossing OS processes you ought to behave as if you do. You can't take a message of type X!foreign from a foreign node and simply assume it's of type X!local, because in the general case you do not know whether the foreign and local concept of X is the same thing. Perhaps they are different versions. Perhaps they aren't even the same program. At a sufficiently large scale, or with sufficiently bad luck, perhaps the foreign node is actively lying to you in an attempt to hack you.

To communicate at all, you end up having to serialize on the foreign node and deserialize on the local one, especially in a world where nodes can upgrade their code without so much as dropping their TCP connection to you. So there is a point of view from which having strong types at the language level isn't all that helpful to this process, because even strong types don't let you escape these issues. You can do all sorts of automated things to escape the boilerplate aspects of the problem, but you cannot fully abstract away the semantic problem of "this message, no matter how hard we try to abstract this away for you, may be invalid by local standards".
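
A hypothetical receive loop (the :order tag and its fields are invented here) shows what that looks like in Elixir: even after the runtime has deserialized a message from another node, the receiving process still checks it against the local idea of the shape, because the sender may be running older or newer code, or may not be the program you think it is.

    defmodule Listener do
      def loop(state) do
        receive do
          # The shape this version of the code understands.
          {:order, id, amount} when is_integer(id) and is_integer(amount) ->
            loop(Map.put(state, id, amount))

          # Anything else: don't assume it matches the local definition; log it
          # and carry on (or reply with an error, depending on the protocol).
          other ->
            IO.warn("message does not match local expectations: #{inspect(other)}")
            loop(state)
        end
      end
    end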

How much this matters to you depends on how distributed your system is. If you've got a 3-node Erlang/Elixir cluster and you fully control all the code for it, you might be able to get away with just ignoring the issue. Like many other programming issues, at small enough scales it doesn't matter. But as you scale up it becomes an inevitable problem you must program for; consider something like the AWS S3 API, served by a few different servers and consumed by thousands of clients running every possible version of every SDK Amazon has ever put out, plus who knows how many home-grown implementations. Types can define the communication contract at that level, but they provide zero guarantees to a server about what it's actually going to receive.


That's an amazingly comprehensive answer. Thanks!

So basically, when interacting with the outside world at scale, it's very hard to make any sort of guarantee about the kind of data you're working with. And since you have to validate the shape of your data already, you're doing the job of the type system yourself, which makes it less useful to have one. Did I get that right?

Aren't these issues only at the boundaries where you're receiving data from the outside world? Wouldn't it still be helpful to have a type system/static analysis at the local level?


"And since you have to validate the shape of your data already, you're doing the job of the type system yourself, which makes it less useful to have one."

I'd say it's more like you don't get the imaginary benefit of being able to skip that checking.

And yes, a type system can absolutely help you with this, especially in terms of enforcing that you do this conversion (since the incoming byte[] stream is not going to be any of the types you actually want, even at the brute "int" or "char" level, let alone user-defined types). But there is certainly a sense in which this doesn't especially help you with dealing with remote nodes.
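
As a small example of that brute level (the 4-byte big-endian framing here is just an assumed wire format), even getting a single integer out of an incoming binary forces an explicit decode-or-fail step:

    defmodule Wire do
      # A raw binary off the wire isn't an integer, let alone a domain struct,
      # until something like this has run.
      def decode_count(<<count::unsigned-big-integer-size(32), rest::binary>>),
        do: {:ok, count, rest}

      def decode_count(other), do: {:error, {:malformed, other}}
    end

Wire.decode_count(<<0, 0, 1, 44>>) gives {:ok, 300, ""}, and anything too short or mis-framed comes back as an error tuple rather than sneaking in as a value.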


It looks like it's already been answered pretty thoroughly. The cell analogy works really well.

I tend to describe it as more akin to REST vs. WSDL: with WSDL, participants have to exchange updated contracts whenever the structure changes in order to keep adhering to it.



