I was nodding along until I realized that the author has a very different notion of esoteric from mine.
The last month of Fridays I've been tinkering with Idris, trying to get it to... well, do anything useful. I'd heard it was a dependently typed JVM language, which it is... kinda. It turns out you need to install a full Haskell toolchain to do anything with it. With the right esoteric option incantation, you can make it build an executable that turns out to be a shell script concatenated with a jar file. Nice. But you can't build a library with it, even within the language. You can't invoke it to build some classes to use later. You can't even call the build tool from anything remotely standard in JVM-land, and when I asked about making the compiler self-hosting the response was a kind of lukewarm "yeah, sometime". The current toolchain is in Haskell, but there's no FFI from Idris into Haskell, so porting it would be... challenging. I tried to join the two together via their C FFIs, but between undocumented linker options and the fact that the Idris runtime is already linked to part of the Haskell runtime, I eventually gave up. If nothing else, it gave me a real appreciation for how much work went into making Scala a serious, commercially usable language, something I'd previously rather taken for granted.
But C, holy shit, C? You'd write a program in C? And expect it to work? I've seen programs go wrong in a lot of languages, but most of them can eventually be fixed. In C you get irreproducible voodoo random crashing that, sure, you can usually track down with a static analyzer, Valgrind, debugging, intelligence and luck. But what if you couldn't? What if the program was just broken, and you couldn't fix it? With $100,000,000 on the line, there is no way on earth I would risk letting C (or any other unmanaged language) anywhere near my codebase. It might mean more work, and fewer features, if I were to use e.g. that pure-OCaml SSL library, or a JVM-native multimedia library. But I'd do it in a heartbeat, all the same. "The final say on overall data sizes" is such a tiny, trivial concern compared to using a language where failure is understandable and reliably diagnosable.
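(To make the "voodoo" concrete, here's a minimal toy example, not from any real codebase: a use-after-free that can appear to work for years before it crashes. The tools above will flag it.)

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A classic use-after-free. Build and check with, e.g.:
 *   gcc -g uaf.c && valgrind ./a.out            (reports "Invalid read")
 *   gcc -g -fsanitize=address uaf.c && ./a.out  (reports heap-use-after-free)
 */
int main(void) {
    char *name = malloc(16);
    if (name == NULL) return 1;
    strcpy(name, "hello");
    free(name);
    /* Bug: 'name' is dangling. Depending on allocator mood this prints
     * "hello", prints garbage, or crashes -- the voodoo in question. */
    printf("%s\n", name);
    return 0;
}
```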
(In fact I'd say the scenarios where a measly 2x factor in memory consumption makes the difference between "working" and "not working" are vanishingly rare. If you need to scale horizontally in OCaml, you're going to need to scale horizontally in C a couple of weeks later. Particularly with $100,000,000 on the line, sod it, buy a bigger server with more RAM if using 32GB rather than 64GB really makes the all-important difference.)
Would I use a "pet language" I tinker with, like Idris? No, but then I wouldn't use one for any kind of serious commercial work. With $100,000,000 on the line, I'd use the same language I use for almost all my work: Scala. And I would stay the hell away from JNI, because I don't want C anywhere near this system, lest it bring the whole thing crashing down.
There are a lot of tools available for the design of reliable systems in C. Valgrind is just the beginning. A lot of smart people have put an enormous amount of effort into static and dynamic checkers and theorem provers, as well as standards and guidelines for what you need to do in order to write reliable systems in C. The systems those tools target have hard real-time and hard memory requirements.
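(For a flavor of what that looks like in practice, here's a minimal sketch using Frama-C's ACSL annotations; the function is just an illustration, but this is the kind of contract the WP plugin can prove automatically.)

```c
#include <limits.h>

/* ACSL contract, checkable with e.g. `frama-c -wp abs.c`: given the
 * precondition, the prover discharges both postconditions and the
 * absence of signed overflow in -x. */
/*@ requires x > INT_MIN;
    assigns \nothing;
    ensures \result >= 0;
    ensures \result == x || \result == -x;
*/
int abs_int(int x) {
    return x < 0 ? -x : x;
}
```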
This is how it works at the extreme end of reliability requirements: you avoid Java and use C, because you'd have to validate the Java runtime anyway, much of which is written in C, and it's just plain easier to validate your C code under an otherwise crippling set of constraints.
And then, as ill as you speak of C, you turn around and plug your Java systems into C systems, such as Varnish, Nginx, Apache, PostgreSQL, or thousands of others that you use every day, not to mention the OS itself.
Yes, I agree that it is crazy to write your web app in C. But web apps are not the entire world.
At the point where you're using a theorem prover you're not really writing C any more (I mean, do you count ATS as writing C?). It's a perfectly good way to produce reliable code, sure. But I'm pretty sure it's not what the article is advocating.
> And then, as ill as you speak of C, you turn around and plug your Java systems into C systems, such as Varnish, Nginx, Apache, PostgreSQL, or thousands of others that you use every day, not to mention the OS itself.
I do my best to minimize the C surface, and I worry about what I am exposing. E.g. I don't put Apache/Nginx/etc. anywhere in my stack (I either route directly to the JVM or use an Erlang load balancer), which means I wasn't running around patching them in the wake of Heartbleed. I'd use e.g. Riak over PostgreSQL wherever possible.
I'm just taking two lines from the article. "Reliability and proven tools are even more important than libraries": in this case, there are a lot of proven tools for writing C programs, including the theorem provers. And "You're more dependent on the decisions made by the language implementers than you think": there have been a lot of flaws in the JRE over time, most of them in the parts of the JRE that are written in C. If you had been writing in C, and were among the top 2% most diligent and careful C programmers, you could have avoided those flaws, because the C tools are rock solid, whereas the Java tools haven't been (January 2013 is still pretty recent).
Problems in libraries, such as Heartbleed, can be mitigated in C using the same tools that you'd use in Java. Process isolation is great.
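(A minimal sketch of the process-isolation idea, with a hypothetical parse_untrusted standing in for some risky C parser: run it in a forked child so a memory-safety bug there crashes the child rather than the parent, and the parent's secrets are never mapped where the parser can leak them.)

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical risky routine, standing in for e.g. a TLS or image
 * parser written in C. */
static int parse_untrusted(const char *buf, size_t len) {
    return (len > 0 && buf[0] == 'A') ? 0 : -1;
}

/* Run the parser in a forked child. The parent learns only an exit
 * status: a wild pointer takes down the child alone, and nothing like
 * private keys lives in the child's address space to begin with. */
static int parse_isolated(const char *buf, size_t len) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: this is also where you'd chroot()/setuid()/seccomp. */
        _exit(parse_untrusted(buf, len) == 0 ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

int main(void) {
    const char msg[] = "A message";
    printf("parse result: %d\n", parse_isolated(msg, sizeof msg - 1));
    return 0;
}
```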
Of course, if you are writing your web app in C then I am going to replace you with someone else. But the web is not everything.
The January 2013 flaws affected end users running the browser plugin, not systems written in Java. I suppose if you had a system that allowed users to submit their own bytecode and used Java's native facilities for handling this, then you'd be vulnerable. But nothing forces you to do that; someone who's capable of writing a safe bytecode verifier in C is certainly capable of writing one in Java.
The only even vaguely recent flaw I can remember in the JRE was, as you say, in the part written in C (I think in some image loading code). It takes a very perverse kind of logic to say: this system which is mostly not-C and a tiny bit C keeps having security flaws in the C part, therefore we should switch to systems which are written more in C. If the tooling for verifying C is really so good, why not verify the C parts of the JRE? Then they'd be proven once and for all, and lots of systems would benefit.
> If the tooling for verifying C is really so good, why not verify the C parts of the JRE? Then they'd be proven once and for all, and lots of systems would benefit.
I'm going to quote the article again...
> You're more dependent on the decisions made by the language implementers than you think.
When you use Java, you don't have the opportunity to second-guess the choices that produced the JRE. And I think you're not quite getting what I'm saying: I'm not saying that "we should switch to systems which are written more in C", I'm saying that writing systems in C protects you from mistakes in the JRE (which you have no control over) in exchange for exposing you to your own mistakes (which you can control). You can then spend a large amount of time and money developing and verifying your system. The goals and constraints of your project will determine whether this is a good trade-off. I'm certain that Java is preferable for writing the vast majority of web apps, but the web is not everything.
> If the tooling for verifying C is really so good, why not verify the C parts of the JRE?
First, I'm going to guess that an enormous amount of static and dynamic analysis has been done on the JRE. Bugs in it are rather rare these days, given its size and complexity.
However, verification tools are generally not suited to this particular task: they're better at verifying typical application code, and the JRE needs to do a lot of very unusual operations in order to work. In cases where you'd use verification, you'd also typically use a "safe" subset of C. Some of these subsets don't permit dynamic memory allocation at all, or permit it only at program startup.
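(For instance, in the style of rules like JPL's "Power of Ten" coding standard, which bans heap allocation after initialization, a sketch might look like this; the pool size and helper are made up for illustration.)

```c
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4096  /* worst-case memory use is fixed at compile time */

static unsigned char pool[POOL_SIZE];
static size_t pool_used = 0;

/* Bump allocator intended to be called only during startup; nothing is
 * ever freed, so there's no fragmentation and no use-after-free.
 * (Alignment handling elided for brevity.) */
static void *startup_alloc(size_t n) {
    if (n > POOL_SIZE - pool_used) return NULL;  /* budget exhausted */
    void *p = &pool[pool_used];
    pool_used += n;
    return p;
}

int main(void) {
    char *buf = startup_alloc(64);  /* fine: we're still initializing */
    if (buf == NULL) return 1;
    strcpy(buf, "allocated at startup");
    /* After init, the rule is simply: no further startup_alloc() calls. */
    return 0;
}
```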
So, it may actually be more straightforward to deliver a working Mars rover in C than it would be to deliver a verified JRE. Neither task is easy.
>> But C, holy shit, C? You'd write a program in C?
Most of the world's software rides on the back of C. Pick a language; the runtime is probably written in C. Any high-performance libraries are either in C or C++ (Qt, anyone?), and for numerics you may still find <gasp> Fortran. If performance is a requirement, or you're running on a microcontroller, you're going to have some C, or else you won't get the best performance. That said, stuff like string processing sucks in C - some say that's why C++ was invented.
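(To illustrate the string-processing pain, here's a toy helper invented for this example: even a simple "join two strings" needs explicit sizing, a truncation check, and a caller-managed buffer.)

```c
#include <stdio.h>
#include <string.h>

/* What `s = a + ", " + b` costs in plain C. snprintf never overflows
 * dst, but the caller must size the buffer and detect truncation. */
static int join(char *dst, size_t dstlen, const char *a, const char *b) {
    int n = snprintf(dst, dstlen, "%s, %s", a, b);
    return (n >= 0 && (size_t)n < dstlen) ? 0 : -1;
}

int main(void) {
    char buf[32];
    if (join(buf, sizeof buf, "hello", "world") == 0)
        puts(buf);  /* prints "hello, world" */
    return 0;
}
```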
> Most of the world's software rides on the back of C. Pick a language; the runtime is probably written in C. Any high-performance libraries are either in C or C++ (Qt, anyone?), and for numerics you may still find <gasp> Fortran.
Maybe you can't eliminate C entirely. But you can avoid using anything in C that doesn't already have millions of users. Even then, I'd still be more worried about those small slivers of heavily tested C than about the rest of the code put together: working on my $100,000,000 project in Scala, the single biggest thing I'd worry about would be hitting a bug in the JVM itself. The JVM is mostly Java these days, but there's some C/C++ in there, and a disproportionate number of critical JVM bugs happen in the C/C++ code (who'd've thought?). And debugging those problems is Not Fun.
C is a hard requirement much less often than people seem to think. I've coded for microcontrollers in C, but I've also done so in Java. Better slow and reliable than fast and buggy.
By that logic, you should probably stop programming altogether, because the runtimes of managed languages are all written in C. What if it all comes crashing down?!
I do worry about it. I probably wouldn't gamble $100,000,000 on any program, if I had that money to start with, because I could never be 100% confident in a system being bug-free. But I can minimize the surface area of C that I expose, and use better tools where possible (which is often). Given that I still do need to write programs (even C is more reliable than doing things by hand), what other option is there?