
Really, a safety critical hard realtime system with control loops on top of Python, a general purpose OS and a CAN-USB adapter?

There are reasons why we have realtime operating systems, deterministic bus systems and dedicated CPUs for these kinds of applications.

That might be an interesting research, prototyping or simulation platform. But nothing you want to have on a real public road.



> these kinds of applications.

As in research projects, which this project expressly states that it is? What are those reasons? It seems like being able to test theories as quickly and cheaply as possible is the top priority. If Python, a general purpose OS, and a CAN-USB adapter is sufficient to conduct the research and satisfies those attributes, it seems like the perfectly logical way forward.

I understand why you would want to move to such systems you describe once your research has concluded and you are now building something for production, but that is not the intent of this project, at least at this stage. It specifically says so.


> As in research projects, which this project expressly states that it is?

This is from comma.ai, who famously tested on public roads with a reporter in the car and very little prior testing ("the first time it worked was this morning").[1]

[1] http://www.bloomberg.com/features/2015-george-hotz-self-driv...


Why wouldn't you do the research on a platform that could actually be deployed? Build the core using the proper tools and techniques. Any bug in a system like this could be life threatening at worst. Wouldn't you want to use the research phase to help eliminate such bugs?

Unless you're not planning on ever releasing then a platform like this makes sense for strictly research, but Comma AI did plan on releasing their product until they got shut down.


Research implies that you don't know the outcome and are going to be throwing away everything that doesn't pan out. Without knowing the outcome, how can you really be sure of what you need? You might think you need a CPU-based system but learn that GPUs are necessary, for instance. If you can start with cheap off-the-shelf components, at least you don't have much to lose when it is time to throw it away. If you're spinning your own boards, or heavily tied to expensive development kits, that becomes much harder to swallow. Especially for a startup with limited resources.

> Wouldn't you want to use the research phase to help eliminate such bugs?

I wouldn't think so. Spending your time fixing bugs in something you realize could have been done better another way, which ends up getting thrown out, seems like a waste of time. The words research and development are often paired together because it is a two step process, where development comes after you have learned what can be done.


As an example of some requirements, any recursion is banned unless accompanied by a termination proof that keeps track of stack size. No calls to malloc() or free() are allowed, and so any library you might use that allocates memory is banned.

It's far easier to get the code working in something like Python, and then rewrite it in safety-critical C later. And the kinds of people who can write self-driving cars in Python are using very different skills than the kinds who can write safety-critical C, so it might not even be the same person.


Hmm. Clearly this code is miles away from MISRA compliance (although compliance doesn't mean that the software is any good).

Even so, sorry to say, the code is miles away from any decent software standard. It is almost exactly what you might call CRAP (classes really a procedure) and mixes responsibilities all over the shop. Little attempt is made to abstract behaviour, and there are differing units and random multipliers all over the place. C++ and Python are both OO languages, and many classes of error could be avoided if any actual OO features were used, even with MISRA not being a target; sadly the code doesn't do that.

This has all the hallmarks of software that doesn't know if it's doing m/s or mph.
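A cheap way to rule out exactly that class of bug (a hypothetical sketch, not code from this repo: the `Speed` class and its factory names are invented for illustration) is to wrap speeds in a tiny value type so a bare number can never be silently read in the wrong unit:

```python
# Hypothetical sketch: a small value type so that metres-per-second
# and miles-per-hour cannot be mixed silently.
MPH_TO_MS = 0.44704  # exact conversion factor, 1 mph in m/s

class Speed:
    """A speed stored internally in m/s, constructed via named factories."""
    def __init__(self, ms):
        self._ms = ms

    @classmethod
    def from_ms(cls, v):
        return cls(v)

    @classmethod
    def from_mph(cls, v):
        return cls(v * MPH_TO_MS)

    @property
    def ms(self):
        return self._ms

    @property
    def mph(self):
        return self._ms / MPH_TO_MS

limit = Speed.from_mph(65)
print(round(limit.ms, 2))  # → 29.06
```

The point isn't the class itself; it's that every conversion happens in exactly one place, so a stray "random multiplier" in the middle of a control loop becomes a code smell instead of a landmine.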


Exactly; the delta from Python to C is smaller than the delta from C to "safety-critical real-time C". Might as well prototype and simulate in a higher-level language that's obviously not the final one, rather than writing in a language someone might mistake for production and trying to incrementally evolve it into a full safety-critical version.


> once your research has concluded and you are now building something for production

I think the rub is that this code appears to be intended for "production" (or at least use on public roads), given the fact that it's published by a company that has tested its products on open roads?


It explicitly states: "THIS IS ALPHA QUALITY SOFTWARE FOR RESEARCH PURPOSES ONLY. THIS IS NOT A PRODUCT. YOU ARE RESPONSIBLE FOR COMPLYING WITH LOCAL LAWS AND REGULATIONS. NO WARRANTY EXPRESSED OR IMPLIED."


And yet comma ai were definitely testing their products on public roads, products which were presumably running this code.


This would certainly explain why they folded so quickly once the NHTSA started asking questions.


Note: other comments in this thread indicate my assumption that comma ai was testing vehicles running this code was a bit hasty.


...Which is why it says "YOU ARE RESPONSIBLE FOR COMPLYING WITH LOCAL LAWS AND REGULATIONS".


Ah yes, laws are definitely the determinant of safety. This is why texting and driving is illegal everywhere.


Does anything not have that warning (or similar) on it?


The big problem in my mind is that Python is the wrong tool for the job. It is 2016, and there are more hard-RTOS prototyping platforms than ever before. My favorite is National Instruments' RIO platform, which lets you use C or LabVIEW (imho the best language for prototyping control algorithms by far). Mathworks also has a platform based on Matlab/Simulink, and the list goes on.

Why use Python when there are existing tools that are made for this type of application?


LabVIEW is great until you need a complicated data structure. I tried writing a tree to do kNN, and it turned out that doing a brute force search was faster even with 100k elements.
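For what it's worth, the brute-force baseline that wins in cases like this is only a few lines, which is part of why it's so hard to beat at moderate n (an illustrative sketch in Python, not the LabVIEW code in question):

```python
import math

def knn_brute_force(points, query, k):
    """Return the k nearest points to `query` by Euclidean distance.

    A linear scan: O(n) distance computations, but no pointer chasing,
    which is why it often outruns tree structures until n gets large.
    """
    dists = [(math.dist(p, query), p) for p in points]
    dists.sort(key=lambda t: t[0])
    return [p for _, p in dists[:k]]

pts = [(0, 0), (1, 1), (3, 3), (10, 10)]
print(knn_brute_force(pts, (0.9, 0.9), 2))  # → [(1, 1), (0, 0)]
```

A k-d tree only starts paying off once the tree's bookkeeping costs less than the distances it prunes, which at 100k low-dimensional points is far from guaranteed.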


This is (now) an open source project expecting community pull requests. How many would they get if they went with LabVIEW? The big emphasis that I like seeing in modern science is reproducibility of results. Jupyter Notebook is a huge step in that direction (not really applicable here), but just using an open source platform is still great.

I agree though that once the algorithms are developed sufficiently they should be ported to an RTOS platform, and this box shouldn't be permitted on open roads.


It seems incredible that this is the source code for the $999 product originally destined to be available this year.

I'm extremely familiar with OBD/CAN and car networks - the embedded code alone is an essay in how trivial something can appear (sending commands to an ECU) without considering the million edge cases that would make this a safe product to use.


> It seems incredible that this is the source code for the $999 product originally destined to be available this year.

It's not.


Forgive me, but it seemed like this was the case. Is this an earlier version of the code, or a special open-source cleanup, then?

http://newatlas.com/geohot-comma-ai-openpilot-open-source/46...


The Ars Technica article covers it better: http://arstechnica.com/cars/2016/11/after-mothballing-comma-...

But in a nutshell:

The company was approached by agencies that had severe concerns over safety and over whether the product would have the required regulatory approvals in place.

So the company folded and released this instead.


I'm not really following. If it works then why not use a control loop on top of Python? We are engineers, after all.


Because by "works" you mean "I tried it out a bunch of times and nothing bad happened, so it must be production ready", because that's what you do when you build websites and desktop CRUD apps, and nobody ever died because the web server choked under load, or the web page rendered a bit funny, or the request took a full quarter second because the GC kicked off as a wave of requests came in.

And then one day a one-in-ten-million event that never would've surfaced during testing does happen, and the OS crashes, or the interpreter hangs, or some non-determinism causes a period of non-responsiveness, or a bit gets mangled because you don't have any redundancies and solar rays are a thing, or ... and someone dies. And of course that didn't happen in testing, because all of those things are rare possibilities.

And then it happens again.

And again.

And then you're in court being sued for millions. And a bunch of industry experts who do build their systems with redundancies and do design their systems with a safety-first mindset come to the stand and rip you apart for not taking even the most basic precautions. And when it comes out just how many best practices you completely ignored, you're mostly just hoping beyond hope that all of the legal problems stay on the civil side of the civil/criminal divide.

Or, to say it in a sentence, "because the sort of exceptional circumstances that safety-critical software needs to handle with grace are very difficult or impossible to account for with a desktop machine running interpreted code on top of Linux."


I like that you reduced that to a one-liner, and I understand your point. I sure as hell wouldn't hook my car up to this software.

But I also feel like you're being a bit hasty. Obviously a Python script isn't going to turn into a Tesla overnight. But maybe it'll help you find a few bugs before you throw all that time and effort into building the real deal. When I look at a Github repo that claims to be a self-driving car I don't say to myself "yep, looks production ready," I say "Cool prototype, now let's break it." To me it seems entirely reasonable to get a working Python prototype 90% of the way there and then send it off to the OS programmers to design something that actually meets the concept of "production ready".


Yes, I absolutely agree with you on that. Nothing wrong with prototyping in whatever setting is most convenient.

I think w/ self-driving cars there is an interesting ethical question. The full auto cars on the roads today definitely aren't production ready, but they also have constant safety drivers. Probably even ACC systems are tested in the wild with a safety driver before production.

Basically, "is the safety driver sufficient to justify running prototype software in the real world?"


Yeah, that's a tough question that we as a society will have to wrestle with. Knowing that humans are imperfect drivers as well doesn't make it any easier. Even cars today have crippling safety flaws -- remember that one Lexus model that had a sticky gas pedal? Or worse yet, the Ford Pinto. I'm genuinely interested to see how the governments of the world weigh in on this, if at all.

Intuitively, though, I think that buying/using the software is tantamount to accepting its imperfections, so long as they are adequately (factually) presented to you beforehand. You're signing off your ability to make your own decisions, but are still responsible for them.


You have a lot of solid points, but note that Linux is currently being used by SpaceX in an even more safety-critical aerospace setting.

Also note that interpreters have their place in safety-critical aerospace as well: some satellites run Forth.


> You have a lot of solid points, but note that Linux is currently being used by SpaceX in an even more safety-critical aerospace setting

I disagree with the 'more' - how many lives are at risk with a SpaceX failure vs. a self-driving car failure? This is even without multiplying by number of users.


There are few things more destructive than a serious spacecraft crash.


You are really going to freak out when you learn how deep neural networks work.


I know how DNNs work -- I've even designed my own.

The existence of a component that is difficult to analyze for safety doesn't justify ignoring well-established safety engineering techniques throughout the rest of the system. That attitude would have us throw out seat belts and snow tires just because the ACC system might be buggy sometimes.


The thing that distinguishes an engineer from a bodger is not just observing what works but understanding why, how, and to what margin of safety it works.

The reason the "realtime" designation exists at all is that, in a realtime system, it is possible to say what the worst-case timing is for any operation, and to guarantee by design what failure conditions can and cannot happen.

You cannot guarantee in a python program the timing of your GC. You can't even guarantee that the whole thing won't fall over with an exception.

And we've already had this problem with the Toyota "unintended acceleration" bug, in the (far simpler) electronic throttle system: https://users.ece.cmu.edu/~koopman/pubs/koopman14_toyota_ua_... - and that was using C, but not in an appropriate style (MISRA or similar).

(89 dead, cost Toyota billions)


Believe it or not, many folks coding out there don't even understand the meaning of "real time" in this designation. I recall asking some colleagues in a conversation, "So when you say 'real time', do you mean hard or soft?" To which they replied, "Just, real time, like in real time." I gave up pressing the point when I realized they simply meant "interactive" and didn't know such a designation existed.


Hard real time is really only applicable when designing critical systems. Unless your anecdotal conversation was about one of those systems, it's pedantic to not simply infer soft/firm real time.


I agree. And yes though that is inferrable from my anecdote, in fact I wasn't being pedantic. I actually probably used the terminology wrong myself, but all I really meant was, "what are your latency requirements in this system? How real time?" It was a conversational confusion because they kept using the term and I thought they were using it technically. When I realized they weren't, I didn't bother trying to correct them...

[edit] I'll add this conversation stuck out in my mind simply because I was a bit surprised that no one knew what I was talking about, that's all.


Python uses reference counting. The garbage collector is only required for cyclic references and can be disabled.
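A quick demonstration of that point in CPython (a sketch; `Node` is just a stand-in class): with the cyclic collector disabled, acyclic objects are still freed deterministically the moment their refcount hits zero, while reference cycles linger until `gc.collect()` runs.

```python
import gc
import weakref

gc.disable()  # turn off the cyclic garbage collector

class Node:
    pass

n = Node()
alive = weakref.ref(n)
del n                      # refcount hits zero: freed immediately,
print(alive() is None)     # no collector needed → True

# A reference cycle, by contrast, survives until the collector runs:
a, b = Node(), Node()
a.other, b.other = b, a
cyc = weakref.ref(a)
del a, b
print(cyc() is None)       # → False: the cycle keeps both alive
gc.enable()
gc.collect()
print(cyc() is None)       # → True once the collector has run
```

Of course, "you can disable the collector" cuts both ways: refcount-driven frees can still trigger arbitrarily long cascades of destructor calls at unpredictable moments, so this removes one source of jitter rather than all of them.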


The C code was a byproduct of some higher-end framework, wasn't it? (Still doesn't forgive the stupid errors that went into it, but...)


This comment should not have been downvoted. Downvoting should be reserved for off-topic comments. Someone who is confused and brave enough to ask a question should not be discouraged through downvoting.

The question is naive and has a dangerously ignorant view of what "engineering" means. But it has been properly and usefully answered by others.

(sorry for a meta comment rather than a substantive response, but I feel strongly about this).


This is research code, not product code. I wouldn't recommend shipping a consumer product with your PID loops in Python.


You could easily starve a process or thread whose purpose is to stop your car in an emergency for long enough to cause the occupants harm. I haven't dug into the code, so I don't know exactly how this software uses processes and threads (you could have one process, no threads, and crank down nice on the process), but considering this, and GC in Python, I would not feel confident that this system will always respond quickly enough to emergency scenarios.
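As an illustration of the starvation worry (a rough sketch, not a claim about this codebase; the function and its numbers are invented), the GIL means one CPU-bound thread can delay a "watchdog" thread's wake-up well past its nominal sleep:

```python
import threading
import time

def measure_watchdog_latency(busy_seconds=0.5):
    """Measure how late a nominal 1 ms watchdog tick wakes up while
    another thread burns CPU. Purely illustrative: the result depends
    entirely on the interpreter, the OS scheduler, and machine load.
    """
    worst = 0.0
    stop = time.perf_counter() + busy_seconds

    def busy():
        # CPU-bound work that holds the GIL in long stretches
        while time.perf_counter() < stop:
            sum(range(10000))

    t = threading.Thread(target=busy)
    t.start()
    while time.perf_counter() < stop:
        t0 = time.perf_counter()
        time.sleep(0.001)           # the watchdog's nominal 1 ms tick
        late = time.perf_counter() - t0 - 0.001
        worst = max(worst, late)
    t.join()
    return worst

print(f"worst wake-up lateness: {measure_watchdog_latency():.4f}s")
```

The number this prints varies run to run, which is exactly the problem: you can measure the lateness, but nothing in the stack bounds it.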


The Python control loop is probably the safest part of this system.

The reason that comma.ai was able to get a working prototype quickly (unlike much bigger and well-funded companies) is that they use an end-to-end deep learning model. The model directly learns to issue commands to the car based on raw image data. The problem with this approach is that neural network models are a blackbox. They can behave in completely unpredictable ways.


Actually, they didn't. At least not in this code.


Human drivers don't run an RTOS and a deterministic bus system either. In fact, I would argue that very rarely do human drivers fail because of jitter or nondeterminism.


This is a weird argument. Human drivers don't deadlock or crash either. Are you sure you understand the criticism being made here? Hard realtime systems make guarantees about the frequency with which code will run on a shared system.


I agree that this is not production-ready tech, but the parent still makes a good point. I think those two similar comments are downvoted just because people here don't agree with them.

Suppose you make an autopilot in JS running on Electron on WinXP, running in a VM on top of a Puppy Linux Live CD. And you still manage to prove that your system is 100x more reliable than a human driver. Should it be dismissed just because we don't agree with the technology stack? The latency of this monstrosity could still be lower than a human driver's, and maybe it would stay lower throughout the operation.


This doesn't have anything to do with latency really. It's about predictability. Given the system you described, there are so many edge cases that can potentially compound, that it is impossible to make any predictability guarantees about the system.

As an example, I worked on an embedded system that controlled electrical motors and was hard real time. The fast-task interval was once per millisecond. No matter what, that task got called by the RTOS exactly 1000 times per second. When it didn't finish its job in time, the result could easily wreck real-world items or cause harm to people. Nobody even considered using interpreted languages in that project. The fast tasks all had to provably run in much less than a millisecond. That means no loops that could be unbounded, no memory allocations, no recursion, no writing to flash; anything that was even slightly unpredictable was out.

So, even if you could prove that your system caused less accidents than a human driver when it was running well, it would be impossible to do an analysis that defined under what circumstances the system would be running well. Given that, it would not be allowed in a well-engineered real time or safety critical system.
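To make the contrast concrete (an illustrative sketch with hypothetical names, not code from either system): the best a Python loop on a general-purpose OS can do with a 1 ms fast task is detect missed deadlines after the fact. Nothing in the stack can guarantee they won't happen.

```python
import time

def run_fast_task(period_s=0.001, iterations=1000):
    """Run a stand-in control task at a fixed period and count deadline
    overruns. Unlike an RTOS, nothing here *guarantees* the deadline;
    we can only measure how often we missed it, after the fact.
    """
    overruns = 0
    worst = 0.0
    next_deadline = time.perf_counter() + period_s
    for _ in range(iterations):
        sum(range(100))  # stand-in for the real control computation
        now = time.perf_counter()
        if now > next_deadline:
            overruns += 1
            worst = max(worst, now - next_deadline)
        # sleep until the next tick (best effort, not guaranteed)
        remaining = next_deadline - time.perf_counter()
        if remaining > 0:
            time.sleep(remaining)
        next_deadline += period_s
    return overruns, worst

misses, worst = run_fast_task()
print(misses, worst)  # whatever the OS scheduler happened to allow
```

On an RTOS the equivalent task is *dispatched* by the scheduler with a proven worst-case execution time; here the loop merely hopes, and the miss counter is the only evidence either way.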


"And you still manage to prove that your system is 100x more reliable than a human driver."

That's the rub: how do you prove that? If your software stack is 30 million lines of code written by god knows who, I would argue it's nigh impossible without releasing it and seeing what happens, which seems morally irresponsible and legally negligent. If you follow strict rules in coding conventions and algorithms, it's easier to statically verify that the code is provably correct.


The more relevant question is not do they but if they could would we want them to?


Why not?

We entrust humans with operating cars right now. Humans in a sense are running a general purpose OS.


If you read enough security literature, eventually you'll come across the notion of a "trusted system". And your first impression will be that this is a system which is fully debugged and tested and you can trust it to do its job within specifications.

But you couldn't be more wrong. A trusted system is one which you have to trust. No representations are made about whether you should or not.

Humans are frequently trusted systems.


This is not relevant. If a corporation builds a product that kills people, the corporation is responsible. No corporation is responsible for the behaviour of humans.


No hundred humans are running the exact same OS that can deadlock or crash and as a result drive off the road, which is why this trope about humans being worse than computers at driving is entirely irrelevant to a thread about why it's unwise to try to build self-driving cars on top of Python scripts.


My take on the OP's point was that the deep general stack was unnecessary and decreased reliability. My read was that they think your self-driving car software shouldn't also be able to run arbitrary Python scripts. My argument is that it's practically irrelevant to reliability. I would not be surprised if in the future you could download an app to your iPhone, connect your phone to your car, and have the app drive your car. Sure, maybe having a specialized real-time OS without any higher-level interpreters could be more reliable in the early days, but the gains from rapid iteration you'd get from allowing cars to be driven by apps running on higher-level abstractions would be worth the trade-off, and over time the latter could be iterated upon to achieve practically the same reliability as over-engineered systems.



