Hacker News
Compiling Typed Python (bernsteinbear.com)
108 points by tekknolagi on June 20, 2023 | 50 comments


I would like to note in this code example:

    class C(int):
        def __add__(self, other):
            return "no"  # sigh
->

    class C(int):
        def __add__(self, other: int) -> str:
            return "no"  # sigh
With proper type annotations and strict mypy type checking, this is reported as an error:

    test.py:2: error: Return type "str" of "__add__" incompatible with return type "int" in supertype "int"  [override]
This is again brought up later in the article.

I also thought it was odd to diverge into attempts to compile (against?) non-typed code. That seems at odds with the premise of the article.

This is still a really good article. Just a little meandering.


The idea is that not all code in your code will be typed. So while at the typed callee everything is all hunky-dory, any random subclass might not be typed. The subclass could even be a C extension.

EDIT: Also, the point is that even if it's well-typed, it can still go and execute arbitrary code. So you can't just specialize the callee.
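
A minimal sketch of that point, reusing the class from the article's example. The `add` function and the variable names are made up for illustration; the idea is only that an annotated callee still dispatches to whatever `__add__` the argument's class defines.

```python
def add(a: int, b: int) -> int:
    # Annotated for int, but any int subclass is an acceptable argument.
    return a + b


class C(int):
    def __add__(self, other):
        # Arbitrary code runs here instead of integer addition.
        return "no"


plain = add(1, 2)            # ordinary integer addition
surprising = add(C(1), 2)    # dispatches to C.__add__ at runtime

print(plain, repr(surprising))
```

So a compiler that specialized `add` to machine integer addition would compute the wrong result for the second call unless it also guarded on the exact type of `a`.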

EDIT: Pushed and thanked you in the commit.


> The idea is that not all code in your code will be typed.

All Python code is always typed; it just might not be annotated. Even then, static analysis tools like mypy are pretty good at inferring types based on the code.
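
A small sketch of the distinction, with a hypothetical unannotated function `double`: every value carries a runtime type even though nothing is annotated.

```python
def double(x):
    # No annotations, yet each call operates on a concretely typed object.
    return x * 2


int_result = double(21)     # int * int
str_result = double("ab")   # str * int (repetition)

print(type(int_result).__name__, type(str_result).__name__)
print(double.__annotations__)  # empty: typed values, unannotated code
```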


I like how with python you need to put quotes around "pseudocode" because the pseudocode is actually functional.


Python has been described as executable pseudocode for many years now.


If you look at the example code you will also immediately understand why Python is “slow”. Also, for some easy performance gains, Python should forfeit some of this dynamism and optionality and just give types like int (no subclassing).


This is the topic of the post if you keep reading :)


What would be nice is an official '@compile' decorator, but there already are a couple of attempts and none of them are (AFAIK) compelling enough to declare them the default.


I think the most compelling effort right now is Mojo [0]. Chris Lattner usually delivers, so I'm pretty bullish on Mojo.

[0] https://news.ycombinator.com/item?id=35809658


Unfortunately, subclassing int is actually used (https://docs.python.org/3/library/enum.html#enum.IntEnum) and, worse, Python's int is arbitrary precision, so you couldn't compile it to a fixed size.
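
Both halves of this can be checked in a few lines. The `Color` enum is just an illustrative name; `IntEnum` and `int.bit_length` are standard library.

```python
from enum import IntEnum


class Color(IntEnum):
    RED = 1
    GREEN = 2


# IntEnum members are genuine int subclass instances.
is_int = isinstance(Color.RED, int)
total = Color.RED + Color.GREEN  # arithmetic falls back to a plain int

# Python ints grow as needed, so they do not fit a fixed machine word.
big = 2 ** 64
bits = big.bit_length()

print(is_int, total, bits)
```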


Well, there is always the option of C/C++...


I like to say that it would be nice if python had as much money invested as javascript.


What prevents someone from “just” making a sort-of JIT compiler by converting Python to JavaScript at runtime and then running the generated JS through V8? Then you could reuse a lot of the optimizations in these JS engines, and the interface between them could just be a thin translation layer between the two languages.


Probably the same problem that prevents a JVM based Python from being popular. There are many popular libraries that use C extensions. Without those the platform isn't usable for many use cases, and it just becomes a niche platform.


The optimisations are around the javascript types, such as the number (float) and the object (which has string properties, a prototype chain, etc). Once you have to express all the Python semantics, like arbitrary precision integers or multiple inheritance, the optimisations no longer "skip" much of the work.
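
A rough illustration in Python of the two mismatches mentioned here, using Python's `float` as a stand-in for a JavaScript number (both are IEEE 754 doubles). The class names are made up for the example.

```python
LIMIT = 2 ** 53

# A double cannot represent 2**53 + 1, so the addition is lost.
float_collapses = float(LIMIT) + 1.0 == float(LIMIT)
# A Python int keeps the exact value.
int_exact = LIMIT + 1 != LIMIT


# Multiple inheritance: method lookup follows the C3 linearization (MRO),
# which a single prototype chain does not model directly.
class A:
    def who(self):
        return "A"


class B(A):
    def who(self):
        return "B"


class C(A):
    def who(self):
        return "C"


class D(B, C):
    pass


mro_names = [cls.__name__ for cls in D.__mro__]
resolved = D().who()

print(float_collapses, int_exact, mro_names, resolved)
```

A translation layer would have to implement both behaviours on top of the JS primitives, which is where the engine's fast paths stop helping.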


I'm 1000% with you, I never understood why GAFAM never invested much in python despite using it quite heavily. Maybe more NIH and politics than engineering issues I guess?


What is GAFAM? Is that supposed to be the new FAANG acronym?

In which case, they actually have, on multiple occasions, including currently Microsoft being one of the biggest contributors to Python core.

At one point Google employed Guido and he worked on Python there, and I believe they have also made a lot of contributions to core.

Can't speak as much to Meta or Netflix though, I simply don't have the knowledge.

It's an architectural problem, more so than a money one.


This is not accurate. Most large contributions up to at least 3.6 have come from individuals, many of whom left or went silent because they could not stand Python core or its development practices.

GvR has contributed very little while at Google.

Even with asyncio most contributions came a) from independent developers who fixed all the mistakes and then b) another non-FAANG developer who augmented it substantially.

FAANG provides power hungry politicians who go for every title and position of power in Python core and the PSF.

The Python speedup project is the first of its kind and so far is based on decades long independent work of Mark Shannon.

I doubt that any JIT expert at Microsoft would want to get involved with PSF politics (career risk), especially because F# and C# are better languages.

(The no-gil project of Sam Gross is the second of its kind, but is unlikely to go in because of NIH politics.)


> What is GAFAM? Is that supposed to be the new FAANG acronym?

It's the French version of the acronym


GÁFÀM


more AAMAM (not to be confused with AMRAAM or Hammam)


Many of the projects the blog post talked about are spearheaded by Meta (StaticPython, Cinder) and Microsoft (optimizations in Python 3.11+)


Or as in Java in its early days, by Sun.


I hate when people say "python is a simple language" when having this opinion requires ignoring everything about how python is designed (more like patched together), and requires using some non-standard definition of "simple".


I think the non-standard definition of simple is the wrong way of looking at it. People's ideas of what is simple and what is complex differ from person to person. C is a simple language from a "what primitives do you have to work with?" perspective; however, the knock-on effects of those primitives, and the interaction of that code with other code, are far from simple.

in the same case, python prevents a simple interface to do a large number of fairly common tasks, parsing a structured text file, input and output into various data formats, and the tools to build upon your work as it gets more complex.

I'm curious to know what your definition of simple is?


In the context of programming language design, a lisp like Clojure would be simple if you ignored all the java things. Haskell is a simple-ish language (despite popular misconceptions).

For me, “simple” needs to be distinguished from “familiar”. Most of the time, people think whatever is familiar is simple, but that just makes the word “simple” less useful. The von Neumann style of programming introduces tremendous complexity.

I would probably agree that C is simple in design. I’m not sure, because I’m not extremely proficient in it, but I think I agree. Although it has tons of complexities in practical use… So perhaps it’s easier to speak of (and compare) the simplicity of higher-level languages.


> in the same case, python prevents a simple interface to do a large number of fairly common tasks

I assume this should read "presents* a simple interface..."?


Yes, I was using speech to text and did not proof read closely enough.


It really was a very simple, well put together language. I think the goal of simplicity has been abandoned. The := operator, introduction of declared types, the way that generators became extremely complex to make them work as microthreads, and the division of python codebases between async code and regular code, which aren't compatible with each other, are proof of this.
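
For readers who have not run into these, a short sketch of two of the features listed. `fetch_value` is a hypothetical coroutine; the `asyncio` and `inspect` calls are standard library.

```python
import asyncio
import inspect


async def fetch_value():
    await asyncio.sleep(0)  # yield to the event loop once
    return 42


# Calling an async function from regular code does not run it;
# it only creates a coroutine object.
pending = fetch_value()
was_coroutine = inspect.iscoroutine(pending)

# Regular code needs an event loop to get the actual result.
value = asyncio.run(pending)

# The := operator binds a name inside an expression.
if (n := len("walrus")) > 3:
    message = f"{n} characters"
else:
    message = "short"

print(was_coroutine, value, message)
```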


> It really was a very simple, well put together language

This is the very premise that I question. Simple in comparison to what? Javascript? I'd actually put python and javascript more or less in the same ballpark of complexity (neither of these are simple languages).

Look carefully at the design of something like Clojure (which has constraints, eg.: around java interop, of a sort that python did not have), and it's hard to unsee all of core python's accidental complexity.


OP you mentioned mojo, have you tried it yet? Jeremy from fast ai (one of their advisors) overhyped it so much I’m taking their claims with a grain of salt for now.


People should really avoid the hype in most cases, mojo is basically its own language right now despite its "python superset" marketing bs. It's vaguely like python except for: no classes, no list/dict comprehensions, no lambda functions, tuples are messed up, no global variables, variable capturing scopes are not what you'd expect, no yield/generators, no async for/async with, etc. There's a long list of common python features it doesn't support. In addition, it's closed source and "sign up for access". This thing is the next V in my opinion.

https://docs.modular.com/mojo/roadmap.html


That sounds like an incomplete Python, not a language that looks vaguely like Python. Is it not conceivable that it will eventually have feature parity with the same syntax?


It may eventually have similar features with the same syntax; but the semantics will be slightly different (because the original Python semantics just aren't suitable for a compiled language). You won't be able to take existing Python code and just compile it as Mojo.


That was all covered in this podcast. I think they're working on it for the next year or so.

https://youtu.be/pdJQ8iVTwj8


It's hard to take it seriously right now in the sandbox. I think what they're proposing is entirely possible, from a gradual typing standpoint, and can't comment on any of the ML infra stuff. But I want to play with it and be able to read the source at the same time.


After hyping Swift for Tensorflow, which is yet another reason for the grain of salt.


Interesting that you mention TorchScript, but not TorchDynamo+TorchInductor from PT2.


Not intentional. I'm not familiar with the latter two. I'll take a look, thanks.


Yeah, no malice or any such thing assumed :P

Let me know if you have questions!


Well your bio indicates I shouldn't ask ;) but how would you differentiate them? I'm very unfamiliar with the ML tooling space


Emailed


It's funny I was thinking aren't both these guys at FB - couldn't they just chat over WP.

But I will say that despite your enthusiasm for inductor/dynamo, I think it's an extremely poor substitute for TS where whole graph compilation is concerned. But hey I get it jansel has spoken and the bright blue future is compiling one stack frame at a time and the graph breaks will continue until morale has improved ;)


Ah, I left FB in February. Now I am in grad school


congrats! enjoy :)


Thanks! We'll see what weird things it brings


> where whole graph compilation is concerned.

Just put nopython=True on it?

I don't think snark from anon accounts are cute btw. Identity or politeness or gtfo.


>Just put nopython=True on it?

umm that just percolates up the errors? you're still not guaranteed to successfully get a whole-program graph (only that if you get any graph at all then it is whole-program).

>I don't think snark from anon accounts are cute btw. Identity or politeness or gtfo.

the lady doth protest a little too much, me thinks - i was just making a joke about the transition being top-down rather than bottom-up. also it's not like you have your home address listed here so not sure i should post mine either.


> the lady doth protest a little too much, me thinks

You quoted it wrong

> also it's not like you have your home address listed here so not sure i should post mine either.

tisk tisk you and I both know that's not what I mean


I immediately did a whois search on the domain, unfortunately couldn't prove it came from the Berenstein bears universe



