I built a small app that retrieves data from a remote server, builds a summary and then pushes it to a local html file.
Number of people that tried it when I told them that to make it work they had to download python, pip, paramiko and install the pycrypto binary? Zero. Number of people that tried it after I just gave them a zip file with an executable? No longer zero.
I tried different libraries but Nuitka was the only one that made everything work seamlessly. I owe this guy a beer.
Edit: I no longer owe Kay a beer. I found the donation link on his website.
This issue has been addressed multiple times with considerable success by pyinstaller, py2app, and cx_Freeze. You can get your single-zip-file distribution package without Nuitka, although Nuitka may have other advantages.
I recently ended up rewriting large parts of a Python project in Common Lisp due to this exact issue; in this case it was one stage of a pipeline that I prototyped with an existing Python library that could output to XML, but delivery of Python applications on Windows is painful.
I think this is one of the reasons Go was a breath of fresh air to a lot of Python and Ruby programmers. The ability to quickly generate a static binary is a great feature.
If you packaged it correctly, you could get by just requiring Python and pip to be installed. But the zip with a single executable is still faster.
This project looks completely misguided. The talk focused on the trivialities of mapping Python to C++ rather than on the interesting problems to be encountered when trying to optimize Python while maintaining its extremely dynamic semantics. Also the benchmarking effort is laughable; pystone is not to be taken seriously (only exercises a tiny part of the language) and pybench does microbenchmarks, which are optimized away. You should try the "real-world" benchmarks from the PyPy and Unladen Swallow projects. And what is the size of the generated code? (E.g. how big would the binary for the entire standard library be?) In your blog, please use less boring subjects than "version x.y.z released". ~ Guido van Rossum
If you posted that comment to somehow discredit this project, then I'd say you miscalculated (if you had other motives, then thanks for posting it, I guess?). Most of the folks on HN know better than to accept the word of an authority figure -- particularly one so harshly worded -- and so will check out Nuitka on their own.
Regardless of whether or not Hayen succeeds here, good on him for at least trying. GVR hasn't even made an effort to improve Python's performance (indeed, by most accounts, Python 3 is even slower than Python 2). In fact, he seems intent on actively discouraging such efforts. For example, why is CPython still using a stack-based interpreter, when several people have already worked toward implementing a register-based one and were only met with derision?
I generally agree with GVR's approach of regarding premature optimization as a mistake, and developer time is usually more important to optimize for than processor time.
On the other hand, sometimes you have to optimize code, and dropping down to C is unnecessarily risky (not to mention tedious, although that comes with the territory). The fact that there's even an alternative today with Numpy and Cython seems to have happened in spite of GVR, not because of him.
I'm sad that someone as influential as GVR would be so consistently rude and dismissive toward an effort at improving Python, no matter how misguided he viewed the effort. This is the kind of attitude that makes leading open source projects suck.
Edit: By the way, I don't mean to imply that I share the pessimism some have about Python's future. Between Python's considerable rate of adoption in upper education, numerics, machine learning, bioinformatics, and various other scientific computing fields, combined with novel approaches to the language like PyPy, Pyston from Dropbox, Nuitka, and Numba, I'm overall pretty optimistic. But it's becoming clear that the committee-driven approach of CPython, driven/impeded by the BDFL, isn't bearing the fruits it has in the past.
It's sad to me to compare my sense of the Python community ten years ago to what it is now. It seems like there was a time in the old days when Python was meant to be fun (and even the documentation matched that attitude), whereas between version politics, its wider adoption for 'serious' work, and GvR's hostile attitude these days, it really doesn't seem like that spark is even there anymore.
I miss the Monty Python jokes and the freewheeling 'BASIC, evolved' feel of old 1.x sometimes even.
For me, this is not about execution speed AT ALL. I wouldn't even mind if it was slower.
It is about having an easy, dead-simple way to provide Windows users with an executable. Without changing my code, without weird build systems. Like it or not, Windows users are still the majority out there.
I like the Lisp world's term "delivery" for this part, where you package up an application to ship to end-users. Optimization can be one part of delivery, but not necessarily the most important part.
In theory. In reality, python freezers have a lot of problems. It can be a real hassle to get a single file, and even if you do they turn out massive. I've had nothing but problems with them in the past.
I'd be happy if the setuptools bdist installer could create a single file that works in 32 and 64-bit Windows and would take care of running 2to3 during the install. The situation now is too painful for me to bother making Windows installers any more.
Exactly. I recently wanted to build a task tracking app for the company I work at. I initially decided on writing it in Python as a server-based app (Flask + SQLAlchemy) since I am familiar with it. Once I found out how damn difficult it is for end users to actually deploy the app, I opted for node-webkit (Backbone.js + nedb) instead. I definitely have no regrets.
I agree. In fact, I think that the slowness of Python implementations is a feature. It forces developers to use standard libraries, which in turn makes the program more concise. This is certainly the case in MATLAB where vectorization is pretty much needed for non-trivial programs, and this leads to improved readability.
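A toy sketch of that effect using only builtins (numpy's vectorization is the same trade at a much larger scale: the loop moves into compiled code):

```python
values = list(range(10))

total = 0
for v in values:     # explicit loop: interpreted bytecode on every iteration
    total += v

# The library call is both faster (the loop runs in C) and more concise:
assert total == sum(values)
```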
For implementing Python, you have at least five options:
- Naive interpreter (CPython). Everything is a dict. Slow.
- Transliterate to some hard-compiled language, but all data is still one kind of dynamically typed object (Nuitka). A little faster, and compatible, but has limited optimization potential.
- Infer types and try to create appropriate code in a faster language (Shed Skin). Hard to do, but promising. (Shed Skin has one implementor.)
- Restrict the language (RPython). Potentially much faster, but incompatible.
- Build a JIT compiler/interpreter combo and handle all the hard cases that require recompiling during execution (PyPy). Hard to do, and results in a huge system, but almost compatible. After 10 years of work, it's finally happening.
If you're willing to restrict the language, it's much easier. RPython was written only to help build PyPy, but the concept could be extended to allow most of Python. Both Shed Skin and RPython insist that type inference succeed at disambiguating types. If you're willing to accept using an "any" type when type inference fails, you can handle more of the language.
The big boat-anchor feature of Python is "setattr", combined with the ability to examine and change most of the program and its objects. This isn't just reflection; it's insinuation; you can get at things which should be private to other objects or threads and mess with them. By string name, no less. This invalidates almost all compiler optimizations. It's not a particularly useful feature. It just happens to be easy to implement given CPython's internal dictionary-based model. If "setattr" were limited to a class of objects derived from a "dynamic object" class, most of the real use cases (like HTML/XML parsers where the tags appear as Python attributes) would still work, while the rest of the code could be compiled with hard offsets for object fields.
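A minimal sketch of the problem, using a hypothetical `Point` class:

```python
class Point:                # hypothetical class for illustration
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# A compiler would like to turn p.x into a load at a fixed offset,
# but any code anywhere can change the object's layout at runtime,
# by string name, through setattr:
setattr(p, "z", 3)               # inject a brand-new field
setattr(p, "x", "not a number")  # change the type of an existing one

print(p.z)  # 3
```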
The other big problem with Python is that its threading model is no better than C's. Everything is implicitly shared. To make this work, the infamous Global Interpreter Lock is needed. Some implementations split this into many locks, but because the language has no idea of which thread owns what, there's lots of unnecessary locking.
Python is a very pleasant language in which to program. If it ever gets rid of the less useful parts of the "extremely dynamic semantics", sometimes called the Guido van Rossum Memorial Boat Anchor, it could be much more widely useful.
> The big boat-anchor feature of Python is "setattr", combined with the ability to examine and change most of the program and its objects. [...] If "setattr" were limited to a class of objects derived from a "dynamic object" class, most of the real use cases [...] would still work, while the rest of the code could be compiled with hard offsets for object fields.
Isn't __slots__ made for that specific use case? (when you want to optimize your code by specifying specific attribute names)
I think the "dynamic" behavior is a sane default (since most people don't need that optimization).
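For reference, a small `__slots__` sketch (hypothetical `Vector` class): declaring the slots fixes the set of attribute names up front, which is what lets CPython store the values at fixed offsets instead of in a per-instance dict:

```python
class Vector:
    __slots__ = ("x", "y")   # the full set of attribute names, fixed up front

    def __init__(self, x, y):
        self.x = x
        self.y = y

v = Vector(1, 2)
v.x = 10                      # fine: "x" is a declared slot
try:
    v.z = 3                   # no per-instance __dict__, so this fails
except AttributeError:
    print("no new attributes allowed")
```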
setattr and similar dynamic features do make Python harder to optimize, but they're not that different from what you see in JavaScript. JavaScript has had an incredible amount of work spent optimizing it, but the result is a bunch of pretty damn fast language JITs that implement the full language, usually only slowing down in cases where actual use of those features requires it. Is it really that hard to come up with something like that for Python?
> Is it really that hard to come up with something like that for Python?
Nope, we just need a large company willing to spend tons of money funding the effort.
PyPy has made incredible strides in this area, especially for long running processes where the JIT has time to warm up. But they need a lot more funding if people ever want Python to get fast.
I'm very impressed that PyPy got their JIT working. But it took 10 years from initial funding by the European Union. It's a hard problem. They had to come up with some new, elegant solutions to make it work. See https://pypy.readthedocs.org/en/release-2.3.x/jit/pyjitpl5.h...
JavaScript is a bit easier because it doesn't have shared-memory concurrency. In Python, you can change a method of an object while that method is executing in another thread, so you have to worry about invalidating code that is currently running asynchronously.
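A toy sketch of the rebinding in question (no actual threads needed to see it; in a real program the rebinding line could run on another thread while a loop elsewhere keeps calling the method):

```python
class Worker:
    def step(self):
        return "old"

w = Worker()
results = [w.step()]              # "old"

# Rebind the method at runtime. A JIT that compiled callers of
# Worker.step must notice this and invalidate that compiled code.
Worker.step = lambda self: "new"

results.append(w.step())          # "new": lookup happens on every call
print(results)  # ['old', 'new']
```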
Guido's response, which is entirely unprovoked and rude (given that it is for a small volunteer effort, that has already achieved something admirable, doesn't ask anything from him, and doesn't harm his CPython in any way), seems to me worse than anything I've read from Linus (who's just a dog that barks but doesn't bite, and just uses the insults for emphasis).
Looks like the response I would expect from someone who has seen tons of "Why not just compile it and see it get magically faster?!?!" queries.
It's easy to compile it, but making the compilation actually useful for a dynamic language is HARD. There's a reason that most dynamic languages will either interpret or JIT, and it's not because the JIT writers overlooked something obvious.
A naive translation of the code will remove a little bit of overhead from the bytecode dispatch, but the resulting code bloat will blow out instruction caches for any reasonable sized code. In small programs with one translation unit, analysis can sometimes work to speed things up, but it quickly becomes either undecidable or intractable.
Basically, on dynamic languages, you can often see static compilation being worse than a naive attempt at native code, especially once the hot path for the interpreter fits in cache, but the generated native code no longer does. Wanting to see results on real world benchmarks is entirely reasonable.
CL goes near C speeds with an INCREDIBLE amount of work to make your program as static and C-like as possible. It also requires intimate knowledge of your particular implementation.
Sometimes it's not even possible (!) to get C speed because of things like float boxing across function boundaries.
Seriously? I think Python is fantastic, and IMHO this is due to Guido. But seriously, how can he possibly comment on someone else's attempt to improve the performance of Python? He should look in a mirror when saying that ;-)
It's a hard problem, I wish more people would attempt to take care of it.
I like Guido but that comment seems too harsh and commenting on how boring his blog titles are is completely unnecessary. The talk was very interesting and is very neat for only a spare time project.
I don't know why (especially key) people in development (Torvalds, now van Rossum, etc.) seem to have a hard time phrasing their thoughts with a little more consideration.
Would it hurt to put it more like "this project is not of interest to me"? What is so completely misguided about working on something and trying things? Or to restrain oneself from bashing the effort as "laughable" and instead offer ideas for improvement, or present the alternatives without discrediting the whole thing? And if he is bored, why not just skip the whole thing?
People in these positions (e.g. Torvalds) are generally extremely opinionated (which is good, because it can provide direction to a project in its early days), but have also spent years listening to people without sufficient skill or experience attempting to provide ideas, patches, or commentary that they're not qualified for.
At some point, the effort of letting someone down gently fifty times a day gives way to a curt but efficient form of communication.
In this case, GvR is just laying out what he sees as issues, take them or leave them. If you don't care for his opinion, fine, if you do, there you go.
With regard to language design, if you start designing without really caring about it, you probably won't finish. And that means that the designers of popular languages typically have unreasonable opinions. How that manifests itself depends a lot on the person, though.
Wow, I'm really surprised. Guido is usually incredibly nice, but he comes across as very condescending in that comment.
I love Python, it's my favorite language by far - but I love all those different tools (numba, pypy, hope, nuitka, cython etc) that make you sacrifice a small amount of dynamic magic in exchange for significant speed ups.
I don't have to use them - but when I need to write fast code, it's really nice to be able to do so in Python.
Are the names on the comments verified? This seems very much out-of-character for Guido, he's generally been quite supportive of efforts to improve Python's speed (eg. Unladen Swallow, PyPy) or add static typing as a library. It occurs to me that with this blog interface, I could post as "Ken Thompson" and nobody would be the wiser.
I can verify that van Rossum's comments are by him. I saw him walk out of the aforementioned Nuitka presentation at EuroPython a few years back, and clearly out of annoyance with it.
That Google Groups thread shows that he's irritated because he thinks that the Nuitka author doesn't understand the issues: "... he is incredibly naive about what kind of optimizations he'd like to apply. (He basically doesn't seem aware of the difficulties arising with static analysis of Python.)" While the Unladen Swallow and PyPy developers do understand the issues.
I think it's in character. I gave a talk at another EuroPython on measuring performance. Partway through he asked "isn't this just a repeat of the timeit module"? My response was "yes, I'm explaining why it works the way it does." I think that was sufficient to mollify his irritation.
I sympathise with the developer because I think the tone of the criticism he's been getting is very far short of nice.
...But...On a purely technical level...I see a lot of problems with his response.
I think there's real merit in this project on the distribution side of things --- not the optimisation side. I hope he pivots the project in that direction.
That post by Behnel is a good find. FWIW, I was at the talk, and my memory is that I also felt the speaker didn't know what he was talking about regarding optimizations, and took far too long to talk about it.
So when the talk is titled "for the first time, there is a consequently executed approach to statically translate the full language extent of Python, with all its special cases, without introducing a new or reduced version of Python", a listener would expect that it be more advanced than previous work. To be fair, neither Aycock's nor Salib's work was complete, but they did enough groundwork to show that static analysis of the type that Nuitka was exploring would not be able to achieve its stated goals.
But all of this pointing to a discussion from 2012 is pretty pointless, as people do use it for distribution - something which wasn't touched on during the talk, as I recall - while the author did talk about aiming for type inference at compile time, when all evidence is that the speaker had little idea of the actual issues, as previously reported from multiple earlier attempts.
I honestly don't know what to do when someone presents a talk as a pure hobbyist, puttering around on a project with little interest in what others have done, and who doesn't understand the audience enough to gauge which details are of interest and which aren't. There's a culture mismatch, certainly, but that's what makes it harder to assign fault.
Should Hayen have realized that the project wasn't at the right level for EuroPython, or that the future project goals were overstated? Did EuroPython get more advanced over time? Did van Rossum not allow that weekend hobbyists don't have the experience to judge things? Should van Rossum never say anything negative in public about a project (and if so, at what level of fame does one need to bear that in mind)? I certainly have no answers to those.
That is a very good point, and it would be a shame if Nuitka's developer (and people in this thread) would have gotten a wrong impression on GvR from something like that.
So this would be the same Guido van Rossum who made a completely misguided analysis of the language he had birthed, tried to make it "grow up" in version 3.x, browbeat us all about the dubious benefit of this "new" cruft-encrusted language, causing strategic drift for the entire ecosystem? The best thing that could happen to Python would be a benevolent, and mercifully silent, retirement of GvR. And I speak as a long term Pythonista.
Kudos to anybody who is trying to move Python into a performance envelope similar to Lua and Javascript, not to mention Golang, Julia, or Clojure.
GvR fading away would help but not cure the problem he started, i.e. Python design by committee. The Python culture results in substandard implementations of new features after waiting years for them to come to fruition. (asyncore/asyncio... :( )
Of all the current dynamic languages, Python is the slowest moving on almost every front. Ruby became popular around the time Python 3.x was coming out; at the time it was much slower and riddled with 1.8/1.9 issues. Since then Ruby has surpassed Python almost everywhere, whereas Python 3.x, which should have had the freedom to do great things considering it broke backwards compat with 2.x, has languished with its slow performance.
Sure, there are great things happening in and around the numpy community, but that is tiny compared to the great things happening in Julia, Rust, Ruby, Golang and Javascript. (I include the last two despite my personal opinion that they are not as good, but one can't deny they have made progress.)
I loved Python but it's standing still and I can't afford to do that anymore. I said my very solemn goodbyes to programming Python full time a while ago and I think it's one of the best decisions I have made.
That being said. Good on this dude for doing something about Python performance without doing Cython style stuff.
It's not the only killer feature. Where Python 2 lacks consistency Python 3 fixes it. Where Python 2 core libs have stagnated, the Python 3 ones have been improved. All future development from core CPython developers is going into Python 3. That's a pretty killer feature.
Slow, incremental improvement is not a killer feature. Compiled Python 10x faster. That would be killer. Proper multi-core concurrency. Killer. Modest (yet breaking) improvement over 6 years while all the action goes on in other languages? Not killer.
But you cite a very important point: all the improvement, modest as it is for the majority of users, is going to 3.x and yes, that's why I've moved. But I can tell you it's only because of the constant nagging and threats about abandoning 2.x. Nobody would have moved for any actual "feature" of 3 were it not for the fear of being abandoned. It's a stick-only strategy. No carrot.
In my book, these are only killer features compared to Python 2, not in comparison to the rest of the world, and more along the lines of minimum necessary to justify the pains of a breaking change.
Agreed on all points, though I think you underestimate to what extent Numpy is "carrying" Python. Without Numpy, Python would be dead and buried in my view. Sure, tons of stuff is "pure" Python, but the core of its cred comes from the high quality of some libraries which are built around Numpy (Pandas being but one example).
Separately though, sometimes when you want to break the dead hand of the committee-driven, value-destroying hold of a small group of individuals, you have to mercifully push aside the figurehead that provides their credibility. That figurehead is Guido van Rossum, and his post speaks more than what I could about how his time is over.
I have a hard time believing that NumPy is "carrying" Python to the extent you suggest.
While a large number of people use Python because of NumPy, my work in chemical structure analysis and search doesn't depend on it. I use the package about once a year, and few of the chemistry analysis tools I use have a dependency on it. I used to be involved with the Biopython project, and while NumPy is strongly recommended for a couple of the modules, it's not a requirement.
Then there's all the people who use it because of Django, and Zope before that. (I remember that the 2000 Python conference seemed to be half Zope developers.) Plus the win32 extensions get about 7,000 downloads per week, implying a pretty active base in that area.
Checking the PyCon talks, only about 5% of them seem to deal with numeric work. (Then again, there's SciPy ... but there's also DjangoCon and local non-NumPy meetings; certainly few in our local user group meeting work with NumPy.)
How then did you draw your conclusion?
Python-the-language did make three changes to support numerics: multi-dimensional array slicing, the Ellipsis notation, and most recently the infix operator for matrix multiplication. But I believe that any language which couldn't handle a NumPy-like module would not be that successful in the first place.
I think you're overestimating the number of people using Numpy. There's no way it's more popular than Django; scientific computing is mainly academic anyway.
In addition, the majority of scientists don't install numpy from PyPI, because it almost never works. Many use the numpy that comes with the OS distribution (e.g. Ubuntu), a Python distribution (ActiveState, PythonXY, Enthought, Anaconda), or binary installers on Windows. So I would say that numpy usage numbers are a lot higher than what was shown in the other comment.
I am sorry, and at risk of downvotes, web dev is a much faster (dare I say it more fickle) crowd than science. Python is not a whole lot better than Golang, Ruby and of course, Node.js at this use case. Absolute numbers are not persuasive to me here. There are far fewer scientists than web devs (your stats imply ~ 1:2). Indeed I would venture to suggest that your data means that numpy punches well above its weight relative to other languages' science:webdev ratios. Let's not forget that Numpy and its direct linear ancestor, Numeric, has been dominant for more than a decade. Show me the Python web framework which can say the same. Flask/Bottle/Pyramid/Django/Tornado - pick your fashion.
Another point (to the grandparent posts): putting Numpy into a "science" pigeonhole is erroneous. It is massively used in finance, engineering, bioinformatics, and statistics.
... and there can still be more scientists using Python than web devs using it.
Python web development is niche, despite the good frameworks. If you look at job postings, it's eclipsed by RoR, .NET, Java... maybe even PHP.
> Let's not forget that Numpy and its direct linear ancestor, Numeric, has been dominant for more than a decade. Show me the Python web framework which can say the same.
Django was released almost 10 years ago, and I remember it was already popular, a lot of people migrated from Plone.
The point is that Python's USP is Numpy. Python's USP is not web dev. If Python were to disappear, there are many, many credible web dev alternatives. Not true of Numpy. Take that from an R, Julia, and C user. Numpy has the perfect combo of speed wrapped in an expressive language. I don't see that anywhere else.
And I think django tends to have more downloads than numpy. For django I tend to have a separate virtualenv so I know what to deploy on the server (so that's at least 2 downloads, plus one more for every new project I start). When I use python to calculate something/analyze data I don't care about deploying, and I think the scientific crowd is less likely to care about best practices in general.
Just a thought, not so certain myself how important it really is, but:
The web dev world also might mostly abandon Python very fast, whereas once you've got a foothold in academia there is a good chance you're going to keep it for years. Having a language in universities' curricula gives exposure for a long time.
I think this is a really solid approach. It's the same thing I was trying to do for PHP with phc (http://phpcompiler.org/) - compiling down to C using the built in stdlib and C API, then using full program optimization.
Long term, it won't be as fast as a full JIT, as the Facebook HPHP team showed, but Python doesn't have a JIT of the same caliber, so this is probably useful for tons of people.
Dunno if the author is around, but they might find some of the stuff from my PhD relevant, especially how the static analysis worked and some of the challenges of compiling using the C API: http://paulbiggar.com/research/#phd-dissertation
Checked the contributors list on Github[0] and most of the code is written by a single person, Kay. Isn't that amazing? A project this large and valuable, done by a single person in his own time. I wish this project gets more popularity.
Great project. Compiling a simple PySide application worked fine on the first try.
If you bundle all dependencies you get a binary about the same size as a binary produced by py2exe. I will do some speed comparisons next.
It is somewhat faster than CPython already; it doesn't yet perform all the possible optimizations, but a 258% factor on pystone is a good start (number is from version 0.3.11).
Holy cow. And here I'd set aside Python in part because of the difficulty in getting performant executables (I was getting visible lag in a turn-based SDL game ...)
Amen to Cython. When used well (eg. with C types for variables and a few compiler directives, which is not very painful at all), Cython emits code that is extremely similar to idiomatic C for my scientific computing workloads.
Might want to take "somewhat faster" with a grain of salt. I just tried it on a smallish program (CPU-bound, mostly string operations) and got a 20x slowdown.
To be honest, I found that in my use case (game dev, supported by bindings to a C library), PyPy was actually slower than CPython. I'd be curious to dig up my code and compare performance to a Nuitka-compiled executable.
For games, other soft realtime, and run once code, pypy is slower. The newer GC is better, but the JIT still causes pauses. The slow warm up of the jit, no type hint saving, and slower interpreter is why it is not good on run once code. Of course if your python extensions are calling optimized assembly routines, or hardware, then it will be faster than pypy code written in python.
Also, using a JIT is not possible on some platforms (iOS, or if the CPU architecture isn't supported by pypy).
Not to say that pypy isn't better for certain tasks of course.
That's pretty much what Nuitka is. Unfortunately, PyPy is only fast because it can take advantage of runtime information in the JIT. AOT compilation loses that advantage.
Not easy at all. ART generally compiles quite static languages (Java). Python is an incredibly dynamic language. The challenges for AOT compilation are vastly different.
On a tangentially related note: here is a comparison of various Python runtimes (interpreters and compilers including Nuitka and PyPy) on a fairly complex scientific code: http://arxiv.org/abs/1404.6388
The shootout, however, is from August 2013 so getting a bit old by now.
I would love to make distributing games built on Python easier. Jessica McKellar suggested this should be a priority[0] to make the language more accessible by getting kids involved in making games and being able to share them, easily, with their friends. These days Javascript is kicking our butt in this area.
I've been writing little helper libraries on top of Python 3.4 + pysdl2 as I've been working on games and demos for my (unfortunately cancelled) Pycon talk. One area I've only played with, unsuccessfully, is getting packaging going on Nuitka or some other compiler. If anyone wants to get together to make it awesome get in touch.
If you filter by Python, you can compare some results running on both CPython and PyPy. I would be curious to know what it is about these benchmarks that makes PyPy perform poorly. I would also be interested to see how Nuitka performs.
I'm supremely sceptical about those benchmarks. Go take a look at the code being tested. I would welcome a serious look at development time versus resources used, with code that is plausible in production. I read the code used to test Django. It's not reasonable code.
Correct me if I'm wrong here, but isn't that only helpful if you have multithreaded python programs? I have found that if my process is too slow, I can consider porting it to numpy/numba, cython, using pypy or dividing up the work using multiprocessing. multiprocessing is barely more work than using the threading module, and completely avoids the GIL AFAIK.
Well, multiprocessing (the module) is problematic because not everything can be pickled. If you are working with simple functions this can be fine, but often we use third-party libraries that use lambdas (which can't be pickled). Often you don't know why something can't be pickled.
What STM provides (as I understand from the PyPy team's blog posts) is real threading without having to worry about pickling.
And yes, it only helps if you have multithreaded programs. However, multithreaded and parallel programs are very relevant today.
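A small sketch of the pickling limitation, using the stdlib `pickle` module directly (which is what multiprocessing uses to ship work to worker processes):

```python
import pickle  # multiprocessing serializes work items with pickle

# A named, importable function pickles fine (it is stored by reference):
pickle.dumps(len)

# A lambda has no importable name, so it can't be shipped to a worker:
try:
    pickle.dumps(lambda x: 2 * x)
    lambda_picklable = True
except Exception:
    lambda_picklable = False

print("lambda picklable?", lambda_picklable)  # False
```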
I found that PyPy sometimes has unexpected slowdowns. When we were porting some offline processing tools from Python to PyPy, the craziest one was building strings via += and flattening lists via sum(arrays, []), both of which are much slower than on CPython.
I find that unexpected. Java has had string builder optimization for a long time, and CPython is much much faster in this respect. It's not always easy to use "".join when using a string, so you end up having to build a separate array of strings in some cases. And building arrays isn't always that fast either. And [].join doesn't exist, so summing arrays is always kinda slow.
Anyway, all that is to say: I really like PyPy, and we use it a lot, but those _unexpected_ crazy slowdowns are unfortunate.
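For reference, the usual workarounds for both slowdowns discussed above look like this (toy data, just to show the pattern): accumulate pieces and join once instead of repeated +=, and flatten with itertools.chain instead of sum(arrays, []), which is quadratic on any implementation:

```python
# Sketch of the standard alternatives to += string building and
# sum(arrays, []) list flattening.
from itertools import chain

# Instead of: s = ""; for i in range(5): s += str(i)
pieces = [str(i) for i in range(5)]
s = "".join(pieces)                # single linear-time concatenation

# Instead of: flat = sum(arrays, [])  -- quadratic copying
arrays = [[1, 2], [3], [4, 5]]
flat = list(chain.from_iterable(arrays))   # linear-time flatten
```

Neither workaround depends on the interpreter, which is part of the complaint: on CPython the naive versions are often tolerable, so the cost only shows up after porting.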
Why are people so quick to trust a new compiler?
Have you audited that compiler's source and built the compiler from it with a known-good compiler? Are you inspecting the resulting .pyc files, and in this case, the resulting PEs? It's a super easy way to inject a compromise into a package that will probably get widely distributed.
Full disclosure: the link is down for me, so I haven't read the article. This is a comment that has been building up in me for a while, and it's not specific to Nuitka. The same goes for new frameworks/languages/etc.
>Have you audited that compiler's source and built the compiler from it with a known-good compiler? Are you inspecting the resulting .pyc files, and in this case, the resulting PEs? It's a super easy way to inject a compromise into a package that will probably get widely distributed.
I think it's unlikely that someone would design something as complicated as a new compiler / programming language, open-source it, and attach their real name to the project just to hide an exploit in it.
Why limit yourself to new compilers? Every new version of an old compiler could be suspect, every line of code in general. There's no point in being this paranoid, nobody has time for all the audits that might be theoretically desirable. Almost everything you do on a computer relies on trusting an untold number of components.
As others have pointed out, these claims have to be taken very carefully and the main value (at least for now) seems to be the easy packaging and deploying.
I program Django websites and work in an environment where we like to deploy often. I'm not sure we could tolerate the extra wait in compile time for deployment. At least we would need to take a serious look at the speed/resource gain before considering it.
This seems immediately relevant to addressing Python's biggest problems: it's slow, and it's hard to distribute. Relevant quote:
All code generation is done to C++ rather than architecture
specific machine code. The cost of porting and maintenance
of architecture specific code generation is now more amortised
I'm very interested in some benchmarks on this and various other compilers/packagers/runtimes for Python.
Sidenote: A very long time ago I had some great luck with Perl's packagers: PAR, PerlApp, etc. but they all basically just bundle in the basic Perl runtime with the guts of your program so there's no real benchmark difference. I suspect the same is not true of efforts like this for Python.
I really like the idea that it compiles Python but is (99.9%) compatible with CPython -- that way you can still use the many C extensions (I don't want to do my projects any more without my own Python extensions written in C).
I don't know much about compilers and runtimes, but how come there are SO MANY of them for Python? Is there something about Python that makes it so easy to write runtimes for it?
I think it's because lots of people write lots of code in Python, but then hit performance problems at some point and try to tackle those. Having a nice, regular language makes it relatively easy to rewrite a runtime to scratch your itch.
I suspect we would have seen this more with languages like Perl, except Perl is virtually impossible to parse.
I think it's a matter of numbers. There are a lot of people who use Python. (https://blog.pythonanywhere.com/67/ estimates in the low millions.) Some of them like to work on alternate implementations of Python.
I think it's largely because while a lot of people really love the language, they struggle with the performance and scaling issues that arise with the standard CPython implementation.
Python is the best "glue" language. Large third-party libs, easily readable/writable (even by non-programmers), old (it predates Java), established, great user community.
So people in different realms (Microsoft: IronPython; JVM/enterprise: Jython; the rest of us: CPython; scientists/performance: Cython, PyPy, Stackless, Psyco; academics/experimenters: many more) all want to use it and make versions that integrate with their tools.
Popularity and a syntax that can be parsed, plus being widely taught. Pretty much the same reason why we had plenty of Pascal compilers back in the days.
And quite unlike why we have that many JavaScript transpilers.
A few of the compilers might run on the AST, and thus have no complete alternative implementation. Good module support helps with that, although it's still not close to VM-based languages like Java.