by saynsedit 3535 days ago
I've actually written a highly optimized asyncio-based Python web server in the past. I meticulously optimized every component, using all the standard CPython optimization techniques (heavy use of stdlib, minimizing method lookups).

In the end, my implementation was nowhere near nginx. I even ran it under PyPy and it fared no better. Then I realized the oxymoronic nature of writing an optimized web server in Python.
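For anyone unfamiliar with the "minimizing method lookups" trick mentioned above, here is a minimal sketch of what it usually looks like in CPython. The function names are hypothetical and only for illustration, not taken from the server described. Whether it helps at all depends on the interpreter version, so measure before relying on it.

```python
def upper_all_naive(chunks):
    """Looks up out.append on every loop iteration."""
    out = []
    for chunk in chunks:
        out.append(chunk.upper())
    return out


def upper_all_hoisted(chunks):
    """Binds out.append to a local name once, outside the loop."""
    out = []
    append = out.append  # one attribute lookup instead of one per item
    for chunk in chunks:
        append(chunk.upper())
    return out
```

Both functions return the same list; the second only avoids repeating the attribute lookup inside the loop.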

5 comments

If you wanted to run in PyPy, you should've written your code to be PyPy-friendly from the beginning. Standard CPython optimizations don't usually fare well in PyPy, and you can end up with even slower speed if you follow those. Still, you didn't mention profiling the hotspots of your server, nor using Cython to optimize them.

I don't see how writing an optimized web server in Python becomes oxymoronic -- it can still be the fastest-performing Python server, and have a valid use case for those who want to work in Python.
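As a rough illustration of what "profiling the hotspots" could mean here, this is a small sketch using the standard library's cProfile and pstats. The request-parsing functions are made up for the example (an assumption, not code from the server under discussion); only the profiling helper is the point.

```python
import cProfile
import io
import pstats


def parse_request_line(line):
    """Hypothetical stand-in for a hot function in a web server."""
    method, path, version = line.split(b" ", 2)
    return method, path, version.strip()


def handle_many(n):
    """Hypothetical workload: parse the same request line n times."""
    line = b"GET /index.html HTTP/1.1\r\n"
    return [parse_request_line(line) for _ in range(n)]


def profile_hotspots(func, *args, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()
```

Calling `print(profile_hotspots(handle_many, 100000))` prints a table of the most expensive calls, which is where a Cython rewrite would be considered first.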

Your PyPy point is fair. A PyPy-oriented optimization effort could have improved performance. Not sure by what magnitude, though.

I did extensive benchmarking until there were no more hot spots; it didn't help. Dropping into C or Cython was a non-goal. For dev, an embedded Python web server is convenient, but it doesn't have to be fast. When it comes to performance, it always makes more sense to use a native-code web server in production.

I mentioned profiling and benchmarking because they were missing from your comment. Totally agree with you: Python-implemented servers are usually used in dev ops, and then you move to a "standard" HTTP server (be that nginx, Apache, whatever).

I still think that a Python server can have its use, and there's nothing wrong in trying to make it as fast as possible.

I'm curious, why were C/Cython non-goals?

"Dropping into C or Cython was a non-goal."

Why not Cython?

At the time, I preferred writing it completely in C++ to having a mixed Python/Cython codebase.

What does "not even close" mean in this context? I find it a bit interesting that the most obvious conclusion from the benchmarks in the story, and the linked one, is that uvloop and Go are high-performance with consistently low latency, and that high-performance Python is on par with Node.js, but that in general the real jump up is towards Go and C++(?).

It's great to see an order-of-magnitude leap on the Python side, but on the face of it I'm not sure the leap really is enough to enable a different class of services in Python. Perhaps I'm being too pessimistic. I know I'd be happy to be able to "ignore" Node, and only consider e.g. Python for most things and Go for some things, just to limit my tech stack.

But where does nginx with Lua fit? Is it another order of magnitude above Go for dynamic content?

Re "not even close": At least 10x slower

That was with tweaking Nagle, using edge-triggered epoll, and using sendfile(), not just Python optimizations.
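For readers who haven't used these knobs from Python, here is a minimal sketch of the three techniques named above, using only the standard library. The helper names are hypothetical, and this is an assumption about how such tuning might look, not the code that was benchmarked. Edge-triggered epoll is Linux-only.

```python
import select
import socket


def disable_nagle(conn):
    """Turn off Nagle's algorithm so small writes are sent immediately."""
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


def watch_edge_triggered(epoll, sock):
    """Register a non-blocking socket for edge-triggered read events."""
    sock.setblocking(False)
    epoll.register(sock.fileno(), select.EPOLLIN | select.EPOLLET)


def send_static_file(conn, path):
    """Send a file over a blocking socket and return the bytes sent.

    socket.sendfile() uses os.sendfile() where the platform supports it,
    so the file contents are not copied through Python-level buffers.
    """
    with open(path, "rb") as f:
        return conn.sendfile(f)
```

With edge-triggered registration the event loop has to drain the socket until it would block, since a readiness event is not repeated for data that is already pending.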

Honestly you sound confused.

Performance is not a magical inherent property of a language-and-webserver combination, and will vary wildly with the application. In my experience practical performance has much more to do with architecture, algorithms and "the other bunch that makes things efficient" -- not so much with a language.

It's also only one facet. And usually not a very important one, either.

> Performance is not a magical inherent property of a language

Well, it is. However, most performance characteristics of a language are well understood. Python and Ruby are mostly slower than a lot of other languages.

> practical performance has much more to do with architecture

Well, sort of. Not every case does well with these kinds of languages.

However, in most cases it's just fine to use them.

My company moved one product from Python to Scala. Everything was slower in Python, but in 90% of our use cases that didn't even matter. However, we had thread- and calculation-heavy work (i.e. shuffling/changing large lists/maps in memory) where Python was just slow. I guess we could've written a library for these kinds of transformations in C. Another problem was PDF generation, which was really, really slow in Python (for bigger PDFs; smaller ones were just fine).

We are happy to use Scala, but I didn't find Python bad or weak. You are pretty fast with it and the tooling is just amazing. Also, the ORMs in Python (Django ORM and SQLAlchemy) are superior to everything I've seen in the JVM world. I guess they are even the best ORMs out there. If I were developing more towards a cloud architecture, I would probably use Python again. You can just do more if you have room for an "infinite" number of servers.

Still, Python is a really great and well-designed language. I would encourage everybody to look into it.

True, but so what?

If you want to create a static file server, Python is just a bad choice.

Python is a good choice if you need custom, complex logic or access to non-file data stores. And if you need that, Nginx isn't a choice (yes you can use modules -- eventually you recreate the same problem).

So yes, don't try to replace Nginx or HAProxy or PostgreSQL or such with Python. But the gaps they don't cover are good candidates for Python. And it's nice if they're fast.

> In the end, my implementation was nowhere near nginx.

How could it be?

Well, it could be near, like a 2x slowdown, instead of the 10x slowdown I saw.

As a matter of interest, what makes nginx so fast?

Tight C, and more importantly, very little dynamic/runtime support.

(Eg, nearly static configuration, very few flow or looping constructs, unless you explicitly use something optional like Lua)

> unless you explicitly use something optional like Lua

I've actually at one point considered writing some light web apps as nginx modules, in Lua. In the end though, it's usually enough to pluck the lowest-hanging fruit, which (as a rule) is caching... Cache as much as possible (including when you have a Python web app behind nginx).
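To make the "cache as much as possible" point concrete on the Python side, here is a minimal sketch of an in-process cache with a time-to-live, for an app sitting behind nginx. The `ttl_cache` decorator is a hypothetical helper written for this example, not a library API, and it assumes hashable positional arguments and a single process.

```python
import time
from functools import wraps


def ttl_cache(seconds, clock=time.monotonic):
    """Cache a function's results per argument tuple for `seconds`."""
    def decorator(func):
        store = {}  # args -> (time stored, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < seconds:
                return hit[1]  # still fresh, skip the expensive call
            value = func(*args)
            store[args] = (now, value)
            return value

        return wrapper
    return decorator
```

Decorating an expensive page-rendering function with `@ttl_cache(30)` would recompute each page at most once every 30 seconds per distinct argument. The store is never pruned, so this only suits a small, bounded set of keys.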

You might be interested in OpenResty.