Hacker News
by mpdehaan2 3901 days ago
I've been seeing a lot of Python vs Go stuff lately and I think a fair amount of the folks involved in these are not aware of general Python web architecture patterns.

Of course something compiled directly is going to be a bit faster, but development time is important too. Python has more libraries and is (for many people) probably faster to write.

Serving multiple requests is best handled by putting a preforking web server in front of Python, whether Apache, nginx, etc. This lets multiple requests in without any async voodoo code. Twisted, for example, is not the right answer in this case, because it doesn't get you multiple processes and it changes the way you write code (async event-driven code is more time-consuming to write and debug).

On the backend, your web server does not start long-running backend processes itself, but you can launch them using things like Celery, a distributed task queue that lets you start jobs and so forth. Celery can run on any number of machines, and your backend can scale independently of your frontend if you wish.

Historically, some very computational parts of Python were often written with C bindings. While I haven't done so, things like Cython may also be promising for extensions. There are also things like ctypes for quickly taking advantage of native libraries from a Python function.
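For anyone who hasn't tried the ctypes route, here is a minimal sketch of calling a native function from Python. It assumes a POSIX system where the C standard library can be found (or is already loaded into the process); the wrapper name `c_strlen` is made up for illustration.

```python
import ctypes
import ctypes.util

# Locate the C standard library. If the lookup fails, fall back to the
# symbols already loaded into this process (assumption: POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Hypothetical wrapper: length of a NUL-terminated byte string via libc."""
    return libc.strlen(data)


print(c_strlen(b"hello"))
```

Declaring `argtypes` and `restype` up front is optional for simple cases, but it keeps ctypes from guessing at the conversion.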

Personally, given, I like how Go has things like channels, but I would never adopt a programming language for just one specific feature when I lose out on other features that are valuable to me, for instance, an object model.

(I'm also really curious to see how the typing options in Python 3 play out)

Anyway, I mostly wanted to point out, since most people are doing web services, that you should be fronting Python with some sort of web server that allows preforking, and then the concurrency issue, in my experience, becomes not a thing.

Many backend libraries can easily take advantage of libs like multiprocessing, which are not 100% friendly in their more complex IPC-type cases, but are pretty workable.
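As a rough sketch of the simple (non-IPC-heavy) case, a worker pool from the standard library looks something like this. The function names are invented for illustration, and it assumes a POSIX system where the "fork" start method is available.

```python
import multiprocessing as mp


def cpu_heavy(n: int) -> int:
    """Stand-in for CPU-bound work: sum of squares below n."""
    return sum(i * i for i in range(n))


def run_in_processes(inputs, workers=4):
    """Fan the inputs out to a pool of worker processes and collect results."""
    # Assumption: the "fork" start method, which only exists on POSIX.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=workers) as pool:
        # Arguments and return values are pickled to cross the process boundary.
        return pool.map(cpu_heavy, inputs)
```

Calling `run_in_processes([10, 100, 1000])` returns the results in input order. The friction mentioned above tends to show up once workers need to talk to each other rather than just return a value.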

5 comments

> I've been seeing a lot of Python vs Go stuff lately and I think a fair amount of the folks involved in these are not aware of general Python web architecture patterns.

I absolutely agree, but I also think that deploying Python on the web still has too much of a learning curve. Even the standard nginx > gunicorn > wsgi model is kind of a pain. Couple that with Celery and init systems, and you're basically down a sysadmin rabbit hole.
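For readers who haven't seen that chain, the "wsgi" end of it is just a callable. Below is a minimal sketch using only the standard library; the helper `call_once` and the response text are invented for illustration, and nothing here opens a socket.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable: the piece a preforking server loads in each worker."""
    body = b"Hello from a worker process\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_once(app):
    """Hypothetical helper: invoke the app the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a fake "GET /" request
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Assuming the file were saved as `app.py` (a hypothetical name) and gunicorn were installed, something like `gunicorn --workers 4 app:application` would run it in four worker processes, with nginx proxying in front.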

> Anyway, I mostly wanted to point out, since most people are doing web services, that you should be fronting Python with some sort of web server that allows preforking, and then the concurrency issue, in my experience, becomes not a thing.

Spot on. Concurrency vs parallelism, and clean distinction of responsibility (web server vs backend threads).

> Many backend libraries can easily take advantage of libs like multiprocessing, which are not 100% friendly in their more complex IPC-type cases, but are pretty workable.

This is painful. In JavaScript I can be careless (well, to a great extent, for people like me who like magic) using promises. Python can achieve this too, but only with a great deal of effort learning coroutines, gevent, or asyncio. Though I have to admit that JavaScript has its own problems when it comes to parallelism.

I have done things with gevent, spawning greenlets and responding to the user immediately. The thing is, the backend should always be stateless, so worker models like Celery, RabbitMQ pub/sub, etc. are more popular.

> Personally, given, I like how Go has things like channels, but I would never adopt a programming language for just one specific feature when I lose out on other features that are valuable to me, for instance, an object model.

That really depends on your requirements. If you need multithreading (not multiprocessing), you cannot use Python.

> That really depends on your requirements. If you need multithreading (not multiprocessing), you cannot use Python.

About to show my ignorance, but when is multithreading useful when multiprocessing is not? Assuming it is a use case that is suited for a high level dynamic language to begin with.

Multithreading lets you share memory. Multiprocessing requires you to serialize your data and copy it to the other processes. That can be very expensive.
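A small sketch of the difference, using only the standard library. The `pickle` round trip stands in for what multiprocessing does to arguments and results; the variable names are invented for illustration.

```python
import pickle
import threading

data = {"rows": list(range(5))}


def append_row(shared):
    """Runs in another thread and mutates the very same object."""
    shared["rows"].append(99)


t = threading.Thread(target=append_row, args=(data,))
t.start()
t.join()
thread_saw_same_object = data["rows"][-1] == 99  # the mutation is visible here

# Crossing a process boundary means serializing, shipping the bytes,
# and rebuilding a separate copy on the other side.
copy = pickle.loads(pickle.dumps(data))
copy["rows"].append(-1)
copy_is_independent = data["rows"][-1] == 99  # the original is untouched
```

For a five-element list the copy is free in practice; the cost being described shows up when the payload is large or is shipped on every call.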

This only matters for performance-critical applications.

Let's say you have a REST request come in. In order to fulfill that request you need to make 10 REST requests of your own to various back-end systems. If you can make some of those requests asynchronously you'll greatly reduce your response time. While I suppose you could mangle your way through it multi-process, it seems like a bastardization of the model.

If you're just sending HTTP requests, regular threading in Python works fine - waiting for a socket response doesn't block the execution of other threads.

Python threads are real threads, and things like blocking socket IO do not block Python execution in other threads.
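A rough sketch of the ten-requests scenario with plain threads. `time.sleep` stands in for a blocking HTTP call here (it releases the GIL while waiting, as a blocking socket read does); `fake_backend_call` and `fetch_all` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_backend_call(i: int) -> str:
    """Hypothetical stand-in for a blocking HTTP request to a back-end."""
    time.sleep(0.2)  # blocking wait; other threads keep running meanwhile
    return f"response-{i}"


def fetch_all(n: int = 10):
    """Issue n blocking calls on n threads; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(fake_backend_call, range(n)))
    return results, time.perf_counter() - start
```

Run one after another, ten 0.2 s waits would take about 2 s; with one thread per call the waits overlap, so the elapsed time should land much closer to a single wait.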

> While I suppose you could mangle your way through it multi-process, it seems like a bastardization of the model.

That kind of sounds like you're nitpicking over an implementation detail. Which brings me back to my original confusion. If I have an interface that allows me to articulate concurrent tasks, I don't see the difference between threads, processes, or separate machines.

Again, this is assuming we're not in a resource constrained environment, and we're not doing something so processor heavy that it really should be compiled.

Think I'd rather use asyncio in this case. Assuming the most recent Python might not be fair, though, I guess.

It looks like a job for an event loop (tornado/twisted/asyncio), not for a thread pool.
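A minimal sketch of what the event-loop version could look like with asyncio. It assumes Python 3.7+ for `asyncio.run`, and `asyncio.sleep` stands in for a non-blocking HTTP call; `fake_backend_call`, `gather_all`, and `run_event_loop` are hypothetical names.

```python
import asyncio
import time


async def fake_backend_call(i: int) -> str:
    """Hypothetical stand-in for a non-blocking request to a back-end."""
    await asyncio.sleep(0.2)  # yields to the event loop while "waiting"
    return f"response-{i}"


async def gather_all(n: int = 10):
    """Start n calls concurrently on one thread and wait for all of them."""
    return await asyncio.gather(*(fake_backend_call(i) for i in range(n)))


def run_event_loop(n: int = 10):
    """Run the fan-out to completion; return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(gather_all(n))
    return results, time.perf_counter() - start
```

`asyncio.gather` returns results in the order the calls were passed in, regardless of which finishes first.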
> If you need multithreading (not multiprocessing), you cannot use Python.

If you need multithreading of Python code for parallelism, you cannot use CPython (and, consequently, can't use Python 3.)

Both IronPython and Jython use native threads without a GIL.

The GIL only restricts utilization of multiple cores. This makes parallelism harder but not concurrency. Python fully supports threads, but you won't find two threads in the same process running at the same time.

Small clarification - you will if they are down in C code, i.e. you can have multiple threads in Python blocking on IO just fine. Or three threads concurrently running C code which releases the GIL until it needs to be re-acquired. Quite a few standard libraries which require significant processing power do exactly this.
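One standard-library case, as a sketch: the hashlib documentation says the GIL is released while hashing data larger than 2047 bytes, so threads hashing big buffers can overlap inside the C code. The chunk size and function names below are arbitrary choices for illustration, and no speedup is measured here.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Four 1 MiB buffers of zero bytes, each well above the documented threshold.
CHUNKS = [bytes(1024 * 1024) for _ in range(4)]


def digest(chunk: bytes) -> str:
    """Hash one buffer; the heavy lifting happens in C inside update()."""
    h = hashlib.sha256()
    h.update(chunk)
    return h.hexdigest()


def digest_in_threads(chunks):
    """Hash every chunk on its own thread and return digests in input order."""
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        return list(pool.map(digest, chunks))
```

Whether this gives a real wall-clock win depends on the interpreter build and the number of cores; the point is only that the threads are not serialized by the GIL while inside the hash update.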
I think you're right, but on a related topic, I've yet to find something that works as easily as Go's select statement in Python.

> Personally, given, I like how Go has things like channels, but I would never adopt a programming language for just one specific feature when I lose out on other features that are valuable to me, for instance, an object model.

Go has an object model.

> Go has an object model.

I'm not sure Go's creators would agree with this. The Go object model is different enough from other object models that it would not qualify as one for a lot of people.

Object models don't all have to be identical. Go doesn't have the same object model as Java, but it still has one. Saying it doesn't is simply wrong.

These are great ideas. I told Valentin to drop by. Because DAS is an aggregator with an expert-system style query language, there is sharing among the Python services for caching. It's the caching that makes async code a good option. Preforking might work against this without significant increase in complexity in order to communicate with a single local cache. Remember that this is a web service for the Large Hadron Collider. Nothing about that is small.