by kyleburton 5152 days ago
This. The biggest difference is in the two cultures: blocking is anathema to the Node.js community - they will literally reject libraries or code that blocks because it destroys the entire model; the JVM community does not value non-blocking code - most of the core (JDBC, Networking in general, File system operations) is all written in a blocking style - the JVM community accepts this with the implicit assumption that threads will help assuage those issues.

Python, Ruby and Perl all have the same cultural tolerance for blocking code. The Node.js community has a complete lack of tolerance for blocking code.

I work with the JVM every day (Clojure) and wish it was different wrt the common use of non-blocking code, but it's going to be a long road to get there on the JVM.

Kyle

4 comments

Java Executors combined with Guava's ListenableFutures easily turn any blocking operation to an asynchronous one.
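A minimal sketch of that idea, assuming Java 8's standard `CompletableFuture` as a stand-in for Guava's `ListenableFuture` so the example needs no extra dependency. `blockingLookup` and `lookupAsync` are hypothetical names; the sleep simulates a blocking call such as a JDBC query.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingToAsync {
    // Hypothetical stand-in for a blocking call (e.g. a JDBC query).
    public static String blockingLookup(String key) {
        try {
            Thread.sleep(50); // simulate blocking I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "value-for-" + key;
    }

    // Submits the blocking call to the pool and returns immediately with a future.
    public static CompletableFuture<String> lookupAsync(String key, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> blockingLookup(key), pool);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // The callback is attached without blocking the caller.
            CompletableFuture<String> upper =
                lookupAsync("user:42", pool).thenApply(String::toUpperCase);
            System.out.println(upper.join()); // prints VALUE-FOR-USER:42
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller never blocks, but one pool thread is still parked for the duration of each call, which is the cost the replies below argue about.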

Netty's entire model is asynchronous, and Java 7 now has AsynchronousChannels for IO which, I assume, Netty will make use of.

All in all, the JVM has a much more solid and performant foundation than anything Node can provide. The whole difference will come down to a programming style preference. I am not entirely sure why Vert.x adopted the Node style rather than the proven servlet container, as I'm sure both styles provide comparable performance. I guess each may shine under different loads/usage patterns (my guess is that Vert.x/Node can squeeze more performance from a single thread, but servlets are more scalable).

There is no async MySQL JDBC driver. If you encapsulate it in an async layer, you need to keep a thread for the connection.
Is that supposed to be bad? The programming style will be the same. If threads are done right, and the JVM can manage their affinity well (especially on NUMA architectures), it's best to use them and pass a relatively small amount of data between them; they can then provide much better performance than accessing the same large piece of RAM from many threads (which is what happens if you simply replicate a single event-loop thread with asynchronous IO).
Maybe I am missing something, but how can you possibly have an async SQL driver without threads like this? This sounds like a case of your Node.js database driver hiding the exact same behaviour described here within C code.
If the wire protocol for the driver is published, then you can write a 100% async driver for it. I.e. no threads blocking, ever. In fact, I already did this for redis and vert.x (I will dig out the code for this some time).

If you are dealing with something where you don't know what the wire protocol is and you just have a blocking client library to play with (e.g. JDBC - JDBC is, by definition, blocking - see the JDBC API), then you can't do much but wrap the blocking API in an async facade and limit the number of threads that block at any one time. This is exactly what we do in vert.x. We accept the fact that many libraries in the Java world are blocking (e.g. JDBC), so we allow you to use them by running them as a worker. This is one area where we differ from node.js. Node.js makes you run everything on an event loop. This is just silly for some things, e.g. long running computations (remember the Fibonacci number affair?), or calling blocking APIs. With vert.x you run "event-loopy" things on the event loop but you can run "non event-loopy" things on a worker. It's a hybrid.
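The hybrid described above can be sketched roughly as follows. This is not vert.x's API; `HybridLoop` and `runBlocking` are hypothetical names, and the pool size of 2 is an arbitrary assumption. A single thread plays the event loop, a bounded pool caps how many threads may block at once, and the result is handed back to the event-loop thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class HybridLoop {
    // One thread plays the role of the event loop.
    private final ExecutorService eventLoop =
        Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
    // Bounded worker pool: at most two threads can be blocked at any time.
    private final ExecutorService workers = Executors.newFixedThreadPool(2);

    // Runs the blocking work on a worker, then handles the result on the event loop.
    public <T> CompletableFuture<String> runBlocking(Supplier<T> blockingWork) {
        return CompletableFuture.supplyAsync(blockingWork, workers)
            .thenApplyAsync(
                result -> Thread.currentThread().getName() + " got " + result,
                eventLoop);
    }

    public void shutdown() {
        workers.shutdown();
        eventLoop.shutdown();
    }

    public static void main(String[] args) {
        HybridLoop loop = new HybridLoop();
        try {
            // The supplier stands in for a blocking library call.
            System.out.println(loop.runBlocking(() -> 42).join()); // prints event-loop got 42
        } finally {
            loop.shutdown();
        }
    }
}
```

With a fixed pool, extra requests queue up behind the two workers rather than spawning more threads, which is the trade-off the next reply objects to.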

A limited number of threads will not scale the way real async wake-on-data connections do. If demand is higher than your thread pool can handle, in the use case where your web response builds on async backend requests, your site will be down.
PostgreSQL's libpq supports nonblocking asynchronous operation, and node-postgres takes advantage of that.

http://www.postgresql.org/docs/9.1/static/libpq-async.html

https://github.com/brianc/node-postgres/blob/master/src/bind...

(notice the Connect method on line 325 of binding.cc)

At some level a client-server database driver isn't all that different from any other network client; you send a request over a socket and wait for a result. There's no reason you have to block while waiting.

Moreover some databases (like Postgres) let you receive asynchronous notifications signaled by transactions on other connections; that's how trigger-based replication systems like Bucardo do their thing.

http://www.postgresql.org/docs/9.1/static/sql-notify.html

Because it would be based on asynchronous socket responses. So you wouldn't iterate like you currently do w/ a ResultSet but rather have a simple "RowHandler" of sorts. However, you still run into the same trouble you do w/ node if you decide to do a lot of blocking work in there instead of just sending the row to some ExecutorService thread to get worked on.
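A rough sketch of what such a "RowHandler" might look like. The interface and the `query` method are hypothetical and not part of JDBC; `query` merely stands in for a driver that pushes rows as they arrive off the socket. The handler does nothing heavy itself and hands each row to an ExecutorService.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class RowHandlerSketch {
    // Hypothetical callback interface; JDBC has no such type.
    public interface RowHandler {
        void onRow(String[] row);
    }

    // Stand-in for an async driver: invokes the handler once per incoming row.
    public static void query(List<String[]> incoming, RowHandler handler) {
        for (String[] row : incoming) {
            handler.onRow(row);
        }
    }

    // The handler only enqueues work; per-row processing runs on the pool.
    public static List<Integer> processAll(List<String[]> rows) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<CompletableFuture<Integer>> pending = new ArrayList<>();
        try {
            query(rows, row -> {
                // Placeholder for expensive per-row work: length of the second column.
                pending.add(CompletableFuture.supplyAsync(() -> row[1].length(), pool));
            });
            List<Integer> results = new ArrayList<>();
            for (CompletableFuture<Integer> f : pending) {
                results.add(f.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"1", "alice"},
            new String[] {"2", "bob"});
        System.out.println(processAll(rows)); // prints [5, 3]
    }
}
```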
Yes, but the lib can do it for you, and you won't have to know a thing.
What are you talking about? libevent invented everything Node.js uses and Python's Twisted had and has everything Node.js could dream of.

Node.js is just a reinvention of old technologies in Javascript.

Edit: Removed flame about Javascript because I don't want to have this debate again.

I am talking about the cultures surrounding these languages and frameworks. Node's community rejects blocking libraries. Java's does not. I've used the non-blocking frameworks in Perl (POE and my own), C (select, and some of the poll variants), Ruby (event machine) and they are fine if you can avoid blocking libraries -- in these communities it is generally acceptable to write blocking libraries. I don't see it as a technical hurdle, I see it as a cultural one.
You need to start backing up your claims with actual data. What networking libraries are blocking in Node.js that are not blocking in Twisted? Moreover, what can't you do with a Twisted Deferred that you can do in Node.js?

There are two main things that block in computing:

* I/O

* CPU

You better believe that Node blocks on CPU, so what I/O does Node not block on that Twisted does?

I am talking about the cultures surrounding these languages and frameworks. Node's community rejects blocking libraries. Java's does not.

So what?

On the one hand, you get tons of Java libs to use, a majority of them blocking and lots of them non-blocking; on the other, you restrict yourself to the fewer non-blocking libs of varying quality available for Node.

With Java you can also turn blocking libs into non-blocking ones with a wrapper and threads, whereas with Node.js, if it's blocking, you're screwed, because the JS engine is single-threaded.

Node is nearing 10k published libraries. Is there a comprehensive site listing Java libraries?

It's trivial to offload blocking operations to other processes in node too, it's just not the preferred option.

Node is nearing 10k published libraries.

And Ruby has 38237 gems. 99.9% of which are garbage.

Library-count is a terrible metric.

Have you ever programmed in Java? That's a strange question to ask if you have.

http://mvnrepository.com/

Whatever libraries node has is a drop in the bucket compared to java.

I haven't, that's why I asked. Node is just 2 years old. You'll get 90% garbage on any open package manager.
You make it sound like "rejecting blocking libraries" was some noble principle. Due to the severe limitations of JavaScript, there is no other choice.
We run a major site on Java, had some thread trouble years ago in the very beginning, works very well now when tuned. Threads work.

BUT: I assume people will move to backend services with REST, combining REST backend results into a page. This increases IO a lot and will kill your latency and the default thread models when you write sync code. You'd need to use async IO and composable futures to manage latency and thread count. And if you do async backend REST, why not do async JDBC etc.? But there are no libraries.
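To illustrate what composing futures for a page might look like, here is a small sketch using Java 8's `CompletableFuture`. The `fetch` method, the service names, and the latencies are made up for the example. The simulated calls still occupy a pool thread while they sleep; a real async HTTP client would avoid that, and only the composition pattern is the point here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ComposePage {
    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical backend call; the sleep simulates network latency.
    public static CompletableFuture<String> fetch(String service, long latencyMillis,
                                                  ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> {
            sleep(latencyMillis);
            return service + "-data";
        }, pool);
    }

    // Starts both backend calls at once and combines them when both complete.
    public static String renderPage(ExecutorService pool) {
        CompletableFuture<String> user = fetch("user", 100, pool);
        CompletableFuture<String> cart = fetch("cart", 100, pool);
        return user
            .thenCombine(cart, (u, c) -> "<page>" + u + "|" + c + "</page>")
            .join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(renderPage(pool)); // prints <page>user-data|cart-data</page>
        } finally {
            pool.shutdown();
        }
    }
}
```

Because both requests are in flight at the same time, the page waits roughly for the slower backend rather than for the sum of both.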

>This. The biggest difference is in the two cultures: blocking is anathema to the Node.js community - they will literally reject libraries or code that blocks because it destroys the entire model

Really? So they reject any kind of library that does anything except call a callback? Because everything else, from calculating 2+2 to creating a template blocks. And it doesn't matter when it happens, when it happens it blocks.