by KwisaksHaderach 1594 days ago
It's fantastic for I/O-heavy 'server'-type code

I have found Node.js frustrating for CRUD APIs. The 'fantastic for heavy I/O' claim, while true, ends up guiding you toward complex microservices (or trying to use external services for everything), because you don't want to slow down the event loop as you add more and more CPU processing to your app.

3 comments

Interesting! I'm not a Node expert specifically, just a general systems programmer, but I might be able to give you some pointers.

Is your application I/O-bound or CPU-bound? I.e. what's the bottleneck, the resource you would have to increase in order to speed it up? Your comment is a bit ambiguous given your remark about adding "more and more CPU processing".

If you're I/O-bound, then you're free to do more CPU processing. If/once you're CPU-bound, there are a few questions:

- Are you using all your cores? Node runs your JavaScript on a single thread by default, so to use every core you may need to run one Node process per core. How well this works obviously depends on how parallelisable your program is. Edit: u/eyelidlessness has given some more Node-specific suggestions that may allow true shared-memory concurrency, or even shared-memory parallelism across several cores.

- Are you able to increase your CPU's clock speed? (This obviously assumes a cloud environment or something similar, where you can easily swap out CPUs. I'm not talking about overclocking.)

- Have you profiled exactly what is using so much CPU? Is there some wasteful computation you can remove? Try `node --cpu-prof` to generate a profile. If you're unfamiliar with analysing profiles, Brendan Gregg's blog is the place to go. This article from the guy who wrote Sled is also a very good longread: https://sled.rs/perf.html

I'd be surprised if you're really using so much CPU in a Node application, at least if it's a typical CRUD one. I'd strongly suspect that you're doing some wasteful computation, either in an algorithm in your business logic or in inefficient JSON parsing. Let me know if you can give any more info :)

Edit: It looks like u/eyelidlessness has given some more Node-specific tips for improving CPU saturation. I'd definitely check out the pointers that he/she gave.

If your workload is actually CPU-heavy, Node’s solutions for this are several:

- worker threads
- child processes/IPC
- native bindings (n-api/gyp/etc)
- WASM/WASI

Maybe so many as to cause choice/research paralysis. If you’re primarily interested in writing JS/TS, worker threads are a great option. And for most use cases you don’t need to worry about shared memory/atomics; postMessage has surprisingly good performance characteristics.
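
For anyone who hasn't used worker threads, here is a minimal sketch of the postMessage round trip. The worker source is passed inline with `eval: true` only to keep it in one file, and `fib` and `fibInWorker` are hypothetical names standing in for whatever CPU-heavy work you actually have. It spawns a fresh worker per call, which is the simplest form, not the cheapest.

```javascript
const { Worker } = require('node:worker_threads');

// Worker code, inlined as a string so the sketch fits in one file.
// In a real app this would usually live in its own module.
const fibWorkerSource = `
  const { parentPort } = require('node:worker_threads');

  // Hypothetical stand-in for CPU-heavy work: naive recursive Fibonacci.
  function fib(n) {
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
  }

  // Receive a number from the main thread, send the result back.
  parentPort.on('message', (n) => {
    parentPort.postMessage(fib(n));
  });
`;

// Runs fib(n) on a new worker thread so the main event loop stays free.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(fibWorkerSource, { eval: true });
    worker.once('message', (result) => {
      resolve(result);
      worker.terminate(); // let the process exit once we have the answer
    });
    worker.once('error', reject);
    worker.postMessage(n);
  });
}
```

Usage would look like `fibInWorker(30).then(console.log)`; the main thread can keep serving requests while the worker computes.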

Worker threads spawn a new JS VM (which implies a new GC) for each worker! We tried it, and the gains from parallelism stopped when there were half as many workers as CPUs.
They do create a new VM isolate. That’s a cost your workload will need to exceed before you get much if any benefit.

As for thread count, my default guidance is 50% of cores if your CPU has SMT/Hyper-Threading, coreCount - 2 otherwise.
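
As a rough sketch, that rule of thumb could be written like this. The function name `defaultWorkerCount` is made up, "cores" is read here as logical cores (which is what `os.cpus().length` reports), and the SMT flag is passed in by the caller because Node's `os` module doesn't tell you whether SMT is enabled. The floor of one worker is my own assumption.

```javascript
const os = require('node:os');

// Starting point for a worker count, per the guidance above:
// half the logical cores with SMT, otherwise cores minus two.
function defaultWorkerCount(logicalCores, hasSmt) {
  const count = hasSmt ? Math.floor(logicalCores / 2) : logicalCores - 2;
  return Math.max(1, count); // assumption: never go below one worker
}

// Example: size a pool for this machine, assuming SMT is enabled.
const suggestedWorkers = defaultWorkerCount(os.cpus().length, true);
```

Treat the result as a default to tune from, not a fixed answer.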

Those numbers can go higher depending on how much of your workload is CPU-heavy. If you have an even mix of compute and IO, for example, your threads will frequently be idle/less contentious (same principle as the single threaded event loop).

And if your workload is a queue of short-lived, isolated steps, I also recommend pre-spawning idle threads (potentially well beyond that active maximum). Pre-warming those isolates can help as well, or isolating with vm APIs (e.g. SourceTextModule) instead.
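
A minimal sketch of the pre-spawning idea, assuming a toy task (summing 1..n) in place of real work. `WorkerPool` is a hypothetical class, not a Node API: it starts all workers up front, hands each one task at a time, and queues the rest. It skips things a production pool needs, such as replacing workers that crash.

```javascript
const { Worker } = require('node:worker_threads');

// Inline worker source; the task is a hypothetical stand-in: sum 1..n.
const sumWorkerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', (n) => {
    let sum = 0;
    for (let i = 1; i <= n; i++) sum += i;
    parentPort.postMessage(sum);
  });
`;

class WorkerPool {
  constructor(size) {
    this.workers = []; // every worker, kept for shutdown
    this.idle = [];    // workers waiting for a task
    this.queue = [];   // tasks waiting for a worker
    // Pay the isolate start-up cost once, before any task arrives.
    for (let i = 0; i < size; i++) {
      const worker = new Worker(sumWorkerSource, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue a task; resolves with the worker's reply.
  run(n) {
    return new Promise((resolve, reject) => {
      this.queue.push({ n, resolve, reject });
      this.dispatch();
    });
  }

  // Pair idle workers with queued tasks until one side runs out.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const task = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // worker is reusable again
        task.resolve(result);
        this.dispatch();
      };
      const onError = (err) => {
        worker.off('message', onMessage);
        task.reject(err); // a failed worker is not returned to the pool
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(task.n);
    }
  }

  // Idle workers keep the process alive, so terminate them when done.
  close() {
    return Promise.all(this.workers.map((w) => w.terminate()));
  }
}
```

With `const pool = new WorkerPool(2)`, three calls to `pool.run(...)` share the two already-running workers, and `pool.close()` shuts them down.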

As with, well, everything: your mileage may vary, it depends on your actual bottlenecks and constraints, as well as your tolerance for tuning.

“More and more CPU processing” doesn’t sound like a CRUD API.
A CRUD API doesn't mean "just talks to the database and nothing more". One simple example among many is how PDF generation can f*ck up your event loop.
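
To make the event-loop point concrete, here is a small sketch that measures how late a 0 ms timer fires when synchronous work gets in its way. `blockFor` is a hypothetical busy-wait standing in for something like rendering a PDF on the main thread; no real PDF library is involved.

```javascript
// Hypothetical stand-in for synchronous CPU-heavy work:
// spin for roughly `ms` milliseconds without yielding to the event loop.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait
  }
}

// Schedules a 0 ms timer, runs `work` synchronously, and resolves with
// how many milliseconds passed before the timer callback actually ran.
function measureTimerDelay(work) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 0);
    work(); // the timer cannot fire until this returns
  });
}
```

Calling `measureTimerDelay(() => blockFor(200))` reports a delay of at least the blocked time, and every other request waiting on the same event loop is held up for just as long.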