Hacker News
by maxpert 3171 days ago
Shared data structure among multiple threads... this sounds utterly familiar and evil! Redis is single-threaded, probably one of the fastest, has different data structures, can handle high loads, code is easy to reason about, something that just works.

One of the reasons Node is successful is the simplicity of single-threaded code. It is way easier to reason about. I would question the use of Node if you are doing something CPU-bound with it. You can use golang or C# with tasks for that.

7 comments

Think outside of web workloads! Not everyone is writing a web app.

Think about something like Delaunay triangulation or mesh refinement. These are critical path bottlenecks for a great many applications and in practice very parallel, but they're irregular so we cannot easily distribute the data structure. The best results we have are for shared memory thread models. We don't know how to do it any other way!

That's why the post you just replied to suggested that Node may not be the right tool for those kinds of tasks.
The problem is that I see a lot of projects written in languages with single-threaded runtimes (Python more often than Node) that become difficult/expensive to scale and extend down the road. I loathe the idea of a rewrite, but sometimes the initial language choice lacked forethought to the point where it makes sense to rewrite in something that actually can make use of all of a machine's processing resources.

Things like greenlet and gevent (and likely napa.js) are band-aids over the underlying problem.

For these workloads, I would consider a compiled language with great parallelism/concurrency, e.g. Rust or Go.
So just because shared memory is hard you are OK with sacrificing performance and replacing memory access with I/O hops? That sounds like overkill and not suitable for every task.
I like Node.js and use it very often (had my first package reach over 100 stars, woohoo :) ) but I don't understand why it needs to be suitable for every task.

If you really want to do something creative with shared memory, I guess you could do that in a "native module" written in C++ or even Rust[1].

I'm not saying that it's not doable with JS, it's just that it's already been done (as in, has a solution that works).

[1]: https://github.com/neon-bindings/neon

Why should I learn a new language for that? It's good to have as many options as possible in JS, and you take the one that fits you best.
Because JS isn't good at everything just like C++ and Rust aren't good at everything.

Right tool for the job.

But if you take that to an extreme, you end up with a hundred tools. I think it's good to have the option to do parallel computing in JS, for those times when it's worth the tradeoff versus having to adopt a completely new language/platform.
Sure, anything taken to an extreme is bad. I wasn't suggesting to do that.

I think if there is a sensible use-case for parallel computing in JS, it would be good to have. However, trying to make a solution before we have a (clear) problem is foolish.

I'm not saying there isn't already a use-case, but I haven't seen one that isn't already covered by languages better suited to solving those problems (e.g. Rust).

Edit to give a different example: parallel computing in JS is like trying to write a web framework in Rust. Sure, you can do it, but Node is already better suited to doing that. At best, you're making a worse version of something that already exists.

We have been down this road before. When you have a lot of options "for those times when", you get a lot of abuse. All good frameworks remove choice to prevent a spiraling string of fuckups by people who don't understand what is going on behind their code. For the few of us who do know what is going on it is not a problem, but you have to consider all of the code monkeys who are going to be using a given framework without supervision. What would happen if we made everyone program enterprise CRUD applications in C++ from scratch? Unmitigated chaos that would lead the business to disavow technology and go back to paper filing cabinets.
When all you have is a hammer...
... everything looks like a nail!
>Why should I learn a new language for that?

You make it sound like it is difficult to learn. Underneath, C++, Java, Pascal, C#, JavaScript and Python have many similarities, and jumping from one of those languages to another in the list is very easy compared, for example, to jumping from any of those languages to Forth, PROLOG, SQL, ML, Haskell, or Lisp.

Some of them are also really similar syntactically, for example this group: [C, C++, Java, C#]; or this other group: [Pascal, Algol, Go], so even the syntax doesn't get in the way when jumping from one to another.

Thus, software engineers usually know more than one language and apply whichever best suits the program.

Because languages are tools, and you should learn to use more than one tool. If you know JavaScript, you basically already know most C-based languages syntactically, so it's very little effort to at least learn one of them for tasks JS isn't suited for.
But what about dealing with FFI, build dependencies and toolchains? Sounds like it is just shifting complexity into a different place, not actually solving it.
Have you read the docs? Just go read this https://github.com/Microsoft/napajs/blob/master/docs/api/mem... and tell me you don't feel uncomfortable. This is the kind of baggage you are bound to get with such solutions, and once you end up writing a steaming pile of code, you then need a thread-safe logging and debugging story.
> Shared data structure among multiple threads... this sounds utterly familiar and evil!

This seems like a bit of FUD.

With multiple threads and shared data, you don't necessarily have to share every data structure with every thread. You can set things up so that little or nothing is shared. That's (also) what access control and immutability are for in programming languages, apart from other features.

Of course, different languages support these features in different ways, I don't want to get into the specifics, but in pretty much all mainstream languages you can create a similar share-nothing or share-almost-nothing design and it's not even hard, it might even be easier.

I really don't understand modern web/JS developers. They seem to ignore traditional solutions and/or proclaim them as evil, and then they go on to employ a 'new' solution that is 3× as complex, performs 5× worse and requires 10× as many dependencies/tools/frameworks/etc. Why? I suspect there's a LOT of largely irrational fear of concepts and languages that are unfamiliar. "Fear driven development" in fashionable lingo.

TL;DR you don't need to be scared of threads, you just need to be scared of threading architectures that share too much.
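
To make the share-almost-nothing idea concrete, here is a minimal sketch in Python, not anyone's production design. The names `worker` and `sum_of_squares` are made up for illustration. Each thread gets its own private copy of a chunk, and the only object the threads share is a thread-safe queue for results. In standard CPython the GIL limits CPU parallelism, so this shows the shape of the design rather than a speedup.

```python
import queue
import threading


def worker(chunk, results):
    # `chunk` is a private copy owned by this thread; nothing else mutates it.
    total = sum(x * x for x in chunk)
    # The queue is the single shared object, and it does its own locking.
    results.put(total)


def sum_of_squares(data, n_workers=4):
    """Split `data` into private chunks, one thread per chunk, then combine."""
    results = queue.Queue()
    size = max(1, (len(data) + n_workers - 1) // n_workers)
    # list(...) copies each slice so no two threads reference the same list.
    chunks = [list(data[i:i + size]) for i in range(0, len(data), size)]
    threads = [
        threading.Thread(target=worker, args=(chunk, results))
        for chunk in chunks
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Exactly one result was queued per thread.
    return sum(results.get() for _ in threads)


if __name__ == "__main__":
    print(sum_of_squares(list(range(10))))
```

The workers never touch each other's data, so there is no lock in user code at all; the same layout carries over to languages whose threads do run in parallel.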

>I really don't understand modern web/JS developers. They seem to ignore traditional solutions and/or proclaim them as evil, and then they go on to employ a 'new' solution that is 3× as complex, performs 5× worse and requires 10× as many dependencies/tools/frameworks/etc.

It is, perhaps, because a significant number of Node.js developers came from front-end-only development and are thus unfamiliar with the traditional approaches (in this case, using threads). An example is the many cases in which a document store such as MongoDB is (wrongly) used for data that is mostly relational.

Simply put, they never were taught the traditional approaches first.

Basically your argument boils down to "it's easier to write single-threaded code than multi-threaded". Well no shit, but the benefit is in many cases colossal, so I'd say that's not a good argument to dismiss this complaint.
> Redis is single-threaded, probably one of the fastest, has different data structures, can handle high loads, code is easy to reason about, something that just works.

Redis probably isn't a great example here. I've worked on projects where a single Redis instance was not enough (would easily peg its single CPU to 100% and have query latency in the multi-second range). In the end, sharding the data among several Redis instances was successful, but also brought its own problems. The ideal is that we just have languages, runtimes, data stores, etc. that abstract these details away from us so we can focus on our application logic, not on how to make it faster.
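
For readers who haven't done it, the sharding described above can be sketched roughly like this. It is an illustration only: `ShardedStore` is a hypothetical name, plain dicts stand in for the separate Redis instances, and the hash-modulo scheme is an assumption, not necessarily what that project used.

```python
import zlib


class ShardedStore:
    """Route each key to one of several independent stores by hashing the key."""

    def __init__(self, n_shards):
        # Each dict stands in for one single-threaded Redis instance.
        self.shards = [dict() for _ in range(n_shards)]

    def _shard_for(self, key):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(key.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def set(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        return self._shard_for(key).get(key, default)


if __name__ == "__main__":
    store = ShardedStore(4)
    store.set("user:1", "alice")
    print(store.get("user:1"))
```

Some of the "own problems" show up even in this toy: an operation that spans keys on different shards (such as a set intersection) has to be assembled by the client, and changing the number of shards moves most keys to a different instance.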

To be fair though, one of Redis's biggest weaknesses is its single-threaded nature in instances where you, e.g., have huge sets and need to compute expensive set intersections/etc...

Redis also might not be the best choice if that's your primary use case... but still.

Once upon a time nobody seriously thought JavaScript would ever play any role outside of the browser, and even the role in the browser was small enough that many people preferred to disable it.

Then we got a very fast JIT, and suddenly you could do reasonably compute-heavy stuff very fast, and then it became viable to also write the server side in JS, because of programmer efficiency, library reuse and other reasons.

The "right tool for the job" can seriously change when tools improve and develop, and just because there already are other tools for the same job should not stop anybody from trying.

I cannot think of a better example of that than JavaScript.