by spasquali 4378 days ago
It's great to remind everyone how Node follows various rules of the Unix Philosophy, and how it is designed to make process spawning/streaming as natural as on the OS.

I would prefer it, though, if the implication weren't that a flaw in Node's design is responsible for the failure of this in-process-memory technique of sorting massive data sets. From the article:

"However, as more and more districts began relying on Clever, it quickly became apparent that in-memory joins were a huge bottleneck."

Indeed...

"Plus, Node.js processes tend to conk out when they reach their 1.7 GB memory limit, a threshold we were starting to get uncomfortably close to."

Maybe simply "processes" rather than "Node processes"? -- I don't think this is a Node-only problem.

"Once some of the country’s largest districts started using Clever, we realized that loading all of a district’s data into memory at once simply wouldn’t scale."

I think this was predictable. Earlier in the article I noticed this line:

"We implemented the join logic we needed using a simple in-memory hash join, avoiding premature optimization."

The "premature optimization" line is becoming something of a trope. It is not bad engineering to think at least as far as your business model. It sounds like reaching 1/6 of your market led to a system failure. This could (should?) have been anticipated.
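For anyone unfamiliar with the technique being debated, here is a minimal sketch of an in-memory hash join. It is not the article's code; the function name `hashJoin`, the sample tables, and the field names are all hypothetical, chosen only for illustration.

```javascript
// Minimal in-memory hash join (inner join), written as an illustration only.
// `hashJoin`, the sample tables, and the field names are hypothetical.
function hashJoin(left, right, leftKey, rightKey) {
  // Build phase: index every row of the left table by its join key.
  const index = new Map();
  for (const row of left) {
    const key = row[leftKey];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(row);
  }

  // Probe phase: look up each right-hand row and emit one merged row per match.
  const out = [];
  for (const row of right) {
    const matches = index.get(row[rightKey]) || [];
    for (const match of matches) {
      out.push({ ...match, ...row });
    }
  }
  return out;
}

const schools = [
  { schoolId: 1, school: 'North' },
  { schoolId: 2, school: 'South' },
];
const students = [
  { studentId: 10, schoolId: 1 },
  { studentId: 11, schoolId: 2 },
  { studentId: 12, schoolId: 3 }, // no matching school, so the inner join drops it
];

const joined = hashJoin(schools, students, 'schoolId', 'schoolId');
console.log(joined);
```

Note that both inputs, the index, and the output all live in the process heap at once, so memory use grows roughly linearly with the size of the data set. That is the scaling problem regardless of which runtime hosts the process.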

1 comment

To some extent we knew that in-memory joins would eventually cause problems, but we were certainly surprised at how quickly Node memory usage became the bottleneck. Here's a little gist I used to test it a while ago: https://gist.github.com/rgarcia/6170213
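The gist itself is not reproduced here. As a rough, hypothetical sketch of this kind of test (not the gist's contents), one could read the V8 heap limit and watch how much heap a batch of plain objects consumes. The row shape and the count of 200000 are arbitrary assumptions, and the numbers printed will vary by Node version and machine.

```javascript
// Rough sketch of a memory test; the row shape and count are arbitrary assumptions.
const v8 = require('v8');

const MB = 1024 * 1024;

// heap_size_limit is the ceiling V8 enforces on this process's heap.
const heapLimitMb = Math.round(v8.getHeapStatistics().heap_size_limit / MB);
console.log(`V8 heap limit: ${heapLimitMb} MB`);

// Allocate a batch of small objects, as a naive in-memory load would.
const rows = [];
const before = process.memoryUsage().heapUsed;
for (let i = 0; i < 200000; i++) {
  rows.push({ id: i, name: `student-${i}`, schoolId: i % 100 });
}
const after = process.memoryUsage().heapUsed;

// The difference is only approximate: garbage collection can run in between.
const growthMb = (after - before) / MB;
console.log(`Heap grew by about ${growthMb.toFixed(1)} MB for ${rows.length} rows`);
```

The heap ceiling can be raised with Node's `--max-old-space-size` flag, but that only postpones the point at which a load-everything approach runs out of room.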

As for your point about premature optimization, in my opinion a startup's first priority is getting something in front of users in order to start improving and iterating. The first version of the data pipeline discussed in the blog post was built when Clever was in 0 schools, so designing it to scale to some of the largest school districts in the country would have been fairly presumptuous.