Hacker News
by ploggingdev 3227 days ago
Interesting post, more so because I am working on a project that has many characteristics of a link aggregator.

While implementing the ranking algorithm, which is very similar to the one mentioned in the article, I decided to run a periodic job every 60 seconds that updates the rank of each submission and stores it in the database. That way, querying the ranked data is cheaper than recalculating the rank on every page request. Are you doing something similar, or did you take a different approach?
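A minimal sketch of such a periodic job, for illustration only. It assumes the widely cited HN-style gravity formula `(points - 1) / (age_hours + 2) ** 1.8`; the formula constants, the `submissions` table, and its `points`, `created_at`, and `score` columns are all hypothetical and not confirmed by the article or this comment.

```python
import sqlite3
import time

GRAVITY = 1.8  # assumed decay exponent, not taken from the article


def rank_score(points, created_at, now):
    """HN-style gravity score: newer and higher-voted stories score higher."""
    age_hours = max(now - created_at, 0) / 3600
    return (points - 1) / (age_hours + 2) ** GRAVITY


def update_scores(conn, now=None):
    """Recompute and store the score of every submission; returns rows updated."""
    now = time.time() if now is None else now
    rows = conn.execute(
        "SELECT id, points, created_at FROM submissions"
    ).fetchall()
    conn.executemany(
        "UPDATE submissions SET score = ? WHERE id = ?",
        [(rank_score(points, created_at, now), sub_id)
         for sub_id, points, created_at in rows],
    )
    conn.commit()
    return len(rows)


def front_page(conn, limit=30):
    """Page requests only read the precomputed score; no ranking math here."""
    cursor = conn.execute(
        "SELECT id FROM submissions ORDER BY score DESC LIMIT ?", (limit,)
    )
    return [row[0] for row in cursor]
```

A scheduler (cron, a background worker, or similar) would call `update_scores` every 60 seconds, and an index on `score` keeps the page query cheap.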

Ranking all stories works well when the total number of submissions is small, but I imagine the approach is a little different for large sites like HN. Ranking every submission periodically seems wasteful, since people rarely view submissions beyond the first 10 pages. One approach is to rank only submissions from the past n days, where n depends on the average daily submission volume.
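One way to pick n, sketched under assumptions: choose the smallest number of days whose average volume fills the pages people actually view. The function names, the 30-per-page figure, and the dict-based submission records are hypothetical.

```python
import math

SECONDS_PER_DAY = 86400


def window_days(avg_daily_submissions, pages=10, per_page=30, min_days=1):
    """Days of history needed to fill `pages` pages at the average volume."""
    if avg_daily_submissions <= 0:
        raise ValueError("avg_daily_submissions must be positive")
    needed = pages * per_page
    return max(min_days, math.ceil(needed / avg_daily_submissions))


def recent_only(submissions, now, n_days):
    """Keep only submissions created within the last n_days."""
    cutoff = now - n_days * SECONDS_PER_DAY
    return [s for s in submissions if s["created_at"] >= cutoff]
```

With 100 submissions a day, `window_days(100)` gives 3 days. In practice a safety margin on top of that may be worthwhile, since an older story with many votes can still outrank newer ones.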

To display the time since a submission was made, I implemented the HN model, which shows only minutes, hours, and days. Python code here: https://dpaste.de/5d1w
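For readers who cannot reach the paste, here is a rough sketch of the same idea. It is not the code behind the link; the function name `time_ago` and the exact wording of the output are assumptions.

```python
def time_ago(elapsed_seconds):
    """Format an elapsed time using only minutes, hours, and days."""
    minutes = max(int(elapsed_seconds), 0) // 60
    if minutes < 60:
        value, unit = minutes, "minute"
    elif minutes < 60 * 24:
        value, unit = minutes // 60, "hour"
    else:
        value, unit = minutes // (60 * 24), "day"
    suffix = "" if value == 1 else "s"
    return f"{value} {unit}{suffix} ago"
```

The value is always rounded down, so 119 minutes shows as "1 hour ago" rather than "2 hours ago".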

> Elyxel was designed and built with performance in mind. Styles and any additional flourishes were kept to a minimum. My choice of Elixir & Phoenix was driven by this consideration as well. Most of the pages are well under 100 kilobytes and load in less than 100 milliseconds. I find it's always helpful to keep performance in the back of your mind when building something.

Once you start to scale, the bottleneck is rarely the application layer. For the typical CRUD web app, it's likely to be the database.

2 comments

Not sure if it's relevant but you can avoid having elapsed time in the formula and instead inflate the scores of the newer stories (https://news.ycombinator.com/item?id=9889750)

Then you only have to recalculate when there is a submission or vote.
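A sketch of how that can work, assuming exponential decay rather than the power-law formula from the article (the linked comment may use a different form). Ranking by `votes * exp(-k * age)` gives the same order as ranking by `log(votes) + k * created_at`, because the `now` term is common to every story and drops out. The half-life, the reference epoch, and the function name are hypothetical.

```python
import math

HALF_LIFE_HOURS = 12.0  # assumed: a story needs 2x the votes per 12h of age
DECAY = math.log(2) / (HALF_LIFE_HOURS * 3600)
EPOCH = 1_500_000_000  # arbitrary reference time that keeps numbers small


def static_score(votes, created_at):
    """Score with no `now` term: newer stories get a permanently larger bonus."""
    return math.log(max(votes, 1)) + DECAY * (created_at - EPOCH)
```

The stored score changes only when `votes` changes, so a periodic re-ranking job is not needed; an index on the score column is enough to serve the front page.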

> Once you start to scale, the bottleneck is rarely the application layer. For the typical CRUD web app, it's likely to be the database.

You seem to be downplaying the importance of an efficient app layer and I respectfully disagree.

1. Using well-optimized tech for the app takes you straight to the DB scaling problems phase, so you never go through the "our app layer is too damn slow" phase -- which can kill a startup pretty damn quick (and history has examples). So, it's a huge win in my eyes.

2. Most apps never even take off to the point where the DB is the bottleneck, so a strong app layer is a godsend; I'd estimate 80% of apps struggle with optimizing backends and caches, not DBs -- at least judging by my 15-year career which, I admit, is anecdotal evidence and doesn't mean much.

3. Those apps that do scale that far are very grateful to their backend engineers who spared them all the "tune our Java / Ruby / PHP / Python VM's memory usage and GC behaviour parameters" dance (and that dance can last for weeks and weeks almost without sleep!).