by jafitc 598 days ago
I think you should consider trimming that file.

Exclude movies with a very low number of ratings, and potentially those with very low scores too.

That would cut the long tail significantly.
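
A rough sketch of the kind of filter I mean. The `Movie` fields, the thresholds, and the sample rows are all made up for illustration; the real file's schema and sensible cutoffs would need checking against the actual data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Movie:
    # Hypothetical schema; the real file may use different field names.
    title: str
    num_ratings: int
    avg_score: float


def trim_long_tail(
    movies: List[Movie], min_ratings: int = 50, min_score: float = 0.0
) -> List[Movie]:
    """Keep only movies with enough ratings and a high enough score."""
    return [
        m for m in movies
        if m.num_ratings >= min_ratings and m.avg_score >= min_score
    ]


# Made-up sample rows, not taken from the real dataset.
movies = [
    Movie("Popular hit", 120_000, 7.9),
    Movie("Obscure short", 12, 6.5),
    Movie("Widely panned", 8_000, 2.1),
]

trimmed = trim_long_tail(movies, min_ratings=50, min_score=3.0)
print([m.title for m in trimmed])  # ['Popular hit']
```

Running something like this once at build time would keep the shipped file small without touching the query code.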

1 comment

I initially loved looking for obscure stuff, e.g. setting the region to the Soviet Union. But it surely is the case that 99% of users want 10% of the data at most. I'll have to work on the ability to select the file, and to download & cache it only when the relevant query actually asks for it.
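
Roughly what I have in mind, as a sketch only. The file names (`core.db`, `long_tail.db`), the `COMMON_REGIONS` set, and the injected `fetch` callable are all hypothetical; the real split and download mechanism are still to be decided.

```python
from typing import Callable, Dict, Optional

# Hypothetical: regions assumed to be covered by the small, common file.
COMMON_REGIONS = {"US", "GB", "FR"}


def file_for_query(region: Optional[str]) -> str:
    """Pick which data file a query needs (file names are placeholders)."""
    if region is None or region in COMMON_REGIONS:
        return "core.db"
    return "long_tail.db"


class LazyFileCache:
    """Download a file the first time it is requested, then reuse it."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch          # injected so the download can be swapped out
        self._cache: Dict[str, bytes] = {}
        self.downloads = 0           # counts actual fetches, for illustration

    def get(self, name: str) -> bytes:
        if name not in self._cache:
            self._cache[name] = self._fetch(name)
            self.downloads += 1
        return self._cache[name]


def fake_fetch(name: str) -> bytes:
    # Stand-in for a real network download.
    return f"contents of {name}".encode()


cache = LazyFileCache(fake_fetch)
cache.get(file_for_query("US"))  # downloads core.db
cache.get(file_for_query("FR"))  # served from cache
cache.get(file_for_query("SU"))  # obscure region: downloads long_tail.db
print(cache.downloads)  # 2
```

Most users would only ever pull the small file; the big one is fetched once, and only for people who go looking for the obscure stuff.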