by aristus 6426 days ago
When you are working on big problems, it's easy to let yourself get stuck on some unimportant decision. Usually it's a sign that you are unsure about something more important and don't want to think about it.

If you just want to run an experiment on 10M pages, then use whatever you feel comfortable with. The important thing is NOT files vs. SQL but whether your classification idea is worth spending time on. Who cares if it's inefficient? That's not what your experiment is about.