by lmm 4431 days ago
When I worked at last.fm, code that assumed all the data would always be in a single postgres database was a constant source of pain - we spent a lot of time migrating tables out of the big central database, either because the data simply didn't fit any more, or because we wanted it to be available to a Hadoop job. (There were probably other reasons, but those are the ones I remember.) Maybe last.fm's an extreme case, but it does happen.
1 comments

Wouldn't a single abstraction over postgres, filesystem, hadoop, etc. be either really leaky or really inefficient? Different datastores are better suited to certain kinds of queries. It seems like the programmer should be aware of what he/she is querying.
You invert the dependency. The abstraction is over the things that the higher-level code needs. I don't need to know about query types, indexes, etc. I need some business answer (all log records between x and y, a user matching username x), so I program to an interface that provides all the answers the high-level code needs.

The implementation of that interface is data store aware and implements the interface in the most effective way possible for the data store holding the things I'm interested in.
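
To make that concrete, here is a rough sketch in Python. Everything in it is made up for illustration: `LogRepository`, `records_between`, `count_events`, and the two implementations are hypothetical names, and log records are assumed to be simple `(timestamp, message)` pairs. The point is only that the high-level function asks a business question, while each implementation answers it in whatever way suits its store (binary search over a sorted list in one case, an indexed SQL range query in the other).

```python
import sqlite3
from bisect import bisect_left, bisect_right
from typing import Protocol


class LogRepository(Protocol):
    """Hypothetical interface: what the high-level code needs, nothing more."""

    def records_between(self, start: int, end: int) -> list[tuple[int, str]]:
        ...


class InMemoryLogRepository:
    """Keeps records sorted by timestamp and answers with binary search."""

    def __init__(self, records):
        self._records = sorted(records)
        self._timestamps = [ts for ts, _ in self._records]

    def records_between(self, start, end):
        lo = bisect_left(self._timestamps, start)
        hi = bisect_right(self._timestamps, end)  # inclusive upper bound
        return self._records[lo:hi]


class SqliteLogRepository:
    """Answers the same question with an indexed SQL range query."""

    def __init__(self, records):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE log (ts INTEGER, message TEXT)")
        self._db.execute("CREATE INDEX log_ts ON log (ts)")
        self._db.executemany("INSERT INTO log VALUES (?, ?)", records)

    def records_between(self, start, end):
        rows = self._db.execute(
            "SELECT ts, message FROM log WHERE ts BETWEEN ? AND ? ORDER BY ts",
            (start, end),
        )
        return [(ts, message) for ts, message in rows]


def count_events(repo: LogRepository, start: int, end: int) -> int:
    """High-level code: depends only on the interface, not on the store."""
    return len(repo.records_between(start, end))
```

Calling `count_events` with either repository gives the same answer for the same data; moving a table out of the central database would then mean writing a new implementation rather than touching the callers.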