by syn0byte 2537 days ago
While everyone haggles about the internals, I have an interesting anecdote about the costs of tools and libraries.

A small service that landed in my lap needed read-only access to a data source that was roughly 10k lines of YAML. No way that was going to be in any way efficient, so I asked for suggestions. All the workaday devs (I am not one) instantly said the same thing without a single real thought about it: make it a database, duh!

Long story slightly less long: loading up the libraries to interface with a database ate between 2 and 3 times the memory (depending on the DB and lib) that simply loading the entire 10k-line YAML file did, offered slower performance, and required more code.

SQLite was pretty darn close, but the machinery meant to save developers from themselves (parameterized queries, or the need for queries altogether, for that matter) increased the required code for zero benefit.
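To make the "more code" point concrete, here is a rough sketch, not the actual service. It assumes the YAML has already been parsed into a plain dict; the record names, fields, and function names are all made up for illustration. It does the same read-only lookup twice: once against the in-memory mapping and once through Python's standard `sqlite3` module with a parameterized query.

```python
import sqlite3

# Hypothetical records standing in for the parsed YAML (names are made up).
RECORDS = {
    "alpha": {"host": "10.0.0.1", "port": 8080},
    "beta": {"host": "10.0.0.2", "port": 9090},
}


def lookup_in_memory(name):
    """Read-only lookup against the already-loaded mapping."""
    return RECORDS.get(name)


def build_db(records):
    """Copy the same records into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE services (name TEXT PRIMARY KEY, host TEXT, port INTEGER)"
    )
    conn.executemany(
        "INSERT INTO services (name, host, port) VALUES (?, ?, ?)",
        [(n, r["host"], r["port"]) for n, r in records.items()],
    )
    conn.commit()
    return conn


def lookup_in_sqlite(conn, name):
    """Same lookup, now via a parameterized query."""
    row = conn.execute(
        "SELECT host, port FROM services WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    return {"host": row[0], "port": row[1]}
```

The dict path is one line. The SQLite path needs a schema, a load step, and row-to-dict glue before it returns the same answer, which is the overhead described above for a small, read-only data set.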

The service still hums along with a 10k-line YAML file in memory. "Worse is better" indeed.

3 comments

You could have improved it with a protobuf or a JSON file then, but with the size of your data set, it shouldn't really matter what you're using.

It can be hard to beat an in-memory data structure when your data set is small enough, true.

Well, SQLite might have been more scalable if you needed to deal with much larger YAML files. I've been in a similar situation with XML. It's fine to hold in memory until you end up running out of memory, at which point you need to look at other approaches. There's definitely an overhead to changing, but it might be worth paying if the tradeoffs make sense.
The quickest change is no change, naturally.

However, "better" is dependent on the situation. If the data is unlikely to change, stay the course. If not, a database might be more maintainable over the long term, even if it is not as efficient.