Hacker News
by nostromo 4569 days ago
The disconnect between your comment and the article is that the term "startup" now covers both giant companies like Airbnb and tiny two-person companies that haven't yet created an MVP.

I think this article is targeted at the latter: pre-MVP and just post-MVP. For those startups, having two databases with one dedicated to a backend analytics system reeks of premature optimization.

4 comments

Having taken this path from a 2-person company running a "nobody cares about this" app to an app with some decent traction, there are always other things more worthwhile to do in the early stages than trying to get insights from the invariably small amounts of data in the DB. A 2-person company should be talking to users rather than trying to analyze patterns from database tables. The sample size is just too small, and you will be evolving so fast that the trends will be almost meaningless.
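To put a rough number on the sample-size point, here is a minimal sketch. It assumes simple random sampling and the normal approximation for a proportion; the user counts and the 10% conversion rate are made up for illustration.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical user counts: the same observed 10% conversion rate
# is far less certain with 40 users than with 40,000.
for n in (40, 400, 40_000):
    moe = margin_of_error(0.10, n)
    print(f"n={n:>6}: 10% conversion, +/- {moe * 100:.1f} points")
```

With 40 users the interval is roughly as wide as the number being measured, which is why talking to those users tends to tell you more than the tables do.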

If you grow to be more than 2 people, then taking a mirror dump of the prod DB to run queries against takes pretty trivial effort.

Think of it this way: You really ought to have a failover/disaster recovery copy of your database, and an easy way of making offsite backups.

Set up a slave of your MySQL or Postgres database (no, the slave is not a backup, but dumping it regularly is an easy first step to a very basic backup setup), hosted in a different data centre. If you're a two-person company, you now also have somewhere to run analytics whenever you feel ready.

It can be <1 hour effort and a few tens of dollars a month in extra costs for a small system, yet makes a tremendous difference in resilience and gives you that db to run analytics against "for free" whenever you do need it.
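As a rough sketch of the "dump the slave regularly" step, assuming Postgres with `pg_dump` on the PATH: the host name, database name, user and backup directory below are all hypothetical, and credentials are assumed to come from something like `~/.pgpass`. This is one way to do it, not the only one.

```python
import datetime
import shlex
import subprocess
from pathlib import Path


def build_dump_command(replica_host: str, dbname: str, user: str,
                       out_dir: Path, today: datetime.date) -> list[str]:
    """Build a pg_dump command that writes a dated custom-format dump."""
    out_file = out_dir / f"{dbname}-{today.isoformat()}.dump"
    return [
        "pg_dump",
        f"--host={replica_host}",   # point at the slave, not at prod
        f"--username={user}",
        "--format=custom",          # compressed, restorable with pg_restore
        f"--file={out_file}",
        dbname,
    ]


def dump_replica(replica_host: str, dbname: str, user: str, out_dir: Path) -> None:
    """Run the dump; raises CalledProcessError if pg_dump fails."""
    cmd = build_dump_command(replica_host, dbname, user, out_dir,
                             datetime.date.today())
    subprocess.run(cmd, check=True)


# Hypothetical values; this only prints the command, it does not run it.
print(shlex.join(build_dump_command(
    "replica.example.internal", "app", "backup",
    Path("/backups"), datetime.date(2013, 1, 2))))
```

Run `dump_replica` from cron on the slave's host and ship the resulting file offsite, and you have the very basic backup setup described above.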

"having two databases with one dedicated to a backend analytics system reeks of premature optimization."

It's free software, you don't have to pay for two instances of Oracle.

One thing that will quickly kill a biz is combining the functions of PROD and DEV/TEST. Making the DEV/TEST box the DEV/TEST/REPORTS box is not a big deal, and you can't run a (real) biz without a DEV/TEST box.

Having run the technical side of ~5 "small" businesses now (no more than $1.5M revenue), I disagree.

Eventually, nothing can match the performance of storing binary blobs on a cluster. But that only becomes worthwhile if your database is significantly larger than a terabyte. And I'm only talking about the operational "core" database, not your "data warehouse" (the log dumping ground, which should be split off when your database gets to be a few dozen gigs).

Meanwhile, MySQL has big advantages:

1) can do basic optimization with "ALTER TABLE", even (mostly) live.

2) you can mix PROD and DEV/TEST (though obviously you need to use good judgement). Obviously you should also have a DEV/TEST instance for actual testing. Sometimes you want to run a test quickly against PROD though. Adding a slave, having it sync and then running against the slave is a joy.

3) creating reports is quick, customizable and everything you want.

4) It's "idiot-friendly". Employees can ramp up to the structure in a mysql db in 2 weeks flat. Try that with custom document stores.

5) It's type-safe and relationally safe (if correctly designed), with the advantages that brings: significantly less weirdness in the database.

6) phpMyAdmin. MySQL Workbench. Django. PHP ...

I'm even going to argue that the GP's argument, that running analytics on PROD can get you fired, is not just wrong, it's actually an advantage of using mysql. (And the open source SAP database can run "live" analytics. You just can't believe how great that is for dashboards)

True - but the backend analytics system can be Excel :) which it often is. In fact in my "two person startup" (that was alive happily for 5 years) this is exactly what I did :). I do take your point, however. I still think it's wise to not have an admin backend that runs rollups on your live system (which we've all done).
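For what it's worth, the "rollup off the live system, analysis in Excel" approach can be sketched like this. It uses `sqlite3` purely as a stand-in so the example runs on its own; in practice the connection would point at the slave or a restored dump. The `users` table and `signup_date` column are hypothetical.

```python
import csv
import io
import sqlite3


def daily_signups_csv(conn: sqlite3.Connection) -> str:
    """Roll up signups per day and return them as CSV text for Excel."""
    rows = conn.execute(
        "SELECT signup_date, COUNT(*) FROM users "
        "GROUP BY signup_date ORDER BY signup_date"
    ).fetchall()
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["signup_date", "signups"])
    writer.writerows(rows)
    return buf.getvalue()


# Stand-in for the copy of the database, with a few made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, signup_date TEXT)")
conn.executemany(
    "INSERT INTO users (signup_date) VALUES (?)",
    [("2013-01-01",), ("2013-01-01",), ("2013-01-02",)],
)
print(daily_signups_csv(conn))
```

The point is only where the query runs: the GROUP BY hits the copy, and prod never sees it.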