Hacker News new | ask | show | jobs
by sdegutis 2793 days ago
If you have a relatively small set of users, setting up your own database is usually as simple as setting it up locally and you won't need shards or anything. And setting up backups is as simple as adding a cron job that calls your backup shell script, which you can test separately. And by "small set of users", consider what SQLite's own website[1] says:

    Generally speaking, any site that gets fewer than 100K
    hits/day should work fine with SQLite. The 100K hits/day
    figure is a conservative estimate, not a hard upper bound.
    SQLite has been demonstrated to work with 10 times that
    amount of traffic.
If SQLite is able to comfortably handle 100k hits/day, I imagine that more "legitimate" databases can handle more traffic comfortably without needing to jump to scale horizontally.

[1] https://www.sqlite.org/whentouse.html under "Websites" section

4 comments

The real benefit to having someone else manage your DB is that it eliminates the "unknown unknowns." I don't want to spend the requisite time becoming an expert DB sysadmin--I'd rather let someone else do it so that I can sleep at night. Also, databases are in a different category of risk. Misconfigure an nginx config? No big deal, fix it and move on. Set up your database incorrectly, resulting in data loss down the road? Could be game over.
SQLite doesn't really have concepts like replication (HA) or concurrent writers.

Notably, the SQLite website is (as far as I can see) read-only. So it's great if all you need is a SQL read API atop your structured data (and 100k hits/day is probably only limited by the filesystem/os since SQLite isn't a server). But you're setting yourself up for headaches by using SQLite if you need simultaneous read/writes combined with HA.

For small user counts, performance is the easy part. Failover and point-in-time restores are common examples for me that contain easily overlooked details and you don't find out until the worst possible time.

I think some cloud stuff is overpriced, but RDS easily pays for itself in my case.

Agree sqlite is great and >80% of websites will probably run fine on it, but 100K hits/day is pretty vague, does that mean 1 hit/sec or 3 hits/sec during peak time, etc...?
The next paragraph gives a little more context:

    The SQLite website (https://www.sqlite.org/) uses SQLite
    itself, of course, and as of this writing (2015) it handles
    about 400K to 500K HTTP requests per day, about 15-20% of
    which are dynamic pages touching the database. Dynamic
    content uses about 200 SQL statements per webpage. This
    setup runs on a single VM that shares a physical server
    with 23 others and yet still keeps the load average below
    0.1 most of the time.
Even if that's clarified, it's vague. It doesn't entail how a hit translates to database operations.

That said, I think it's more meant to be an anecdotal rule of thumb to tell people "you're not Google, SQLite will work for most teams".

It also doesn't specify a use-case. In a 98% read scenario with a good caching strategy it can easily do much more than 100k visitors per day. If you're taking in data from many devices you can easily bottleneck on writes.

It really depends. Also, configuring everything right gets hard. Most don't even think to do RAID over a few block storage devices, but that's something that comes with cloud storage. That doesn't count HA and other issues before getting to the application layer.

It's something that unless you're paying a full-time DBA, you are probably better off buying as a service. It's one of the few holes in DO's offerings and I'm very happy to see this.