by nerdywordy 2792 days ago
I for one have always liked the simplicity of DO for hosting but I've never wanted to take on the full liability of self-rolling a DB server (and backups and replicas). So everything my company has I've put on heroku or azure. This has potential to be really significant as I'd wager there are a lot of folks in similar situations.
4 comments

I am definitely on that particular boat. I'll add one more point:

- Not very willing to set up DBs for several projects, both paid and hobby, because it's a fixed time sink. And before anybody tells me "but with this script it can take 2 minutes!" please don't forget that to learn to use your magical script I have to learn a few other things beforehand. (Although admittedly that's most likely a small time investment.)

For me it's mainly because I'm hopping around from database to database depending on the project and I can't keep everything in my head.

The more "best practices" I can automate the better since it reduces cognitive load on my poor noggin.

Agreed. It's why I think we programmers should eventually just settle for 5-10 languages globally and not touch anything else -- so [together with all the other problems that still exist] we can also finally get to writing the one UltimateDataMapper™ library that can work with whatever is out there.

I seriously can't be bothered to set up yet another cool and young database promising me quantum entanglement teleportation anymore. It's what is stopping me from trying 99.9% of what I see on the net.

I have a few scripts to which I just pass a DB name / user / pass and it brings me up (or tears down) a Postgres or MySQL/Maria database. I'd do the same for Elastic and a few others if I wasn't so lazy about it for years now.
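
A minimal sketch of what such a bring-up/tear-down helper might generate for Postgres, assuming a script that only builds the SQL and leaves execution to `psql` or a driver. The function names (`bring_up_sql`, `tear_down_sql`) and the strict identifier rule are illustrative assumptions, not the commenter's actual scripts.

```python
import re

# Assumption: only simple lowercase identifiers are accepted, so they can be
# inlined into SQL without quoting. Anything else is rejected outright.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def _check_ident(name: str) -> str:
    """Return the name unchanged if it is a safe identifier, else raise."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def _quote_literal(value: str) -> str:
    """Quote a string literal by doubling single quotes.

    This relies on standard_conforming_strings being on (the Postgres default).
    """
    return "'" + value.replace("'", "''") + "'"


def bring_up_sql(db: str, user: str, password: str) -> list[str]:
    """SQL statements that create a login role and a database it owns."""
    db, user = _check_ident(db), _check_ident(user)
    return [
        f"CREATE ROLE {user} LOGIN PASSWORD {_quote_literal(password)};",
        f"CREATE DATABASE {db} OWNER {user};",
    ]


def tear_down_sql(db: str, user: str) -> list[str]:
    """SQL statements that drop the database first, then the role."""
    db, user = _check_ident(db), _check_ident(user)
    return [
        f"DROP DATABASE IF EXISTS {db};",
        f"DROP ROLE IF EXISTS {user};",
    ]
```

The database is dropped before the role because Postgres refuses to drop a role that still owns objects.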

Even if you do have magic scripts and understanding for every possible scenario, you still have to deal with getting woken up to ascertain which scenario(s) you are in and run them.
Yep. One thing that the managed services give you is exactly that peace of mind you mention. Plus the fact that they are much better at fine tuning security, availability and performance settings than myself.
If you have a relatively small set of users, setting up your own database is usually as simple as setting it up locally and you won't need shards or anything. And setting up backups is as simple as adding a cron job that calls your backup shell script, which you can test separately. And by "small set of users", consider what SQLite's own website[1] says:

    Generally speaking, any site that gets fewer than 100K
    hits/day should work fine with SQLite. The 100K hits/day
    figure is a conservative estimate, not a hard upper bound.
    SQLite has been demonstrated to work with 10 times that
    amount of traffic.
If SQLite is able to comfortably handle 100k hits/day, I imagine that more "legitimate" databases can handle more traffic comfortably without needing to jump to scale horizontally.

[1] https://www.sqlite.org/whentouse.html under "Websites" section
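
As a rough illustration of the "cron job that calls your backup script" idea above, here is a sketch that assumes a Postgres database and `pg_dump` on the PATH. The paths, the database name and the 03:00 schedule are placeholders, and nothing here covers testing restores, which is the part people tend to skip.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def backup_command(db: str, out_dir: Path, now: datetime) -> list[str]:
    """Build a pg_dump invocation that writes a timestamped custom-format dump."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = out_dir / f"{db}-{stamp}.dump"
    return ["pg_dump", "--format=custom", f"--file={target}", db]


def run_backup(db: str, out_dir: Path) -> None:
    """Create the output directory if needed and run pg_dump, raising on failure."""
    out_dir.mkdir(parents=True, exist_ok=True)
    cmd = backup_command(db, out_dir, datetime.now(timezone.utc))
    subprocess.run(cmd, check=True)


# Hypothetical usage from a script that cron runs nightly:
#   run_backup("mydb", Path("/var/backups/postgres"))
#
# Hypothetical crontab entry (every day at 03:00):
#   0 3 * * * /usr/bin/python3 /opt/backup/backup.py
```

Keeping `backup_command` separate from `run_backup` is what makes the script easy to test on its own, as the comment suggests.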

The real benefit to having someone else manage your DB is that it eliminates the "unknown unknowns." I don't want to spend the requisite time becoming an expert DB sysadmin--I'd rather let someone else do it so that I can sleep at night. Also, databases are in a different category of risk. Misconfigure an nginx config? No big deal, fix it and move on. Set up your database incorrectly, resulting in data loss down the road? Could be game over.
SQLite doesn't really have concepts like replication (HA) or concurrent writers.

Notably, the SQLite website is (as far as I can see) read-only. So it's great if all you need is a SQL read API atop your structured data (and 100k hits/day is probably only limited by the filesystem/os since SQLite isn't a server). But you're setting yourself up for headaches by using SQLite if you need simultaneous read/writes combined with HA.

For small user counts, performance is the easy part. Failover and point-in-time restores are common examples for me that contain easily overlooked details and you don't find out until the worst possible time.

I think some cloud stuff is overpriced, but RDS easily pays for itself in my case.

Agree sqlite is great and >80% of websites will probably run fine on it, but 100K hits/day is pretty vague, does that mean 1 hit/sec or 3 hits/sec during peak time, etc...?
The next paragraph gives a little more context:

    The SQLite website (https://www.sqlite.org/) uses SQLite
    itself, of course, and as of this writing (2015) it handles
    about 400K to 500K HTTP requests per day, about 15-20% of
    which are dynamic pages touching the database. Dynamic
    content uses about 200 SQL statements per webpage. This
    setup runs on a single VM that shares a physical server
    with 23 others and yet still keeps the load average below
    0.1 most of the time.
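
To put rough numbers on the figures quoted above, here is a back-of-the-envelope calculation. It only produces averages over a full day; the quoted text says nothing about peak load, so the peak-time question stays open.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day: float) -> float:
    """Average rate per second for a daily total."""
    return per_day / SECONDS_PER_DAY


def sql_statements_per_day(requests_per_day: int, dynamic_percent: int,
                           statements_per_page: int) -> int:
    """Daily SQL statements, given the share of requests that hit the database."""
    return requests_per_day * dynamic_percent * statements_per_page // 100


# The "100K hits/day" rule of thumb, averaged over the day.
rule_of_thumb = per_second(100_000)                 # about 1.16 hits/s

# The sqlite.org figures: 400K-500K requests/day, 15-20% dynamic,
# about 200 SQL statements per dynamic page.
low = sql_statements_per_day(400_000, 15, 200)      # 12,000,000 statements/day
high = sql_statements_per_day(500_000, 20, 200)     # 20,000,000 statements/day

print(f"{rule_of_thumb:.2f} hits/s on average")
print(f"{per_second(low):.0f} to {per_second(high):.0f} SQL statements/s on average")
```

So the site's own numbers work out to somewhere around 140 to 230 SQL statements per second on average, almost all of them presumably reads.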
Even if that's clarified, it's vague. It doesn't say how a hit translates to database operations.

That said, I think it's more meant to be an anecdotal rule of thumb to tell people "you're not Google, SQLite will work for most teams".

It also doesn't specify a use-case. In a 98% read scenario with a good caching strategy it can easily do much more than 100k visitors per day. If you're taking in data from many devices you can easily bottleneck on writes.

It really depends. Also, configuring everything right gets hard. Most don't even think to do RAID over a few block storage devices, but that's something that comes with cloud storage. That doesn't count HA and other issues before getting to the application layer.

It's something that unless you're paying a full-time DBA, you are probably better off buying as a service. It's one of the few holes in DO's offerings and I'm very happy to see this.

I was literally going to spend the weekend testing latency between the US DO data centers and VMs on Azure and AWS just to see if any were pretty reasonable (consistently under 10ms) so I could use DO for my application and Azure or AWS for the DB hosting and management. This is incredibly great timing.
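
A quick way to run that kind of cross-provider check might be to time TCP connects from a droplet to the database endpoint. This is only a sketch: connect time is a proxy for round-trip latency rather than query latency, and the host and port in the usage comment are placeholders.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int, samples: int = 5,
                   timeout: float = 2.0) -> list[float]:
    """Time several TCP connects to host:port, returning milliseconds per attempt."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=timeout):
            pass  # connection established; close it immediately
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings: list[float]) -> dict[str, float]:
    """Median and worst case, since 'consistently under' is about the tail."""
    return {"median_ms": statistics.median(timings), "max_ms": max(timings)}


def consistently_under(timings: list[float], budget_ms: float = 10.0) -> bool:
    """True only if every sample stayed under the budget."""
    return max(timings) < budget_ms


# Hypothetical usage (placeholder endpoint):
#   t = tcp_connect_ms("db.example.com", 5432, samples=20)
#   print(summarize(t), consistently_under(t))
```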
...then why not just use AWS's Lightsail instead of DO so you can use AWS RDS without worrying about latency? (Or the equivalent for Azure or GCP.)

Really asking, bc for next personal projects I was getting ready to abandon DO and use AWS Lightsail + RDS.

Because spending $25-50/month on side projects that may sit for a while before I pick them up again is one thing. Spending hundreds a month is another. DO is significantly faster at any price point than what you get from Lightsail or AWS EC2. Beyond that, I'd be more inclined to use Azure's data services simply because I like the interfaces better and they're far less hassle to get started with. I'd probably use SQL minimally and lean on Azure Tables quite a bit.

For a DO only solution, I'd probably use SQL more and then rely on their blob (s3 compatible) storage for some bits. I have a few small projects that I've been getting anxious to finally work on, and don't want to spend a bunch of money on them in the interim.

In case of modest growth, I also don't want to be hamstrung doing DB operations work for something that isn't actually making me money... I'll split things up to save a bit in the nearer term so long as I can have a migration path.

If I did split with data in azure/aws and the apps in DO, I might go all azure later, or I might go all DO and take on the DB operations side... it depends on if/how things grow.

DO is cheaper for the servers. AWS is nice, but expensive. If you have the money, use it. If you are bootstrapping or willing to take on challenges yourself, DO-like environments will save you a lot.
...yeah, but if best value for money would be the absolute goal, DO itself is quite expensive compared to Hetzner or Linode last time I compared. Others like OVH or Scaleway could be even cheaper.
DigitalOcean and Linode are currently the same price, what $5 buys you on DO gets you the same specs on Linode.

Hetzner is not a reasonable choice for servers IMO, it's akin to hosting in a datahole in Dallas, expect mixed bandwidth quality and questionable policies when issues arise. Comparatively, OVH looks stable.

Scaleway doesn't take abuse reports seriously FYI, these attacks are still primarily coming from IPs announced by their ASN, over a year after this article was written: https://badpackets.net/ongoing-large-scale-sip-attack-campai...

> Hetzner is not a reasonable choice for servers IMO, its akin to hosting in a datahole in Dallas, expect mixed bandwidth quality and questionable policies when issues arise.

Really surprised to hear that, can you please elaborate? I only heard good things about them up until now...

> Comparatively, OVH looks stable.

What kind of stable do you mean? Bandwidth, latency, average I/O ops, CPU load?

That heavily depends on where you're operating from. Ignoring the latency problems of using Hetzner et al. for a moment (if you're based in the US): increasingly, as the Internet fractures along lines of very distinctly separate legal structures nation to nation (or region to region), Hetzner, OVH and Scaleway are not going to be viable choices for most organizations in the US, particularly as it pertains to production environments, and until or unless they get proper US facilities.

If I physically operate in the US and I base my servers in the EU, then I open myself up to not only US but also EU jurisdiction and compliance in a myriad of ways. It's an entirely unnecessary additional burden in exchange for a discount on infrastructure (which is rarely the biggest cost in anything these days).

I have no intention of ever complying with GDPR for example, unless I'm running a very large organization. Not because I disagree with most of GDPR, rather, because I'm going to comply with US laws, as that's my legal jurisdiction and those are the laws I'm governed by.

Hosting with Scaleway, OVH or Hetzner is a big jurisdiction mistake in most cases for smaller US organizations, just as it would be to arbitrarily host in Japan or China or Brazil (ie foreign locales with entirely different laws).

...being physically in the EU, but building stuff that has the potential to have 80% of the customers in the US, as long as traffic to end-users in the US is good (it usually is unless you care about low latency for gaming, or real-time video bandwidth for video chat), I'd be in the opposite camp and see no reason to pick US-only hosts (DO has Amsterdam datacenters though, and AWS and Azure have EU regions too).

For everyone EXCEPT the US-based businesses, being multi-jurisdiction from the get-go is the default, you know. And for any small or freshly created projects GDPR compliance is pretty easy. EU's new copyright laws though... those are an abomination, hope it changes before they start being enforced. Nowadays EU and US are probably equally horrible and competing at being the most horrible with respect to restricting internet freedoms.

What I'm actually looking for is hosting services that are outside of BOTH US and EU for some more side project ideas that risk falling on the wrong side of IP laws (US's DMCA and all are horrible too btw...). Something that would be both run by a non-US and non-EU company and with datacenters physically outside this space. Something in the Middle East, SE Asia or Russia could have decent bandwidth to the rest of the civilized world and at the same time be blessed with the capability to delay/ignore/misfile etc. requests from US and EU authorities, giving you a time buffer to do damage control if things really hit the fan, while serving end users in those regions. Maybe after Brexit even the UK could become a nice place with more freedom too.

OVH has a Montreal datacentre... 9.46ms in additional latency from NYC
I wrote a review (https://ayesh.me/amazon-lightsail-review) of Lightsail when it came out. Although the specs are ok on paper, their network is slow. I still would like to switch to Lightsail because I already use Route53 and CloudFront, but I wouldn't go with them given their network speeds.
The performance on Lightsail is bad compared to DO - CPU, network, disk access are all much slower at similar price levels.
Can you explain more? Why do you think that running your own DB instance is such overhead? Have you ever tried running your own, with mysql/mariadb for example?
Mostly because databases are the key piece of data-persistence infrastructure. Spinning up a MySQL db to dev against, or a single server for a hobby project, is quick and easy.

In production, all of a sudden you have a lot of work to do, especially around HA. Figure out replication, get it working, figure out how to monitor/alert if it stops working, figure out failover, figure out how to test that failover actually works, etc.

Support around that stuff has improved over the years, but it's still non-trivial and high-risk to DIY. It's a very different scenario than a stateless app server where you can have easy redundancy.

For me it isn't even this complicated. I just don't want to have to manage OS/MySQL/Postgres version upgrades or worry about having to troubleshoot when things on the server go south.

Just give me a database to connect to and take my money, please.

I think you mean, shut up and take my money :)
timdev2 sums up my original sentiment quite well. Is spinning up and maintaining a DB architecture doable? Of course... But there are so many complexities involved for real production that it would greatly slow us down.

If we had staff to dedicate directly to this then it wouldn't be an issue. But paying for a managed service that gives us production grade data access is a no-brainer for any non-trivial application we build.

and unless you're one of the elite few for whom good HA is actually easy, your attempts will likely end up with something that you _believe_ is as available as RDS
For a lot of situations, MySQL with backups and a hot standby is fine.
It's not about when things are going well... anyone here can set up a DB instance and run against it. It's failover/HA or recovery options that are not considered by most. Not everyone can afford to take half a day or more to set up backups, or read mirrors, or failover, or other clustering options. Not to mention actual recovery modes.

I'm happy to pay a few dollars a month for someone else to automate.

It's not just the database itself, it's everything else you have to do to do it right. How will you do backups and replication? How will you recover? What are the ideal configuration settings for your particular situation? What about authentication, roles, open ports, allowed ip address ranges, etc.
Yep, that's my main motivation for not even attempting it. There are a ton of security settings alone. I am not gonna be the idiot who will let a 14-year old bored script kiddie into my VPS-es so they can have a laugh at my expense.

I prefer the managed services. Besides the high availability, you get tons of best security practices -- all for a few bucks a month. It's usually such a crazy cheap deal it's almost not fair to them.

Running your own DB is a recipe for disaster UNLESS you know what you are doing and can invest resources continuously in keeping it up. If you have a good DBA, it probably isn't too much of a hassle as long as you scale up with the number of DBs and you have automation to help along the way. However, at that point you are almost at a hosted solution anyway.

For hobby projects, a single self-managed instance is fine. For production in a business critical environment, much more thought needs to go in.