by shubhamjain 1055 days ago
I applaud Colin's (Tarsnap Founder) attitude. Sure it could be priced better. Sure it could make much, much more than it currently does. But I dislike the notion that every software company needs to optimize for the same things. Tarsnap is a service that I am sure makes a comfortable amount of money for its founder and has remained faithful to its initial audience. Why does anything other than that matter? Yes, some things Patrick points out are indeed low-hanging fruit, but I believe it's a conscious decision to completely ignore all the "optimisation" aspects.

Tarsnap doesn't even have any tracker on its homepage. Tarsnap has had the same basic pricing structure for the past ten years. It does one thing and does it well. I hate the pursuit of growth and everything that comes as a result: bloat, shiny landing pages, A/B testing, conversion rate optimisation.

Reminds me of the parable of the Mexican fisherman.

> “Afterwards? Well my friend, that’s when it gets really interesting,” answered the tourist, laughing. “When your business gets really big, you can start buying and selling stocks and make millions!”

> “Millions? Really? And after that?” asked the fishermen.

> “After that you’ll be able to retire, live in a tiny village near the coast, sleep late, play with your children, catch a few fish, take a siesta with your wife and spend your evenings drinking and enjoying your friends.”

> “With all due respect sir, but that’s exactly what we are doing now. So what’s the point wasting twenty-five years?” asked the Mexicans.

2 comments

> It does one thing and does it well

From the posts I've read recently it seems like it does one thing and it does it by renting a single EC2 server that will bring the service down if it needs to reboot, and it does it by reselling S3 at 10x the cost.

It's funny because maybe it's a good service but going by HN, it's not reliable or cost effective.

> but going by HN, it's not reliable

That's a cheap shot. It's been as reliable as the underlying fabric. The only thing that really stood out for me is how utterly weird HN is when it comes to determining what constitutes reliability: no data was lost other than a tiny bit that was in inbound transit, which can still be recovered (and which you could not realistically protect against). Note that this is a backup service and not something that is normally found in your primary business processes. As such, if it stores the crown jewels safely, allows for them to be restored if and when needed, and doesn't leak them in the meantime, that's mission accomplished.

> or cost effective.

That depends on your use case, and not everything is about cost. The way it is set up, I think what matters is the trust factor: even Colin can't read your data, and there will always be a way to get your data back out if you should need it. Backups that don't work are a net negative; a backup that does work can be, given the right circumstances, absolutely priceless.

Reliability is important for a backup service. If your machine explodes and you need to restore from backups, but the backup service is down, you need to wait and may lose money due to the outage (SLA, unhappy customers, no ability to onboard new customers etc.). If you’re doing weekly backups, but the backup service was down during the backup slot, and your crontab setup doesn’t yell at you and doesn’t retry until it succeeds, you might lose two weeks’ worth of data if disaster strikes.

Yes, reliability is important. And by that measure Tarsnap is 100% reliable. But not 100% available, and that's something that often gets confused. Having to wait while you are trying to restore a backup would be extremely annoying but that implies that you've done something wrong in your planning: if you expect your backup service to be 100% available then you are probably not engineering things right because for many reasons that might not be the case. Tarsnap does not promise 100% availability, and no other backup service that I'm aware of does. For instance, Backblaze offers 11 (!) nines reliability but only 3 nines availability (which is pretty much expected).

If you want more than 3 nines availability neither Backblaze nor Tarsnap nor any other outside service would be able to serve your needs.
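
To put a number on "3 nines", here is a minimal sketch of the arithmetic. The function name is made up for illustration, and it assumes availability is measured over a 365-day year.

```python
def allowed_downtime_hours(nines: int, period_hours: float = 365 * 24) -> float:
    """Hours of downtime permitted per period at the given number of nines.

    Assumption: "N nines" means an availability of 1 - 10**-N,
    e.g. 3 nines = 99.9%.
    """
    unavailability = 10 ** -nines
    return period_hours * unavailability


# Print the yearly downtime budget for a few availability levels.
for n in (2, 3, 4):
    print(f"{n} nines: {allowed_downtime_hours(n):.2f} hours/year")
```

So a service that is down for the better part of a working day once a year can still be within a 3 nines budget.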

I think it's very hard to run a service by yourself of this magnitude reliably, but I'd always take a 99.9% availability daily backup service that runs right at SLO over one that's down for a day once in a blue moon.

Also, parent is talking about ingestion. If your backups aren't configured well and the backup process fails, then your backup may not end up durable.
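
The failure mode described upthread (a cron job that neither retries nor yells) is something the client side can guard against. A rough sketch, assuming the backup is launched as an external command; `run_with_retries`, its parameters, and the script path in the usage comment are hypothetical and not part of Tarsnap or any other tool.

```python
import subprocess
import time
from typing import Callable, Sequence


def _run_command(command: Sequence[str]) -> int:
    """Run the backup command and return its exit code."""
    return subprocess.run(list(command)).returncode


def run_with_retries(
    command: Sequence[str],
    attempts: int = 5,
    base_delay: float = 60.0,
    run: Callable[[Sequence[str]], int] = _run_command,
    sleep: Callable[[float], None] = time.sleep,
    alert: Callable[[str], None] = print,
) -> bool:
    """Retry a backup command with exponential backoff; alert if it never succeeds."""
    for attempt in range(1, attempts + 1):
        if run(command) == 0:
            return True
        if attempt < attempts:
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    # Every attempt failed: make noise instead of failing silently.
    alert(f"backup failed after {attempts} attempts: {' '.join(command)}")
    return False


# Hypothetical usage from a cron-driven script:
# run_with_retries(["/usr/local/bin/run-backup.sh"], alert=send_email_to_oncall)
```

The `run`, `sleep`, and `alert` hooks are injectable so the wrapper can be tested without running a real backup or waiting for real delays.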

I also don't think your definition of reliable is generally recognized, which I'd generally call durability. I wouldn't say the scenario above is a durability failure, but an example of the consequences of poor availability.

> I think it's very hard to run a service by yourself of this magnitude reliably, but I'd always take a 99.9% availability daily backup service that runs right at SLO over one that's down for a day once in a blue moon.

That's a fallacy right there. Your assumption should be that any service you rely on will be down once in a blue moon, and possibly for a day or even longer.

> Also, parent is talking about ingestion. If your backups aren't configured well and the backup process fails, then your backup may not end up durable.

Yes, indeed, you need to do your work and you don't get to point at others for not doing it right.

> I also don't think your definition of reliable is generally recognized, which I'd generally call durability.

Reliability, durability and availability are all industry terms and have very clear definitions. These are not the same definitions that you would use in ordinary conversation with laypeople but when we're talking shop those are definitely allowed.

> I wouldn't say the scenario above is a durability failure, but an example of the consequences of poor availability.

No, it is a consequence of poor engineering on the part of the user of the service, and is a completely different issue. You engineer your service to ensure that your assumptions hold true and if you fail at doing that your service will fail. When is then only a matter of time and combination of circumstances, but fail it will.

That's a funny definition of "reliable". I'd factor availability into reliability. If I Uber to work and every time an Uber picks me up it gets me to my destination with 100% success but once a week no Ubers are available, is that a reliable mode of transportation? Would my boss not shout at me to find a more reliable way to get to work?

Eric Brewer's calling, and would like a word.

Availability and correctness are fundamentally opposed. The word "reliable" is contextual.

A backup service that is always available but serves up garbage is not as reliable as one that serves me the correct data, but only on Mondays.

Sure, but if you took Uber every day and after several years none was available for just one day, your boss would forgive you and consider Uber reliable. If it suddenly had a lot of failures you would be told to find a new way, but everyone has a few days per year they can't get to work (often sick).

You are talking about a service that appears to have had a single documented outage event over the span of eleven years.

> it does it by reselling S3 at 10x the cost.

GitHub resells a free product with a fancy UI. Stripe resells Visa and Mastercard by adding a 5x surcharge to card transactions. Steam resells Stripe by adding a 30x markup on that (it doesn't, it uses Worldpay, but the point stands). Calendly resells an open calendar for $12/month.

This is a reductive argument that doesn't really show why people pay for services. Tarsnap doesn't resell S3 at a 10x markup, it sells a backup service for $0.25/GB/month.

That said,

> it does it by renting a single EC2 server that will bring the service down if it needs to reboot

Yeah, and honestly it's pretty unbelievable that there's not _two_ servers.

> at 10x the cost

And that's comparing to S3 Standard. Infrequent Access is 2x cheaper than that, and Glacier Instant Retrieval 6x (if your files aren't tiny).
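
A back-of-the-envelope sketch of what those ratios imply, using only figures quoted in this thread: the $0.25/GB/month Tarsnap price, the "10x" claim, and the "2x" and "6x" factors. The resulting per-GB numbers are implied by those claims and are not taken from an AWS price list.

```python
TARSNAP_PER_GB_MONTH = 0.25  # price quoted upthread
STANDARD_MARKUP = 10         # the "10x" claim, relative to S3 Standard

# How many times cheaper each tier is than S3 Standard, per the comment above.
CHEAPER_THAN_STANDARD = {
    "S3 Standard": 1,
    "S3 Infrequent Access": 2,
    "S3 Glacier Instant Retrieval": 6,
}


def implied_prices() -> dict[str, float]:
    """Per-GB monthly storage price implied by the thread's ratios."""
    standard = TARSNAP_PER_GB_MONTH / STANDARD_MARKUP
    return {tier: standard / factor for tier, factor in CHEAPER_THAN_STANDARD.items()}


for tier, price in implied_prices().items():
    multiple = TARSNAP_PER_GB_MONTH / price
    print(f"{tier}: ~${price:.4f}/GB/month, Tarsnap is ~{multiple:.0f}x that")
```

This only compares raw storage and ignores request, retrieval, and bandwidth charges on either side.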

True enough, but you do get a bit more than just storage from Tarsnap. S3 provides storage and an API for uploading things to it, but you can’t just back up your files to S3. You have to figure out what files have changed, compress them, encrypt them, and index them so you can retrieve them again later, etc, etc. That’s a non-trivial amount of software to write. It’d be really great if someone had already done so… Oh look, someone did!
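
To give a feel for just the first of those steps ("figure out what files have changed"), here is a minimal sketch that compares content hashes against an index from the previous run. The function names and the index format are assumptions for illustration. It does no compression, encryption, or upload, and it is not how Tarsnap or any particular tool works.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's contents, read in blocks to bound memory use."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()


def changed_files(root: Path, previous: dict[str, str]) -> tuple[list[str], dict[str, str]]:
    """Return (new or modified relative paths, fresh index) for everything under root."""
    current: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            current[path.relative_to(root).as_posix()] = file_digest(path)
    # A file needs backing up if it is new or its digest differs from last time.
    changed = [name for name, digest in current.items() if previous.get(name) != digest]
    return changed, current
```

Even this toy version leaves out deletions, renames, permissions, and files that change mid-read, which is part of why the full job is a non-trivial amount of software.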

I could imagine purchasing that software for a one-time price, running it myself, and paying AWS for storage. But then I’d have to monitor it, troubleshoot outages, maintain things, etc, etc. Or I could pay someone else to do all of that. I’m not currently a customer, but I know which I prefer.

Several people have written that software. Duplicati, Borg, Arq for example.

> But then I’d have to monitor it, troubleshoot outages, maintain things, etc, etc.

None of these solutions free you from having to monitor your backups, including Tarsnap. Tarsnap requires setup on your server. You have to make sure it's running and backing up the correct files. And you really should verify you can restore a backup.

I'm really not sure what Tarsnap adds over these aside from saving you from having to sign up for B2 or S3 and punching in an API key.

Tarsnap performs asymmetric encryption which lets you perform automated backups without needing to enter any passwords (or otherwise storing your encryption passwords in plain text).

Tarsnap does full deduplication across all backups for any given "machine", while still letting you independently remove any snapshots you like. i.e. no special "full snapshot" that must always be kept around, and no need for multiple full snapshots that have no deduplication between them.
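
The general idea (every snapshot is just a list of references into a shared pool of chunks, so any snapshot can be deleted independently) can be sketched with a reference-counted, content-addressed store. This is a toy illustration and not Tarsnap's actual implementation: the `DedupStore` class, the fixed-size chunking, and the in-memory storage are all simplifying assumptions.

```python
import hashlib


class DedupStore:
    """Toy content-addressed chunk store with per-chunk reference counts."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}         # digest -> chunk data
        self.refcounts: dict[str, int] = {}        # digest -> references from snapshots
        self.snapshots: dict[str, list[str]] = {}  # snapshot name -> ordered digests

    def add_snapshot(self, name: str, data: bytes, chunk_size: int = 4) -> None:
        if name in self.snapshots:
            raise ValueError(f"snapshot {name!r} already exists")
        digests = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if no earlier snapshot already holds it.
            self.chunks.setdefault(digest, chunk)
            self.refcounts[digest] = self.refcounts.get(digest, 0) + 1
            digests.append(digest)
        self.snapshots[name] = digests

    def delete_snapshot(self, name: str) -> None:
        for digest in self.snapshots.pop(name):
            self.refcounts[digest] -= 1
            # Drop a chunk once no remaining snapshot references it.
            if self.refcounts[digest] == 0:
                del self.refcounts[digest]
                del self.chunks[digest]

    def restore(self, name: str) -> bytes:
        return b"".join(self.chunks[d] for d in self.snapshots[name])
```

With snapshots `b"aaaabbbbcccc"` and `b"aaaabbbbdddd"`, only four chunks are stored, and deleting the first snapshot removes just the `cccc` chunk while the second still restores in full. No snapshot is a special "full" one.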

Restic backs up to S3 and it’s quite simple to set up and monitor.

> It's funny because maybe it's a good service but going by HN, it's not reliable or cost effective.

This is a very unfair take, based on basically nothing but the single recent outage report, it seems. Tarsnap is generally liked by HN and if you use it, you will know why. It's a great service technically, and _extremely_ affordable. I was a happy user for years but have moved to local Time Machine backups with B2 offsite replication just because it's seamlessly integrated into my NAS (and is also very affordable).

Except that’s not what being a fisherman is like at all.

Neither is it like that to be a multimillionaire. Rare is the multimillionaire who "retires" to a village and enjoys a siesta. Usually, they want to move even more money (which is fine).

> which is fine

I disagree. I think the world would be a better place if many millionaires had decided to retire instead of trying to extract as much wealth as possible from others.