by michaelmcmillan 1906 days ago
I've built a complex CRM that handles 2.1 million USD in transactions every year. It runs on SQLite with a simple in-memory LRU cache (just a dict) that gets purged whenever a mutating query (INSERT, UPDATE or DELETE) is executed. It is very simple and more than fast enough.
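
A minimal sketch of the caching scheme described above, not the author's actual code. The class name `CachedDB`, the cache key (SQL text plus parameters), and the prefix check for mutating statements are all assumptions made for illustration. The dict here has no eviction, so it is a plain cache rather than a true LRU.

```python
import sqlite3


class CachedDB:
    """Hypothetical read-through cache in front of SQLite, purged on any write."""

    # Statement prefixes treated as mutating, as listed in the comment above.
    MUTATING = ("INSERT", "UPDATE", "DELETE")

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.cache = {}  # plain dict keyed by (sql, params); no eviction

    def execute(self, sql, params=()):
        if sql.lstrip().upper().startswith(self.MUTATING):
            # Commit on success, roll back on error, then drop every cached result.
            with self.conn:
                self.conn.execute(sql, params)
            self.cache.clear()
            return []
        key = (sql, tuple(params))
        if key not in self.cache:
            # Cache miss: run the query once and remember the rows.
            self.cache[key] = self.conn.execute(sql, params).fetchall()
        return self.cache[key]
```

Purging the whole cache on every write is coarse, but it means a cached read can never be stale, which is likely why so simple a design holds up.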

Friendly reminder that you shouldn't spend time fine tuning your horizontal autoscaler in k8s before making money.


Do you mean I don't need Go microservices talking gRPC deployed in multiple kubernetes clusters with bash script based migrations via GitOps with my hand made multi cloud automation (in case we move clouds) following all the SCRUM practices to ship working software?

Mindblowing.

You will for your blog series that you mention prominently on your resume that gets you your next Senior Architect gig.
Agh! That's the catch... what am I gonna give talks about, and what do I write on Medium then? Now I get it. Thanks!
> my hand made multi cloud automation (in case we move clouds)

Aren't these symptoms of a deeper problem? Many product managers I talk to want me to build something that is as flexible as possible and solves all problems for everyone, everywhere. Go microservices with gRPC and Kubernetes feel like the only high-level technical decisions I can make in light of such information. :)

> Friendly reminder that you shouldn't spend time fine tuning your horizontal autoscaler in k8s

Oh, but most companies need this!

The biggest feat of microservices (which require ways to manage them, like k8s) was to provide the ability for companies to ship their organizational chart to production.

If you don't need to ship your org chart and you can focus on designing a product, then you can go a long way without overly complicating your architecture.

I think this org chart issue has been overstated.

If you have two services that are completely orthogonal, then combining them into a single application for deployment and operation purposes can be limiting. Developers of all people should understand the benefits of decoupling.

There are a lot of downsides that come with jamming a whole lot of unrelated functionality into the same deployment unit. It's similar to the problems created by global variables. Once you start to depend on your services operating in a monolith, you can easily create rigidity that can be difficult to roll back.

It's not that you can't design a monolith well, but microservices force you to consider important boundaries as opposed to simply violating them for the sake of convenience. It's another "human" issue, but it's unrelated to org charts.

This doesn't mean that everything should be a microservice, though. Good architectures address the requirements of the systems being developed.

I think the deployment benefits have been even more overstated.

This is an absolutely enormous trade-off between logical modularity and runtime operational complexity.

If your devs aren't good enough to enforce modular design in a single application, what makes you think they're good enough to handle complex distributed systems?

> If your devs aren't good enough to enforce modular design in a single application, what makes you think they're good enough to handle complex distributed systems?

That's a simplistic take on the reasons that monoliths tend towards breaking modularity.

What makes me think managing microservices is perfectly viable is that the last three companies I've been at have been able to do it successfully, with a typical mix of good and less good developers, including cheap offshore devs.

As for handling "complex distributed systems," in many cases all that's really needed is something like a managed container platform. Services like Fargate or Cloud Run, or managed Kubernetes, can do a good job of this. Developers can deploy and publish new services with minimal effort, and most of the operational complexity is managed by the platform.

You do ideally want someone paying attention to overall architecture to avoid obvious pitfalls, such as effectively doing distributed joins via REST calls, and so on. This isn't that hard to understand, though, and teams that don't do the necessary architecture upfront tend to figure it out once they run into those problems themselves.

How do you ensure data is not lost to oblivion if a catastrophic system failure occurs?
You put in place a loss mitigation strategy. This strategy will vary by application. In my case, I have a similar setup where we write 25-30k records to SQLite daily. We start each day fresh with a new SQLite db file (named yyyy-mm-dd.db) and back it up to AWS S3 daily under the scheme /app_name/data/year/month/file. You could say that's 9 million records a year, or 365 mini SQLite dbs containing 25-30k records each. Portability is another awesome trait of SQLite. Then, at the end of each week (after 7 days, that is), we use AWS Glue (PySpark specifically) to process that week's database files and create a Parquet file (snappy compression), which is then imported into Clickhouse for analytics and reporting.

At any given point in time, we retain 7 years' worth of files in S3. That's approx. 2275 files for under $10/month. Anything older is archived into AWS Glacier, all while the data is still accessible within Clickhouse. As of right now, we have 12 years' worth of data. Hope it helps!
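
The naming scheme above can be sketched roughly as follows. The helper names (`db_filename`, `s3_key`, `snapshot`) are hypothetical, zero-padded months and the missing leading slash in the key are assumptions, and the actual upload to S3 is left out. The snapshot step uses the standard library's `sqlite3.Connection.backup`, which may or may not be how the commenter copies the file.

```python
import sqlite3
from datetime import date


def db_filename(day: date) -> str:
    """Daily database file name in yyyy-mm-dd.db form."""
    return f"{day:%Y-%m-%d}.db"


def s3_key(app_name: str, day: date) -> str:
    """Object key following app_name/data/year/month/file (padding assumed)."""
    return f"{app_name}/data/{day:%Y}/{day:%m}/{db_filename(day)}"


def snapshot(src_path: str, dest_path: str) -> None:
    """Copy a SQLite database to dest_path using SQLite's online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # produces a consistent copy of the source database
    finally:
        dest.close()
        src.close()
```

For example, `s3_key("crm", date(2021, 3, 7))` yields `crm/data/2021/03/2021-03-07.db`; the snapshot file would then be handed to whatever S3 client the application uses.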

This sounds interesting. Have you thought of doing a talk or blog article about it?

p.s., I run the SF Bay Area ClickHouse meetup. Sounds like an interesting topic for a future meeting. https://www.meetup.com/San-Francisco-Bay-Area-ClickHouse-Mee...

I'd be interested in hearing more about this design.
Backups.
Does that mean it's okay for your application to lose transactions (which occurred between the backup point and the failure point), or do you have other mitigations?
I'm the author of Litestream, which is an open-source tool for streaming replication for SQLite. That could be a good option if you need to limit your window for data loss. We have a pretty active Slack if you need help getting up and running. https://litestream.io/
I guess 2 seconds of daily downtime is worth it when it reduces costs from, say, $2000/month to $20/month.
Many banks still “shut down” for hours every night to do backups.
I’m not anywhere near the banking industry but from HN alone I’ve been led to believe dailyish huge file transfers are also the norm in a variety of situations (aka SQLite’s backup strategy).
Isn't that how all backups work? If you need to prevent data loss then backups probably aren't your tool of choice. And if you're paranoid about data loss then any replication lag is also unacceptable.

* I'm worried about my server blowing up: Transactions have to be committed to more than one DB on separate physical hosts before returning.

* I'm worried about my datacenter blowing up: Transactions have to be committed to more than one DB in more than one DC before returning.
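
As a rough illustration of "committed to more than one DB before returning", here is a sketch in which local SQLite connections stand in for databases on separate hosts. The function name `replicated_write` is hypothetical. This is not a real replication protocol: if a commit fails after an earlier replica has already committed, the replicas diverge, which is the problem two-phase commit or a consensus protocol exists to solve.

```python
import sqlite3


def replicated_write(conns, sql, params=()):
    """Apply one write to every replica; return only after all have committed."""
    try:
        for conn in conns:
            conn.execute(sql, params)  # opens an implicit transaction per replica
        for conn in conns:
            conn.commit()
    except sqlite3.Error:
        # Undo uncommitted work everywhere; already-committed replicas are
        # unaffected (a known gap in this sketch, see the note above).
        for conn in conns:
            conn.rollback()
        raise
```

The caller gets control back only once every replica has acknowledged the write, so each transaction pays the latency of the slowest replica. That is the cost of having no window for data loss.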