by nawgszy 2173 days ago
Well, this was maybe a bit of an entry-level description.

Q: "databases are often the worst-performing part of the tech stack" - compared to what? nginx throughput? I find this a bit of a strange view; surely business logic is always the slowest part of the tech stack.

Q: "What happens if that int should actually be a float?" - how often do you actually need to run migrations versus just extending the data? On my end, I have a small idempotent schema-maintenance tool, and if I need a new column or a new table there's no need for a migration. You know your whole stack will interact with the new schema and the old schema identically, assuming you set sane defaults etc. I've built a lot of medium-quality, low-traffic apps, so I'm yet to encounter a real-world case where a migration wasn't just bad planning.
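
For illustration, a tool like that could look roughly like the sketch below. It uses Python's built-in sqlite3; the `ensure_column` helper and the `users` table are made up for the example and are not the actual tool mentioned above.

```python
import sqlite3


def ensure_column(conn, table, column, decl):
    """Add `column` to `table` only if it is missing, so re-running is safe.

    `table`, `column` and `decl` must come from trusted code, not user
    input, because identifiers cannot be bound as SQL parameters.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

# The first call adds the column; the second call is a no-op.
added_first = ensure_column(conn, "users", "plan", "TEXT NOT NULL DEFAULT 'free'")
added_again = ensure_column(conn, "users", "plan", "TEXT NOT NULL DEFAULT 'free'")

# Code written against the old schema keeps working unchanged,
# and both old and new rows pick up the default.
conn.execute("INSERT INTO users (name) VALUES ('bob')")
rows = conn.execute("SELECT name, plan FROM users ORDER BY id").fetchall()
```

The default is what lets old inserts keep working: they never mention the new column, and the database fills it in.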


>I'm yet to encounter a real-world case where a migration wasn't just bad planning

That's basically the root of the issue. Poor planning in your code means you re-write some code. Poor planning in your database means you have to start restructuring data, and if it's already running in production you have to hope you don't accidentally corrupt production data. It's a lot harder to restore corrupted data in production than it is to roll back a code deployment. And the answer to the problem is obviously just spending more time thinking about the proper data structure, which is the entirety of my complaint: I want my data to fit my application, I don't want to have to write my application to fit my data. I don't want to have to think "does this field belong in the Users table or the Accounts table or the [insert table here]".

I'm not sure what you mean by "just extending the data"... if I'm writing a Rails app and I need to change an int to a float, the way I do that is by writing and executing a migration.

As for the speed... a database typically stores its data on disk and is often not hosted on the same physical machine as the web server. Meanwhile the app and web server keep a lot of things in memory on the local server, and even when they have to read from disk, it's a local disk attached to that machine. Check these numbers for how long it takes to read from memory (or even local disk) versus reading a remote disk over the network: https://gist.github.com/jboner/2841832
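
To put rough numbers on it, here is a small sketch of the arithmetic. The figures are the approximate ones listed in that gist (2012-era, order-of-magnitude only), and the `slowdown` helper is just for the example.

```python
# Approximate latencies in nanoseconds, as listed in the gist linked above.
LATENCY_NS = {
    "main memory reference": 100,
    "read 1 MB sequentially from memory": 250_000,
    "round trip within same datacenter": 500_000,
    "read 1 MB sequentially from disk": 20_000_000,
}


def slowdown(slow, fast):
    """Return how many times slower `slow` is than `fast`."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


# One network round trip to a DB host vs. one local memory reference.
round_trip_vs_memory = slowdown(
    "round trip within same datacenter", "main memory reference"
)

# Reading 1 MB from spinning disk vs. reading it from memory.
disk_vs_memory_read = slowdown(
    "read 1 MB sequentially from disk", "read 1 MB sequentially from memory"
)
```

With those figures, a single round trip inside the datacenter costs about as much as 5,000 memory references, and a 1 MB sequential disk read is about 80 times slower than the same read from memory.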

> If it's already running in production you have to hope you don't accidentally corrupt production data

Or you use a read replica to transform your data into a non-live DB and validate it before you put it into production, with backups of your final old schema available? Plus I really don't think a migration is that hard. Much harder is having a litany of shitty backends you have to glue together in your front-end app; trust me as someone who's done both.
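
A rough sketch of that workflow, assuming the int-to-float change from upthread. Two in-memory SQLite databases stand in for the read replica and the non-live target, and the `items` table, `price` column and `validate` function are hypothetical names for the example.

```python
import sqlite3

# Stand-in for a read replica: in a real setup this would be a connection
# to a replica of production, never the live primary.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, price INTEGER)")
source.executemany("INSERT INTO items (price) VALUES (?)", [(5,), (12,), (40,)])

# Non-live target that already has the new schema (price is now a float).
staging = sqlite3.connect(":memory:")
staging.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, price REAL)")

# Transform: copy every row across, converting the int to a float.
rows = source.execute("SELECT id, price FROM items").fetchall()
staging.executemany(
    "INSERT INTO items (id, price) VALUES (?, ?)",
    [(row_id, float(price)) for row_id, price in rows],
)


def validate(source, staging):
    """Check that every row survived and that values are unchanged floats."""
    old = source.execute("SELECT id, price FROM items ORDER BY id").fetchall()
    new = staging.execute("SELECT id, price FROM items ORDER BY id").fetchall()
    if len(old) != len(new):
        return False
    return all(
        o_id == n_id and isinstance(n_price, float) and n_price == o_price
        for (o_id, o_price), (n_id, n_price) in zip(old, new)
    )


# Only cut over to the new database if this comes back True.
ok = validate(source, staging)
```

Nothing touches live data until the check passes, and the old schema is still sitting there if it doesn't.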

I mean, I hear you, persisting data is hard, but that's not the database's fault. It's because you only get to pick two of three with data: performance, persistence, and flexibility.

> a database typically

But that's not the "worst performing" part of the stack, that's the highest-latency part of the stack. Is there some specific reason you cannot have co-located web and DB servers? Also, is there a reason you still reference 2012 disk numbers when SSDs have clearly reduced all "disk" operations by an order of magnitude?

> just extending the data

What I mean by this is: if you start with a minimal number of columns in your database, and someone says "we need new property X", it's easy to add X outside a migration. Adding new columns or new tables does not require a migration as long as you don't modify existing columns.

So this is my approach. Use as few bits as possible to persist your data, and then you can generally add new features migration-free