by phtrivier 777 days ago
> (Really, though, it doesn't matter. Your software stack is almost certainly not going to decide the life or death of your business.)

This is the part that's going to sting.

I suspect, given the general admiration for Paul Graham on YC, many people subscribe (if only unconsciously, and at least to some degree) to the idea that using tech X (where X would be Common Lisp in the case of PG, but everyone will insert their own pet tech here) can _by itself_ make your startup successful.

Whereas the sad truth is that choosing the wrong tech can definitely _kill_ your shop, but choosing the "right" one will not ensure its survival...

3 comments

The key here though is that choosing a bleeding edge tech is more likely to be a problem because it's fairly untested.

I can tell you what ALL the problems with rails are. For the most part they won't even start to bite you until you hit scaling problems. By the time you hit scaling problems with Rails you can probably afford to pay engineers to solve the scaling problems, and/or port off at that point.

I love sveltekit and use it a bunch for my own personal projects, but it's too immature to recommend to others. Instead I mostly point them at Next.js if they want a javascript stack and Rails if they don't. I have created and maintained apps based on both platforms for 6 and 16 years respectively and know exactly what I'm recommending to people.

I was big on the Rails scene when it was huge in 2008, and saw a big exodus from it. I found so many of the "Rails" problems were solved by learning two languages: ruby and sql. People would go to crazy lengths to avoid looking at the queries they'd actually run. I can admit to not really learning ruby as a language by itself, and can now see how much better my old code would've been if I hadn't been blindly shoehorning Railscasts in there.

Similar problems for people who learned angular but not typescript, laravel but not php, etc.

When software engineers start a business, it's easy for them to hyperfixate on technical decisions because those are the types of problems they understand well enough to optimize.

Yep. The problem with the story of bikeshedding is that it assumes people know bike sheds but not nuclear reactors. There are plenty of cases of people who are great at nuclear reactor design but have no idea how workers should park their bicycles.

I've seen a really interesting pattern play out a few times now.

A startup needs to raise money and hitches onto the latest tech stack that grabs investors' attention. The startup raises on a valuation inflated based partly on the tech stack itself, meaning enough attention isn't given to the actual product or business model. Ultimately the startup runs into money problems when they can't live up to the tech stack hype and the valuation that went with it.

I saw this first hand with a startup picking PlanetScale early on. There's nothing wrong with the product and it solves certain problems really well, but I saw one particular startup grab it early because it was getting a lot of attention and completely miss that the limitations of PlanetScale ran smack in the face of what the startup wanted to build.

what limitations? PlanetScale has public companies running on top of it. If a startup found limitations then it's a skill issue.

This was when foreign keys still weren't supported by PlanetScale. The specific data model was heavily relational and queries were very inefficient without foreign key support. Distributed writes were also important, and if I remember right those weren't supported either, though that's a really common limitation.

Assuming a particular tool works for all situations because it works for some is a mistake though. Plenty of companies use all kinds of tools, they're picked to match the specific use case and there is no magic bullet.

> The specific data model was heavily relational and queries were very inefficient without foreign key support.

I can't help but wonder if you're conflating the notion of foreign keys (and the usefulness of having them be indexed) with foreign key constraints, which ensure data integrity at the expense of write performance.

I have a narrow view of the performance of MySQL foreign key constraints and would be interested in learning of cases where they might actually improve certain queries.

I'm actually curious what the distinction is in your view. I've never really considered a column to be a foreign key when the constraint isn't used.

Having a column that we give business logic context to is useful, and indexing a column that should contain values for another table is helpful for query speed, but at least in my opinion they really aren't foreign keys unless that constraint lives directly in the database layer itself. I'd say the same for columns that are used as unique identifiers without actually adding unique constraints to the column.

There are good performance reasons to do either one if you're willing to take on the data integrity responsibility in the application code, but the column itself really is just a typed column if the constraints live elsewhere (again in my opinion, I think the technical definitions may ignore this functional argument).

Where I find foreign key constraints helpful for queries is when I need to be absolutely sure of the data integrity. Say I need to make a complex query that joins across three different tables based on foreign keys. If the table constraints exist I know that (a) any value in a foreign key column is valid and the referenced key exists and (b) if no rows are found I can trust that it's just because none exist.

Without foreign key constraints, I may not know why the query didn't find any results. It could be because there just aren't any matching rows, it could also be that one of the keys is no longer valid (or never was). If I don't care about that second error state my query may not change much, but if I need to know why the query failed to match and handle any invalid data accordingly I couldn't do it.
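To make that second error state concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The schema (authors, posts, author_id) and the helper name are made up for illustration, not taken from the startup in question. With no constraint declared, an anti-join is one way to tell "no matching rows" apart from "the key points at nothing":

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    -- author_id is only a typed, indexed column here: no FOREIGN KEY constraint
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    CREATE INDEX idx_posts_author_id ON posts (author_id);
    INSERT INTO authors (id, name) VALUES (1, 'alice');
    INSERT INTO posts (id, author_id, title)
        VALUES (10, 1, 'valid'), (11, 99, 'orphan');
""")


def find_orphaned_posts(conn):
    """Return ids of posts whose author_id matches no row in authors."""
    rows = conn.execute("""
        SELECT p.id
        FROM posts p
        LEFT JOIN authors a ON a.id = p.author_id
        WHERE p.author_id IS NOT NULL AND a.id IS NULL
        ORDER BY p.id
    """).fetchall()
    return [row[0] for row in rows]


# A plain inner join silently drops the orphaned post...
joined = conn.execute("""
    SELECT p.title, a.name
    FROM posts p JOIN authors a ON a.id = p.author_id
""").fetchall()

# ...so a separate check is needed to learn that it exists at all.
orphaned = find_orphaned_posts(conn)
```

The extra query is the responsibility the application takes on when the constraint does not live in the database.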

When writing, I also much prefer having a single insert that I know will fail if the foreign key isn't valid. This could be done with a more complex query, or a transaction, but then I'd be taking on that responsibility when it could live directly in the db. Beyond the complexity there, I have to assume the database authors would be able to write a more efficient foreign key validation check than I could from my end.
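A rough sketch of that "single insert that fails" behaviour, again with Python's sqlite3 standing in for MySQL and with hypothetical table and function names. Note that SQLite only enforces foreign keys once the pragma is switched on, which is an SQLite detail and not a claim about any other database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless this pragma is set.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors (id),
        title TEXT
    );
    INSERT INTO authors (id, name) VALUES (1, 'alice');
""")


def try_insert_post(conn, post_id, author_id, title):
    """Insert one post; return False if the database rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO posts (id, author_id, title) VALUES (?, ?, ?)",
                (post_id, author_id, title),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when author_id does not reference an existing author.
        return False


ok = try_insert_post(conn, 10, 1, "valid")
rejected = try_insert_post(conn, 11, 99, "dangling")
post_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```

There is no lookup-then-insert step in the application code; the validation lives in the database.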

That said, what's been your experience handling foreign keys when the constraint is either unsupported or unused? Do you avoid it mainly for the bump in query performance, and if so how do you avoid that performance hit elsewhere in your code?

Excluding multi-column keys and joins, I'd describe a column as a foreign key in the context of a given query; i.e. when that query references a unique column in a JOIN condition on a related table. This differs from an explicit declaration of a foreign key constraint, which provides all the useful referential integrity characteristics that you alluded to. If you said to me "Database Foo doesn't support foreign keys!" I'd take that to mean that you couldn't perform queries with joins.

In my experience, working on systems where foreign key constraints are liberally applied has been a net negative. Certain classes of DML statements (ON DELETE CASCADE I remember as being infamous) are certainly worse in performance than they otherwise might be. As an administrator I remember being repeatedly and painfully hamstrung by the inability to make arbitrary DB writes, which may momentarily violate strict data integrity, but are necessary for immediate practical reasons.
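For anyone who hasn't been bitten by it, a small sketch of why ON DELETE CASCADE earns that reputation: one innocuous-looking delete on the parent table fans out into a write per child row. This uses Python's sqlite3 with a hypothetical schema, and it only illustrates the fan-out; it is not a benchmark and says nothing about MySQL's actual timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL
            REFERENCES authors (id) ON DELETE CASCADE
    );
    INSERT INTO authors (id, name) VALUES (1, 'alice');
""")

# One parent row with a thousand children.
with conn:
    conn.executemany(
        "INSERT INTO posts (author_id) VALUES (?)",
        ((1,) for _ in range(1000)),
    )

posts_before = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]

# A single-row delete on the parent...
with conn:
    conn.execute("DELETE FROM authors WHERE id = 1")

# ...has removed every child row as a side effect.
posts_after = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```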

Obviously data integrity suffers without explicit constraints, but I'd rather work to backfill and/or clean up messy data than deal with a frustratingly rigid and poorly performing system. I've worked on a number of large-scale MySQL database deployments at various tech companies, and I can't recall many, if any, that required or possessed pristine referential integrity. I can see why it's conceptually compelling, and I appreciate how automated tooling can generate very useful entity relationship diagrams if FK relationships are explicitly spelled out by constraints.

I think the "performance hit elsewhere" only happens given the assumption that strict referential integrity is a requirement, perhaps in a banking context or another where messy data simply cannot be tolerated.

> If a startup found limitations then it's a skill issue.

"If it didn't work, you must have done it wrong" is maybe the most toxic sentiment in tech and in life. The Agile Coach's mindset.