by btown 3551 days ago
Event sourcing isn't nearly as common knowledge among new programmers as the CRUD-one-row-per-entity pattern, and it really should be. I liken it to introducing version control for your data; when immutable updates are your canonical source, no matter how much the system behind them changes, or the business requirements change, and no matter how many teams are deriving different things from them in parallel, they can all work off of the same data and "merge" their efforts together.
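To make the idea concrete, here is a minimal sketch in which immutable updates are the canonical record and the current state is derived by merging them at read time. The `AccountUpdate` type, the sample data, and the `by_account` index are all made up for illustration; a real system would keep the log in a database table with an index on the entity id.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountUpdate:
    """One immutable update; rows are only ever appended, never edited."""
    account_id: str
    seq: int          # position of this update within the account's history
    field: str
    value: object


# Hypothetical append-only log standing in for an updates table.
UPDATES = [
    AccountUpdate("a1", 1, "email", "old@example.com"),
    AccountUpdate("a1", 2, "plan", "free"),
    AccountUpdate("a2", 1, "email", "b@example.com"),
    AccountUpdate("a1", 3, "email", "new@example.com"),
]

# Stand-in for a database index over account_id.
by_account = defaultdict(list)
for update in UPDATES:
    by_account[update.account_id].append(update)


def current_state(account_id, up_to_seq=None):
    """Merge an account's updates in order; later updates win per field."""
    state = {}
    for update in sorted(by_account[account_id], key=lambda u: u.seq):
        if up_to_seq is not None and update.seq > up_to_seq:
            break
        state[update.field] = update.value
    return state


print(current_state("a1"))
print(current_state("a1", up_to_seq=2))
```

Because nothing is overwritten, the same log answers both "what is the state now" and "what was the state after update 2".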

The one downside is that shifting your business logic to read-time means that you need to have very efficient ways of accessing and memoizing derived data. For some applications, this can be as simple as having the correct database indices over your WhateverUpdates tables, fetching all updates into memory and merging on each request. For others, you'll need to have a real-time stream processing pipeline to preemptively get your derived data into the right shape in a cache. And those are more moving parts than your typical monolith app, but the

One benefit to actually using event sourcing with a stream processing system is that, in many cases, it can be the most effective way to scale both traffic capacity and organizational bandwidth, much in the same way that individually scalable microservices can (and it is fully compatible with that approach!). Martin Kleppmann at Confluent (a LinkedIn spinoff creating and consulting on stream processing systems) writes some great and highly approachable articles about this. Highly recommended reading.
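As a rough sketch of what "deriving different things in parallel" can look like, the toy code below fans each event out to independent consumers that keep their own cached view up to date, so reads never have to re-merge the log. The event shapes and the `RevenueByCustomer`, `OrderCount`, `publish`, and `replay` names are invented for this example; an in-memory list stands in for a real stream such as a Kafka topic.

```python
from collections import defaultdict


class RevenueByCustomer:
    """Derived view: net revenue in cents per customer."""

    def __init__(self):
        self.totals = defaultdict(int)

    def apply(self, event):
        if event["type"] == "order_placed":
            self.totals[event["customer"]] += event["amount_cents"]
        elif event["type"] == "order_refunded":
            self.totals[event["customer"]] -= event["amount_cents"]


class OrderCount:
    """A second view over the same events, owned by a different team."""

    def __init__(self):
        self.counts = defaultdict(int)

    def apply(self, event):
        if event["type"] == "order_placed":
            self.counts[event["customer"]] += 1


def publish(event, log, consumers):
    """Append to the canonical log, then update every derived view."""
    log.append(event)
    for consumer in consumers:
        consumer.apply(event)


def replay(log, consumer):
    """Build a brand-new view from history, e.g. for a consumer added later."""
    for event in log:
        consumer.apply(event)
    return consumer


log = []
revenue, orders = RevenueByCustomer(), OrderCount()
for event in [
    {"type": "order_placed", "customer": "alice", "amount_cents": 1200},
    {"type": "order_placed", "customer": "bob", "amount_cents": 500},
    {"type": "order_refunded", "customer": "alice", "amount_cents": 200},
]:
    publish(event, log, [revenue, orders])

# A consumer written after the fact catches up by replaying the log.
late_joiner = replay(log, RevenueByCustomer())
```

The design point is that neither view knows about the other; each can be changed, rebuilt, or scaled on its own as long as the log stays the source of truth.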

http://www.confluent.io/blog/making-sense-of-stream-processi...

http://www.confluent.io/blog/turning-the-database-inside-out...


The CRUD one-row-per-entity pattern is common because it's enough for most projects. It works well with ORMs so you can build quickly and securely. And most of the time, performance isn't an issue and having a history of an entity is unnecessary.

I'm worried that event sourcing is going to become this year's over-applied design pattern with libraries in every language for every database with blog posts that recommend it be used on every project.

It's a good idea, very useful - in the right hands on the right projects. But it makes sense that junior devs normally use CRUD because that's normally the right solution. At least until better tools come along.

> The CRUD one-row-per-pattern is common because it's enough for most projects. It works well with ORMs so you can build quickly and securely.

If by "works well", you mean it works until someone asks for historical data - then the IT guy has to say w/ a straight face "we lost it". This is unacceptable considering the value of data and the strategic leverage it can have today.

Consider that immutable fact tables are the most stable data model, that companies often have to re-invent them (poorly) on top of relational storage at some point, that storage is often not a problem, and that having clean historical data is crucial for data science. There are fewer and fewer excuses not to adopt a sane data model from day one.
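For what it's worth, a bare-bones version of such a table fits comfortably in a relational database. The sketch below uses Python's built-in `sqlite3` with an in-memory database; the `facts` schema and the `record` / `value_as_of` helpers are assumptions made up for this example, not a recommended production design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE facts (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,
        entity      TEXT NOT NULL,
        attribute   TEXT NOT NULL,
        value       TEXT,
        recorded_at TEXT NOT NULL   -- ISO 8601 timestamps sort as plain text
    )
    """
)
conn.execute(
    "CREATE INDEX facts_lookup ON facts (entity, attribute, recorded_at)"
)


def record(entity, attribute, value, recorded_at):
    """Append a fact. Nothing is ever updated or deleted."""
    conn.execute(
        "INSERT INTO facts (entity, attribute, value, recorded_at) "
        "VALUES (?, ?, ?, ?)",
        (entity, attribute, value, recorded_at),
    )


def value_as_of(entity, attribute, as_of):
    """Return the latest value recorded at or before `as_of`, or None."""
    row = conn.execute(
        "SELECT value FROM facts "
        "WHERE entity = ? AND attribute = ? AND recorded_at <= ? "
        "ORDER BY recorded_at DESC, id DESC LIMIT 1",
        (entity, attribute, as_of),
    ).fetchone()
    return row[0] if row else None


record("user:1", "email", "old@example.com", "2016-01-01T00:00:00")
record("user:1", "email", "new@example.com", "2016-03-01T00:00:00")

print(value_as_of("user:1", "email", "2016-02-01T00:00:00"))
print(value_as_of("user:1", "email", "2016-04-01T00:00:00"))
```

With a one-row-per-entity table, the second `record` call would have been an `UPDATE`, and the February answer would be gone.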

I agree partially w.r.t. tooling - few implementations aid adopting this pattern, but I believe the value of historical data, over time, outweighs the convenience of slapping some quick Rails CRUD together and then being stuck at a local minimum.

>If by "works well", you mean it works until someone asks for historical data - then IT guy has to say w/ a straight face "we lost it". This is unacceptable

You'd be surprised.

For tons of projects it's totally acceptable, has worked for years, and nobody paying to implement them cares about historical data or its leverage. In fact, the majority of web apps are like this.

I always find it strange when people use "unacceptable" with wild abandon, like they're generals receiving some demand of unconditional surrender.

I've probably implemented 300 CRUD projects, and I've never had a single angry person complaining about lack of access to historical data.

* I keep hourly backups of mysqldumps which can be restored in the case of catastrophic mistakes.

* I explain that this kind of data collection would be out of scope, and require significantly more budget.

Just because it's technically possible, it doesn't mean you need to do it. YAGNI.

The full sentence is:

> This is unacceptable considering the value of data and the strategic leverage it can have today.

The last part is important.

Just because it's been true in the past, it doesn't mean this trend will continue. Maybe it keeps being true for your run-of-the-mill MVP, but I don't see it being acceptable for a system in any industry w/ any chance of making serious money in the mid-term.

As long as managers have limited budgets and projects have deadlines, then tradeoffs will still have to be made.

Event sourcing is an extremely expensive design pattern to implement, and it's also very easy to get wrong. Implementing it tends to preclude junior developers from working on the project, makes it harder for database admins to understand the data, and it requires a lot of thought on how to structure the events.

So on a project with, say, a £20K budget, it might triple the cost. On a project that would take 4 weeks to implement with CRUD, it might take 3 months with event sourcing. You've got to justify that extra cost. It's better to let a BA decide what they will need, and by all means explain the pros and cons of different solutions.

But I don't for a second believe that every single project should now be using event sourcing instead of CRUD.

>Just because it's been true in the past, it doesn't mean this trend will continue. Maybe it keeps being true for your run-of-the-mill MVP, but I don't see it being acceptable for a system in any industry w/ any chance of making serious money in the mid-term.

Again, you'd be surprised. Aside from advertising and consumer behavior analysis, there are not many industries that care about or even have a need for such historical data.

The very idea that it must be necessarily valuable to store historical data about "all the things" (apart maybe from some aggregations you can create and store), seems more associated with the recent "big data" fad.

(I've seen 10-12 such trends rise and fall in the industry. In 10 years, I guarantee you it will have fallen off as a keyword and will only be used as a technology where really appropriate.)

Could happen, but I think event sourcing (and CQRS generally) carries enough implementation overhead in the amount of code required that it's less likely to be adopted in situations where it isn't appropriate.

That isn't to say it won't happen, but I think it's more likely that teams would miss an opportunity to leverage it than leverage it inappropriately.