by GundersenM 3551 days ago
Sure, if the existing DB is simple, that is straightforward, but remember that likely this is a monolith that is so bad that even management have agreed that it needs to be rewritten. Likely there are lots of DB tables with foreign keys and relations (sometimes documented and enforced, most often not). This means you can't really convert the entire database into an event sourced system, as that means converting all of the tables in one single go, instead of a gradual change. And believe me, in a system like this you want slow gradual changes! Also, even if you got it into events, what happened to the domains? There are so many relations between the different event sources (because you didn't put everything into just one event source, right? What happened to bounded contexts?) that you are no better off. And this means you have to prevent anything else from using the database anymore, and in a legacy system where you can just join across any two or three tables to extract whatever information you want, you can be certain there are some analysis engines that are just feeding directly on the SQL data. And there might be other systems writing to the database too!

So the first step is to disentangle all the data and encapsulate it, trying to prevent others from using it, so you have full control over it. This includes tracking down any other system using this data, and ensuring they too go through the database. And you have to do this for one subsystem at a time, often in several iterations.
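To make "encapsulate it" concrete, here is a minimal sketch of a single access layer that owns one subsystem's table, so nothing else needs to issue SQL against it. The `CustomerStore` class, the `customer` table, and its columns are all hypothetical names invented for illustration; a real legacy schema will be far messier.

```python
import sqlite3
from typing import Optional


class CustomerStore:
    """Hypothetical single entry point for one subsystem's table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        # Assumed schema, for illustration only.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customer ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add(self, name: str) -> int:
        cur = self._conn.execute(
            "INSERT INTO customer (name) VALUES (?)", (name,)
        )
        self._conn.commit()
        return cur.lastrowid

    def rename(self, customer_id: int, name: str) -> None:
        self._conn.execute(
            "UPDATE customer SET name = ? WHERE id = ?", (name, customer_id)
        )
        self._conn.commit()

    def name_of(self, customer_id: int) -> Optional[str]:
        row = self._conn.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None
```

Once every reader and writer goes through something like this, the table behind it can change shape without the rest of the system noticing, which is what makes the gradual approach possible.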

2 comments

> Sure, if the existing DB is simple, that is straight forward, but remember that likely this is a monolith that is so bad that even management have agreed that it needs to be rewritten.

Yeah, but that's not a "converting legacy data to ES" problem, that's a "converting legacy data to any non-broken thing" problem.

> This means you can't really convert the entire database into an event sourced system, as that means converting all of the tables in one single go, instead of a gradual change.

Whether it's ES or something else you are converting to, you either do a big-bang conversion and eat the pain of that (which can be tremendous, sure), or you instead eat the pain of taking the monolith and finding a way to break out components and do it incrementally, even though that takes not only building the new components, but reengineering parts of the old monolith to support that. Which, also, can be tremendous pain. But, again, this isn't really essentially tied to event sourcing, you face this dilemma even if you are going from a (broken for current needs, which is why it is being replaced) classically-designed "current state" RDBMS-backed system to a (meeting current needs, and hopefully more adaptable to future needs) classically-designed "current state" RDBMS-backed system.

Yup, agree, this is the problem of having a legacy monolith RDBMS that needs to be rewritten and split apart. It's tempting to throw every new fancy technology at the problem when that is suddenly an option, but it's better to focus on the goal of splitting it apart only. If you have split it out, and it's now simple to convert to ES CQRS, then you are probably in a situation where you don't need to do that, as it works quite well.

> It's tempting to throw every new fancy technology at the problem when that is suddenly an option, but it's better to focus on the goal of splitting it apart only.

Splitting it apart involves:

(1) Dividing the data and functionality into a legacy component and a new-implementation component,

(2) Making changes to the DB and application code for the legacy component,

(3) Implementing the new-implementation component.
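The three steps above could be sketched, very roughly, as a thin router that sends carved-out operations to the new component and everything else to the reduced monolith. This is one possible shape and not something the thread prescribes; `Router`, `legacy_handle`, `new_billing_handle`, and the operation names are hypothetical.

```python
from typing import Callable, Dict

Handler = Callable[[str, dict], str]


def legacy_handle(operation: str, payload: dict) -> str:
    # Stand-in for a call into the reduced legacy monolith.
    return f"legacy:{operation}"


def new_billing_handle(operation: str, payload: dict) -> str:
    # Stand-in for the new-implementation component.
    return f"new:{operation}"


class Router:
    """Sends carved-out operations to new code, the rest to legacy."""

    def __init__(self, fallback: Handler) -> None:
        self._fallback = fallback
        self._routes: Dict[str, Handler] = {}

    def carve_out(self, operation: str, handler: Handler) -> None:
        # Register one operation as owned by a new component.
        self._routes[operation] = handler

    def handle(self, operation: str, payload: dict) -> str:
        handler = self._routes.get(operation, self._fallback)
        return handler(operation, payload)
```

Each `carve_out` call corresponds to one increment: the operation moves to the new component while untouched operations keep hitting the legacy path.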

In a monolith that you are breaking apart, the reusability of legacy code for the new-implementation component is likely to be low (you'll actually likely have to do extensive changes to the larger "legacy component" as well, but the reusability should be somewhat higher there.)

You have to use some technology for the new-implementation component, and what you should aim for is whatever is the best fit for the job, whether it is similar to what existed before or not.

> If you have split it out, and it's now simple to convert to ES CQRS, then you are probably in a situation where you don't need to do that, as it works quite well.

I disagree. The hard part of converting to ES/CQRS for the components that are broken out ("new implementation" components, not the "legacy" reduced-monolith) is done in the analysis phase of what you are breaking out. Once that is done, implementation in an ES/CQRS manner is fairly straightforward, since defining the events that the component will handle is a core part of analysis, as is defining the impacts those events have on stored, reportable data (the query side of CQRS).
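As a rough illustration of what that analysis output looks like in code, here is a minimal sketch: a couple of event types plus a projection that folds them into a read model for the query side. The `OrderPlaced` and `OrderCancelled` events and the `project_open_orders` function are hypothetical, picked only to show the shape.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Union


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    amount_cents: int


@dataclass(frozen=True)
class OrderCancelled:
    order_id: str


Event = Union[OrderPlaced, OrderCancelled]


def project_open_orders(events: Iterable[Event]) -> Dict[str, int]:
    """Fold the event log into a read model: open order id -> amount."""
    open_orders: Dict[str, int] = {}
    for event in events:
        if isinstance(event, OrderPlaced):
            open_orders[event.order_id] = event.amount_cents
        elif isinstance(event, OrderCancelled):
            # Cancelling an unknown order is ignored in this sketch.
            open_orders.pop(event.order_id, None)
    return open_orders
```

The code is short because the decisions (which events exist, what each one does to reportable data) were made during analysis.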

"The hard part [...] is done in the analysis phase..." smells like big design up front, which is more likely to fail than not, especially for complex systems.

Big design up front would be a complete system replacement, not incremental replacement by component. An incremental replacement still requires defining which components are to be replaced with a new implementation and which part is to be essentially retained, with only the changes necessary to interface with the new component.

Sorry, replied too quickly. My point is that if things are so bad that they warrant a major rewrite, then they are probably so bad that there is no simple way to map the existing data into events or starting conditions. It might be true if you have a simple, well-defined silo of a system that does one thing well, but not if you have what the author described, a monolith that does several things in the same codebase.
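For the simple-silo case, mapping existing data into starting conditions might look like the sketch below: one synthetic "imported" event per legacy row. The `AccountImported` event and the row keys `id` and `balance_cents` are assumptions made up for this example. It only works when a row means one unambiguous thing, which is exactly what a tangled monolith tends not to give you.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class AccountImported:
    # Synthetic starting-condition event: the state at migration time.
    account_id: int
    balance_cents: int


def starting_events(rows: Iterable[Dict[str, int]]) -> List[AccountImported]:
    """Turn legacy rows (assumed keys: id, balance_cents) into events."""
    return [AccountImported(r["id"], r["balance_cents"]) for r in rows]
```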