by logarhythm 1548 days ago
Your understanding of Polylith is essentially correct. It is a combination of existing ideas, such as monorepos, convention, components, interfaces, encapsulation, and static code analysis. However, I'd argue that what emerged in Polylith from those ingredients was more than the sum of its parts - due to how those particular concepts resonate with each other.

For example, because of the forced conventions, it's trivial for the Polylith Tool to perform static code analysis on a Polylith codebase. This means that the tool can identify the subset of tests to run, based on the components that have changed since the last run. That leads to a fast feedback loop, and encourages both good testing practices and fine-grained component modularity.

Polylith gives you a system-level architectural building block, the component, which encourages a modular design and separation of concerns. You're right that it's still possible to create spaghetti code with Polylith. All it would take is poorly designed components, with bad names, multiple reasons to change, and state exposed everywhere. However, I'd argue that when you give someone a well designed tool (like Polylith), they're more likely to craft a well designed product with it.

To understand how builds and deployments work with Polylith, I'd recommend reading the "Workflow" section in the Polylith Tool's documentation: https://polylith.gitbook.io/poly/workflow/shell (especially "Build", "Git", "Continuous Integration", and "Testing").


It feels like what you're going to get out of using it is basically a set of conventions, best practices, rules and tooling support that make doing the right thing (for some specific opinions of right, but I think I share enough of the opinions to leave it at that caveat) the natural and obvious thing to do.

I'm sort of reminded of adopting a formatting and linting stack across a codebase - so long as it mostly makes mostly good choices, the shared conventions are usually a net win overall just because it (a) makes code more accessible across the team (b) gives you a solid default way to resolve a lot of choices.

The whole "can -reliably- figure out which tests need to be re-run" part specifically sounds like it would be a very nice thing to have.

I suspect your biggest challenge in terms of adoption will be the requirement for development-group-wide buy-in to get the full benefits, but that's a problem that's kind of inherent to the goals you're trying to achieve here and so I shall simply wish you good luck with that part.

That's right, it's a highly opinionated approach to building software, which comes with a host of benefits.

My favourite is a complete untangling of your development and production environments. With Polylith you always develop your system as a monolith (because that's the most effective way to build software), but you're able to deploy it as multiple services (because that's sometimes the most effective way to run software). It turns out that separating deployment complexity from development complexity is a game-changer, and something that I haven't come across in other architectures.

It's true that you don't reap all the benefits of Polylith until your entire codebase uses the same structure, which feels a bit like "all or nothing". However, many of the benefits are unlocked "as you go", so even converting one or two existing microservices to Polylith will give you a nicer codebase to work with.

> so I shall simply wish you good luck with that part

Thanks!

My understanding of Polylith so far, in the context of "untangling development and production", is that you might put together components and bases differently in dev vs prod. If that's correct, untangling development and production seems to run counter to the idea of trying to match development and production as closely as possible (e.g. with containers to mimic environments better) to make testing more accurate/realistic. Based on my very limited experience, I don't really understand why you would want to drop dev/prod parity entirely.

Having said that, I'm sure it's possible to run the prod version(s) locally and test them, but then don't you lose the benefits of separating deployment complexity from development complexity since you're now running the prod version locally anyways?

When you start a Polylith codebase from scratch, you can start implementing your business logic and your components before you have even decided how to execute the code in production. The components can talk directly to each other and you only need the development project to begin with. Then you may decide that you want to expose your functionality as e.g. a REST API, so you now need to create a base and a project where you put that base and your components. Now you have two projects, one for development and one for your REST service, which both look the same. Some time later you decide to split off one more service from your single service. You still have all the tests set up in development, but the way you run the code in production has changed. How you execute your code in production is seen as an implementation detail in Polylith.

While developing the system, it's really convenient to work with all your code from the single development project, especially if your language has support for a REPL. If you need a production-like environment, then you can have that too. That said, you will most probably spend most of your time in the single development environment, because that's where you are most productive.

I see, that makes sense. Basically, there's nothing stopping you from organising your code to make it easier to follow/step through the business logic (dev), making your changes and then double checking against a slightly different organisation of your components and bases (prod), right? You may end up having more projects than deployed services, but that's a non-issue.
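
The arrangement discussed here, one development project plus separate deployable projects assembled from the same bricks, can be sketched roughly as follows. This is a toy Python model, not how the Polylith Tool represents a workspace; the `Project` class and all brick names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    """A named selection of bases and components (hypothetical model)."""
    name: str
    bases: frozenset
    components: frozenset

    @property
    def bricks(self):
        # Everything this project is assembled from.
        return self.bases | self.components


# The single development project contains every brick in the workspace.
development = Project(
    "development",
    frozenset({"rest-api", "worker"}),
    frozenset({"user", "invoice", "email"}),
)

# After splitting the original service in two, each deployable project
# picks only the bricks it needs. "invoice" is shared by both.
api_service = Project(
    "api-service", frozenset({"rest-api"}), frozenset({"user", "invoice"})
)
worker_service = Project(
    "worker-service", frozenset({"worker"}), frozenset({"invoice", "email"})
)


def covered_by_development(dev, deployables):
    """True if every deployable project only uses bricks that are also
    available (and therefore testable) in the development project."""
    return all(project.bricks <= dev.bricks for project in deployables)


if __name__ == "__main__":
    print(covered_by_development(development, [api_service, worker_service]))
```

The point of the model is that splitting or merging services only changes which bricks each deployable project selects; the bricks themselves, and the tests run from development, stay where they are.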

Hypothetically, say I'm big corp A with my thousands of developers and hundreds of microservices that are just Polylith projects. Further, let's say I have N services that depend on component B. Now, if one team needs to make a breaking change on component B (say you need to change the interface), how would you suggest handling it based on the Polylith architecture? Would you version each component so that services can pin the component version? Or would you create a new component? Something else?

Intuitively, versioning sounds like a mess of thousands of repos. On the other hand, creating a new component would create a precedent that might be used to justify an explosion of components, which may make your workspace a mess of almost identical components. While refactoring sounds like the way forward here, if you've dug yourself a hole with bad design choices then Polylith seems like it would give you more rope to hang yourself with. Otherwise you have to coordinate with all the teams needed to figure out how to modify the N services depending on component B. With typical microservices, my understanding is that this wouldn't happen so long as the service's API remained constant.

Yes, one way of making two projects slightly different is to have two different implementations of the same interface (all components expose an interface and they only know about interfaces) and then use different components in the two projects (both implementing the same interface). The development project can also mimic production to some extent, by using profiles (see https://polylith.gitbook.io/poly/workflow/profile).
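
As a rough illustration of two components behind one interface, here is a Python sketch. All names (`EmailSender`, `LoggingEmailSender`, `QueueEmailSender`, `PROJECT_COMPONENTS`) are made up for the example, and the lookup table at the end is only a stand-in for a project choosing which component to include; it is an assumption of this sketch that selection happens this way, not a description of Polylith's mechanism.

```python
from typing import Protocol


class EmailSender(Protocol):
    """The interface: callers only ever know about this."""

    def send(self, to: str, subject: str) -> str: ...


class LoggingEmailSender:
    """Component suited to development: records messages, sends nothing."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        self.sent.append((to, subject))
        return "logged"


class QueueEmailSender:
    """Stand-in for a production component. A real one would hand the
    message to a mail service; here it just enqueues it."""

    def __init__(self):
        self.queue = []

    def send(self, to, subject):
        self.queue.append({"to": to, "subject": subject})
        return "queued"


def welcome_user(sender: EmailSender, address: str) -> str:
    # Business logic depends on the interface, never on a concrete component.
    return sender.send(address, "Welcome")


# Hypothetical: each project selects one implementation of the interface.
PROJECT_COMPONENTS = {
    "development": LoggingEmailSender,
    "production": QueueEmailSender,
}


def build_sender(project: str) -> EmailSender:
    return PROJECT_COMPONENTS[project]()


if __name__ == "__main__":
    print(welcome_user(build_sender("development"), "ada@example.com"))
```

Because `welcome_user` only depends on the interface, the same code and the same tests work against either component.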

You are right that you need to handle breaking changes in some way. Refactoring all the code that uses the changed component is probably the best way to go in most cases, because it keeps the code as simple as possible. Second best (or best in some situations) is to introduce a new function in the existing component (this works if the change is not huge; for a huge change, a new component can make sense). This can be done in different ways, e.g. by adding one more signature to the function or by putting the new version in a sub namespace, e.g. 'v2' in the interface namespace.
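
The two non-breaking options mentioned above can be sketched in Python like this. The function names and data are hypothetical, an optional parameter is used as the closest Python analogue of "one more signature", and a nested `SimpleNamespace` stands in for a 'v2' sub namespace.

```python
from types import SimpleNamespace


# Option 1: add one more signature. Existing callers keep calling
# greet(name); new callers may pass the extra argument.
def greet(name, greeting="Hello"):
    return f"{greeting}, {name}"


# Option 2: keep the old function and publish the changed one under 'v2'.
def _find_user_v1(user_id):
    # Original shape: a single 'name' field.
    return {"id": user_id, "name": "Ada Lovelace"}


def _find_user_v2(user_id):
    # Breaking shape change: the name is split into two fields.
    return {"id": user_id, "first_name": "Ada", "last_name": "Lovelace"}


# Hypothetical interface of a 'user' component. Old callers use
# user.find_user, migrated callers use user.v2.find_user.
user = SimpleNamespace(
    find_user=_find_user_v1,
    v2=SimpleNamespace(find_user=_find_user_v2),
)


if __name__ == "__main__":
    print(greet("Ada"))
    print(user.v2.find_user(1))
```

Either way, callers can migrate one at a time, and the old signature or namespace can be removed once nothing uses it any more.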

You will face similar coordination problems with microservices too. I worked on a project where we had around 100 microservices. To share code between services we created libraries that were shared across services. Sometimes we found bugs because some services used an old version of a library. Then we went through all the services to make sure they all used the latest version of all libraries, and that could take two weeks for one person (full time)! You had to go and ask people about breaking changes that were made several weeks ago, or try to figure it out yourself.

The alternative is to not share any code and just copy/paste everything (or implement the same shared functionality from scratch every time) but that is probably even worse, because you will not get rid of the coordination needs and if you find a bug in one service, you have to go through all code in all 100 services manually to see if any of the other 99 services contain the same bug, or hope for the best if you don't.

Large monorepos tend to already support such analysis via tools like Bazel.