by ArtB 3822 days ago
> Some validation only makes sense on a specific operation. The object you constructed, due to some business rule, may be invalid to delete but valid to update for instance.

These are two different concepts I feel you are mixing up. An object may be valid, but that doesn't mean it's valid for all operations. For example, a file object is valid if the file exists, but if you only have read access to it, .append("foo") will fail on it. It's still a valid file. Your constructor should be checking for the first kind of validity, and your methods for the second.
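A minimal sketch of that split, assuming a made-up `TextFile` class and `ReadOnlyError` exception (neither is a real library API): the constructor enforces what must hold for any instance, and `append` enforces what only that operation cares about.

```python
class ReadOnlyError(Exception):
    """Raised when an operation is not permitted on an otherwise valid object."""


class TextFile:
    """Hypothetical stand-in for a file object; not a real file API."""

    def __init__(self, name, writable):
        # Construction-time validity: must hold for ANY TextFile.
        if not name:
            raise ValueError("a file needs a non-empty name")
        self.name = name
        self.writable = writable
        self.lines = []

    def append(self, text):
        # Operation-time validity: only this operation needs write access.
        if not self.writable:
            raise ReadOnlyError(f"{self.name} is read-only")
        self.lines.append(text)
```

A read-only `TextFile` still constructs fine; only the call to `append` is refused.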

> What if you receive, from a webservice or something, a valid object format-wise but invalid from the business rule perspective?

That is what you need an [anti-corruption layer](http://www.markhneedham.com/blog/2009/07/07/domain-driven-de...) for. The object makes sense within that other service's domain but not within yours. That domain may be closely related to yours, but it isn't yours. You need a separate object for that (a simple struct usually ought to suffice). Then you act as a gatekeeper, only allowing conforming objects into your system. Basically filter out the shit, and make sure that if it makes it past your gate it is clean. Otherwise you might want to use the [state pattern](https://en.wikipedia.org/wiki/State_pattern) to allow for different validation rules for the same object (eg legacy accounts might not have an email address but all new ones must).
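Roughly what such a gate could look like, as a sketch only. `ExternalAccount`, `Account`, and `admit` are names invented for illustration, and the "@" check is a placeholder for whatever your real rule is.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExternalAccount:
    """Plain struct mirroring the other service's payload; enforces nothing."""
    id: str
    email: Optional[str]


@dataclass(frozen=True)
class Account:
    """Domain object: if an instance exists, it passed the domain's checks."""
    id: str
    email: str

    def __post_init__(self):
        # Placeholder rule, assumed for this sketch.
        if "@" not in self.email:
            raise ValueError(f"account {self.id} needs an email address")


def admit(raw):
    """The gate: translate the foreign shape into ours, or refuse."""
    if raw.email is None:
        raise ValueError(f"account {raw.id} rejected: no email")
    return Account(id=raw.id, email=raw.email)
```

Code past the gate only ever sees `Account`, so it never has to ask whether the email is present.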

> What if there's some info in the database that would allow or not you to construct the object the way you need? You would go to the database and check the info in your constructor?

This is touchy. Sometimes, if it's important enough, then yes. Sometimes you can code around it (eg having case numbers auto-generated at time of persistence). Other times you bite the bullet. Other times you rearchitect the data (storing it in memory to do the check quickly). Or, as a last resort, accept it as a compromise and start coding up some roll-back functionality. Aim for the ideal, and know when to step backwards towards the practical: that is the art of software development.
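One way the "code around it" option might look, as a sketch with invented names (`NewCase`, `Case`, `CaseRepository`) and an in-memory counter standing in for a database sequence: the constructor never needs to ask the database for a free case number, because the number is only assigned when the case is saved.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class NewCase:
    """A case that has not been persisted yet, so it has no number."""
    title: str


@dataclass(frozen=True)
class Case:
    """A persisted case; the number was assigned by the repository."""
    number: int
    title: str


class CaseRepository:
    """In-memory stand-in for a real store (assumption for this sketch)."""

    def __init__(self):
        self._numbers = itertools.count(1)
        self._cases = {}

    def save(self, new_case):
        # The number is generated here, at persistence time.
        case = Case(number=next(self._numbers), title=new_case.title)
        self._cases[case.number] = case
        return case
```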

1 comments

I kinda agree with your first remark. I once used exactly what you describe as my rule of thumb. But after some time I found that deciding whether a validation is bound to the operation or not, and whether the cost (IO) of doing it up front is worth it or not, isn't as black and white as we'd like. There are many shades of gray.

As for the second, is this anti-corruption layer validating the format of the data (typical deserialization problems such as missing properties and invalid types) or the actual content? If it's the former, it should be done outside the domain; that's a transport/markup-specific thing. If it's the latter that's the domain job. My problem is this: if you don't instantiate the actual entity (which you know is well formed) to check its content, you end up dealing with a bag of properties for a check that belongs inside the domain, and that is awful. The domain should deal with its entities and nothing else.

The third, from the experience I have, is a big flashy "no no". Our typical boring OO systems may not be as religious about side effects as FP is, but that doesn't mean we shouldn't make some effort to isolate them. Unlike the actual operations, the instantiation of entities may take place in a myriad of places. When you mix this with IO side effects such as latency, you help create a little monster that you have to poke and prod to find out what's going on.

> If it's the latter that's the domain job.

The domain's job is to represent and encapsulate valid transformations in the domain. If it's not valid within the domain then it doesn't belong in it. If you get a well-formed XML file that has -7 as a social security number, that is not something that your domain has to deal with. It should be caught by the anti-corruption layer. It's not a valid value for your domain. Where I work we regularly build an import domain that allows users to see all the invalid data and manually correct it before allowing it into the rest of the system. Once I have an instance of a Person object I should be able to trust that it is sane.
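A rough sketch of the import idea, assuming a toy `Person` whose only rule is "the SSN must be positive" and a hypothetical `import_rows` helper: bad rows never become `Person` instances, they are set aside with the reason so someone can correct them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    ssn: int

    def __post_init__(self):
        # Toy rule for illustration; a real check would be stricter.
        if self.ssn <= 0:
            raise ValueError(f"invalid social security number: {self.ssn}")


def import_rows(rows):
    """Split raw (name, ssn) pairs into admitted Persons and rejected rows."""
    admitted, rejected = [], []
    for name, ssn in rows:
        try:
            admitted.append(Person(name, ssn))
        except ValueError as err:
            # Keep the raw values and the reason for manual correction.
            rejected.append((name, ssn, str(err)))
    return admitted, rejected
```

Everything in `admitted` can be trusted downstream; everything in `rejected` is still just raw data.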

> The third, from the experience I have, is a big flashy "no no".

It depends; I'd say it's case-by-case. Doing a read can be much more acceptable than a write. It's all about trade-offs at that point, but the benefit of knowing "if I have an instance I know that it is sane" means you don't need to litter your code with guards, and that pays dividends in maintenance and bugs and agility and testability and, yes, even performance. A big factor is the cost of load. I usually work on low-load (~14 concurrent users) high-importance systems (eg global pricing management) where correctness is at a premium and we usually have system resources to spare. YMMV. As I said: it depends.

> If you get a well-formed XML file that has -7 as a social security number, that is not something that your domain has to deal with.

Agreed. That's a type/format problem. But if, for whatever reason, the domain should process social security numbers that start with 9 differently, that rule should by no means live outside the domain. That is your business rule; you should "trap it" in the domain.
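To make the distinction concrete, a sketch with a hypothetical `SocialSecurityNumber` type: the nine-digit check is the format side, while `needs_special_handling` is the (invented) business rule that stays inside the domain type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocialSecurityNumber:
    digits: str

    def __post_init__(self):
        # Format check: rejects things like "-7" before any rule runs.
        well_formed = (
            self.digits.isascii()
            and self.digits.isdigit()
            and len(self.digits) == 9
        )
        if not well_formed:
            raise ValueError("expected exactly nine digits")

    def needs_special_handling(self):
        # Business rule (made up for this example), kept in the domain.
        return self.digits.startswith("9")
```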

> where correctness is at a premium

The correctness of both approaches is the same. The programming effort and the performance differ between them, but it's never a correctness trade-off.

---

But instead of arguing back and forth, let me give you a problem that I had to deal with before:

Suppose that to have an entity with a state X there must already exist in the system an entity with state Y. Also, to have an entity with state Y there must already exist in the system an entity with state X.

How do you solve this "chicken or the egg" deadlock if you never allow invalid entities to exist?

If you do allow invalid entities it's pretty simple: you instantiate both and hand them, together, to the domain.
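A sketch of what "hand both, together" could mean, with invented names (`Entity`, `Domain`, `admit_together`) and the X/Y rule reduced to "my required state must exist somewhere". Each entity on its own would be rejected, but the pair is checked as one unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    state: str     # the state this entity is in, e.g. "X"
    requires: str  # a state that must exist in the system


class Domain:
    def __init__(self):
        self.entities = []

    def admit_together(self, batch):
        # Check the batch as one unit: a prerequisite may be met by an
        # entity already in the system or by a member of the same batch.
        available = {e.state for e in self.entities}
        available |= {e.state for e in batch}
        for entity in batch:
            if entity.requires not in available:
                raise ValueError(
                    f"state {entity.state} requires an entity "
                    f"in state {entity.requires}"
                )
        # Nothing is stored unless the whole batch passed.
        self.entities.extend(batch)
```

Here the entities are allowed to exist individually in a not-yet-valid form; it is the domain that refuses to accept them unless the set as a whole is consistent.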