by kentonv
2297 days ago
Many (most?) applications do not actually care whether a byte blob of text is structurally valid UTF-8. They are either passing it around as an opaque byte blob, or already applying much stricter application-specific validation. Validating UTF-8 automatically at the serialization layer is a huge waste of cycles, especially in a big distributed system.
However, if you’re accepting MessagePack-encoded data from untrusted sources (such as end users), then you absolutely should be validating your input somewhere along the pipeline, and it’s usually better to do that early on.
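To make the two positions concrete, here is a minimal Python sketch of validating once at the trust boundary and then forwarding the payload untouched on internal hops. The function names `accept_untrusted_text` and `forward_opaque` are hypothetical, made up for illustration, and this uses only the standard library rather than any particular MessagePack implementation.

```python
def accept_untrusted_text(blob: bytes) -> str:
    """Edge of the system: reject anything that is not well-formed UTF-8."""
    try:
        # Strict decoding raises on malformed input such as stray bytes,
        # truncated sequences, and overlong encodings.
        return blob.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(f"invalid UTF-8 at byte offset {exc.start}") from exc


def forward_opaque(blob: bytes) -> bytes:
    """Internal hop (hypothetical): pass the payload through as raw bytes,
    with no per-hop validation cost."""
    return blob
```

The idea is that the decode cost is paid once, where untrusted data enters, and every later hop can treat the bytes as opaque.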
Also, distributed systems are generally not where you worry about this degree of micro-optimisation (which is basically what this is); monolithic ones are. Distributed architecture is meant to solve various problems (high availability, reduced geographical latency, running a single site on cheaper commodity hardware, and so on), but often at the cost of CPU cycles. Monolithic infrastructures with fewer servers (such as Stack Overflow’s setup) depend far more on reducing computational overhead wherever corners can be cut. However, they’re also significantly less likely to need networked RPCs via MessagePack anyway, simply because of their monolithic design.