by ajamesm 3370 days ago
Heh, people sure do use it like it's the answer to everything, and "don't use it" stops being an option if you're writing to a JSON-based interface.

JSON's fine if you don't have any requirements around data serialization and you want it to "just work" for your webapp, but there's a lot of tech debt inherent in it.

So you dump a report in JSON format and back it up to S3. S3 costs are growing faster than you thought, so you gzip all of it. Everyone has to go patch their JSON deserialization to detect gzip extensions. Whatever, just growing pains.
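
For what it's worth, that patch usually ends up looking something like the sketch below. It assumes readers sniff the two gzip magic bytes instead of trusting a file extension; the name `load_report` and the sample payload are made up for illustration.

```python
import gzip
import json

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"

def load_report(raw: bytes):
    """Parse a report blob that may or may not be gzip-compressed.

    Hypothetical helper: one way to tolerate both old plain-JSON
    backups and newer gzipped ones.
    """
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))

# The same (made-up) report, stored both ways.
plain = json.dumps({"rows": 3}).encode("utf-8")
packed = gzip.compress(plain)
```

Both `load_report(plain)` and `load_report(packed)` hand back the same dict, and the point is that every consumer now needs its own copy of this check.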

Then another team tries to read the reports, and they're getting errors because your definition of an interface is "we'll just use JSON, the keys are human-readable".

You define a formal API for your report format and in doing so you realize the need for versioning attached to your report schema, so you wrap all your JSON objects with types and version annotations. You could define a central repository for these schemas, but it's easier to just bake them into the top-level response. Everyone agrees that this is "lightweight" and not "centralized".
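
A rough sketch of that "lightweight" envelope, assuming a top-level wrapper with `type`, `version`, and `data` keys. The key names, the `SUPPORTED` set, and the `sales_report` type are all hypothetical, not anyone's actual format.

```python
import json

# Hypothetical list of (type, version) pairs this reader understands.
# Every consumer ends up carrying its own copy of something like this.
SUPPORTED = {("sales_report", 1), ("sales_report", 2)}

def wrap(report_type: str, version: int, payload: dict) -> str:
    """Serialize a payload inside a type/version envelope."""
    return json.dumps({"type": report_type, "version": version, "data": payload})

def unwrap(text: str) -> dict:
    """Check the envelope annotations, then return the inner payload."""
    envelope = json.loads(text)
    key = (envelope.get("type"), envelope.get("version"))
    if key not in SUPPORTED:
        raise ValueError(f"unsupported report {key!r}")
    return envelope["data"]
```

A round trip like `unwrap(wrap("sales_report", 2, {"total": 10}))` returns the payload, while an unknown version raises `ValueError`. Note the schema knowledge now lives in each reader rather than in one shared place.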

Now you're storing reports where each sub-object has its own annotations, or you're defining an entire schema at the object level. Object deserialization is taking 200ms even for small payloads, because of all the validation callbacks you're firing, and developers are now "performance hacking" their components by disabling validation callbacks. Now you have all the space overhead of schema annotations with none of the benefits.

To adhere to the API, either each team writes its own serialization library, or you form a team to maintain them as infrastructure, which is a great idea except that the horse already left the barn 2 years ago.

Without even realizing it, you've reinvented XML and XSD. And I don't really like XML either, but at least with XML you have to be honest about what you're getting yourself into.
