by Zinggi 2253 days ago
You shouldn't validate JSON. You should parse it, i.e. transform it into your desired data structure if it arrives in a valid shape.

https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...

I really like Elm's way of JSON decoding for this.

1 comments

I read the post. It seems to be mainly applicable to statically-typed languages. Apparently, the core problem with validation is "shotgun parsing":

> Shotgun parsing is a programming antipattern whereby parsing and input-validating code is mixed with and spread across processing code—throwing a cloud of checks at the input [...]

> The problem is that validation-based approaches make it extremely difficult or impossible to determine if everything was actually validated up front [...]

Err, what? Validating against a well-defined schema won't necessarily cause this. But okay, again, I can buy the benefits for statically typed languages. There's more though:

> Don’t be afraid to parse data in multiple passes. Avoiding shotgun parsing just means you shouldn’t act on the input data before it’s fully parsed [...]

> Use abstract datatypes to make validators “look like” parsers. Sometimes, making an illegal state truly unrepresentable is just plain impractical given the tools Haskell provides, such as ensuring an integer is in a particular range.

I've experienced this in Java/Jackson (which, btw, shows this is not exactly new, sexy, or rare in the statically typed world).

What is the suggestion for a dynamic language? In JavaScript, for example, classes will only get you so far, and they seem needlessly heavyweight if you aren't going to get the other benefits of type safety. I really don't see how this is helpful, even after putting in the time to investigate it.

In JS, my preferred way of handling JSON is heavily inspired by the Elm way of JSON decoding.

I want my decoders to essentially do 2 things:

1. Transform received data into a data structure that is best suited for my app. This might involve converting lists to objects or objects to sets, parsing dates, etc.

2. Only succeed on valid data.

This way, my application never has to deal with bad data. Also, I get to design the data structures I use, rather than having the APIs I consume dictate them. By only validating and not transforming, you are pushing more advanced validation further away from where the data was received (since you'll need to transform the data at some point anyway).
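To make the two points above concrete, here is a rough sketch of what such decoders could look like in TypeScript. Everything in it is made up for illustration (`Decoder`, `str`, `date`, `setOf`, `field`, `User`, and the `joined_at` wire field); it is not the API of fefe or of Elm's Json.Decode, just the general shape of the idea.

```typescript
// A decoder either returns a value of type T or throws. All names here are
// hypothetical; this is not the API of fefe or of Elm's Json.Decode.
type Decoder<T> = (input: unknown) => T;

const str: Decoder<string> = (input) => {
  if (typeof input !== "string") throw new Error("expected a string");
  return input;
};

// Transform while decoding: a date string becomes a Date.
// (The Date constructor is lenient; a real decoder would be stricter.)
const date: Decoder<Date> = (input) => {
  const parsed = new Date(str(input));
  if (Number.isNaN(parsed.getTime())) throw new Error("expected a date string");
  return parsed;
};

// A list on the wire becomes a Set in the app.
const setOf = <T>(item: Decoder<T>): Decoder<Set<T>> => (input) => {
  if (!Array.isArray(input)) throw new Error("expected an array");
  return new Set(input.map((element) => item(element)));
};

// Decode one named field of an object, prefixing errors with the field name.
const field = <T>(name: string, inner: Decoder<T>): Decoder<T> => (input) => {
  if (typeof input !== "object" || input === null) {
    throw new Error("expected an object");
  }
  try {
    return inner((input as Record<string, unknown>)[name]);
  } catch (error) {
    throw new Error(`${name}: ${(error as Error).message}`);
  }
};

// The shape the app wants, which need not match the API's shape.
interface User {
  name: string;
  joined: Date;
  tags: Set<string>;
}

const decodeUser: Decoder<User> = (input) => ({
  name: field("name", str)(input),
  joined: field("joined_at", date)(input),
  tags: field("tags", setOf(str))(input),
});
```

With something like this, `decodeUser(JSON.parse(body))` either hands the rest of the app a `User` with a real `Date` and a `Set`, or throws an error that names the offending field. Nothing downstream ever sees the raw API shape.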

As for libraries, I want composable parsers. For TS I'd probably use: https://github.com/paperhive/fefe/