by davemp 1168 days ago
I don’t write many APIs, but every JSON schema I’ve created has been automatically generated by OpenAPI tools. Even then I’ve found schemas of very little use, because everything gets validated on deserialization anyways. Client-side validation is usually already taken care of in practice, because users should be serializing with the same type library that deserializes, or reading the docs very thoroughly.

JSON is so much more ergonomic than XML as the lingua franca because I can actually read it. That being said I still have my share of problems with JSON.


That was the cause of the XML problems - everything was generated.

Me? Schemas are a requirement in areas where you need to integrate across different technologies or different implementations. In those contexts JSON Schema is a bit of a kid's toy compared to what XML can do.

We’re using Prisma (https://prisma.io) schemas for a particular data exchange project we’re doing so that we can generate JSON schemas, SQLite schemas, PostgreSQL schemas, etc. We have even found a generator to create basic Elixir code from the Prisma schemas.

We’re not using anything else from Prisma, but if we had to implement something else in JS to talk to a database, that would be a contender for our database interface layer (there are only a couple of others that are even remotely usable, having suffered through the disaster of a Sequelize implementation). We’re more likely to use Elixir and Ecto.

Adding to the problems of generated schemas, Microsoft and Sun had different views on how they should be generated. I bought into the promise of "build a wsdl" and you can get clients from .NET and Java. I lost all of that buy-in. Hard.

I don't know that I can lay the blame on either one of them directly, mind. But the industry definitely suffered from the bad faith cooperation of those companies.

Microsoft, Sun, IBM, HP, Oracle et al explicitly made WSDL and related technologies not interoperate... and that is where JSON + universe has been a delight.

Totally fair. Not sure why I limited my memory to just the two companies.

I'm not clear on how JSON as a format has helped interaction. I'm reminded of like efforts to standardize how information is stored on pages. By and large, that ship sailed and sites that have remained somewhat stable have driven how we look for information on them. All without having to add new schema languages or tools.

I can still read the generated JSON.

> because everything gets validated on deserialization anyways

First, it really depends what you're deserializing with. There is a lot of code out there that just does JSON.parse and then starts accessing the data and then you have an "undefined" get passed deep into the call stack where maybe it explodes or maybe the program just misbehaves. So if you're using a language like JavaScript or Python, then a JSON schema can be used to validate input right away. Think of it like enforcing a pre-condition.
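A minimal sketch of that precondition idea in Python, standard library only. The `SCHEMA` mapping and the `check_precondition` helper are hypothetical stand-ins for a real JSON Schema validator; they only check required keys and top-level types, which is far less than JSON Schema can express.

```python
import json

# Hypothetical, tiny stand-in for a JSON Schema: required key -> expected type.
SCHEMA = {"name": str, "port": int}


def check_precondition(raw: str) -> dict:
    """Parse raw JSON and fail fast if it does not match SCHEMA."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, expected in SCHEMA.items():
        if key not in data:
            raise ValueError(f"missing required key: {key}")
        value = data[key]
        # bool is a subclass of int in Python, so reject it unless bool is expected.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if is_stray_bool or not isinstance(value, expected):
            raise ValueError(f"{key} should be {expected.__name__}")
    return data
```

The point is only where the check sits: bad input is rejected at the boundary, before anything downstream touches a missing field.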

It's also useful in cases where JSON is being used for configuration files. At my company we have quite a few places where JSON files checked-in to a git repo are our source-of-truth which then get POST'ed to an API. We can enforce the schema of those files using pre-commit hooks so no one even wastes time opening a PR that will fail to POST to the API. The same JSON schema is also used by the API to ensure the POST'ed data is correct.

> First, it really depends what you're deserializing with. There is a lot of code out there that just does JSON.parse and then starts accessing the data and then you have an "undefined" get passed deep into the call stack where maybe it explodes or maybe the program just misbehaves.

I disagree, this example is just sloppy programming. Passing unvalidated data deep into a program is bad, I'm not arguing for that. What I'm saying is that you should be converting your unvalidated serialized data into a structured type right on the edge. Your data type/type system should __be__ your schema/validator.

> So if you're using a language like JavaScript or Python, then a JSON schema can be used to validate input right away. Think of it like enforcing a pre-condition.

This is what I do with python+pydantic:

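    import json

    from pydantic.dataclasses import dataclass  # validating dataclass, not the stdlib one

    @dataclass
    class Foo:
        bar: int

    json_buff = '{"bar": 1}'  # example input
    foo = Foo(**json.loads(json_buff))  # raises on invalid data

I'm not the biggest fan of pydantic here because you'll have to handle an exception for invalid data instead of an Option or Result in a better type system. But w/e.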
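For what it's worth, the exception can be wrapped at the edge to get something Option-like. A rough sketch, assuming a hypothetical `parse_foo` helper; it uses the standard-library `dataclasses` with a manual type check rather than pydantic, because a plain `dataclasses.dataclass` does not validate field types on its own.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Foo:
    bar: int


def parse_foo(json_buff: str) -> Optional[Foo]:
    """Return a Foo, or None if the input is not a valid Foo (hypothetical helper)."""
    try:
        data = json.loads(json_buff)
        foo = Foo(**data)
    except (ValueError, TypeError):
        # ValueError: malformed JSON.
        # TypeError: not a JSON object, or missing/unexpected keys.
        return None
    # Plain dataclasses do not check types, so do it by hand (bool is an int subclass).
    if not isinstance(foo.bar, int) or isinstance(foo.bar, bool):
        return None
    return foo
```

Callers then branch on `None` instead of catching an exception deep in the call stack.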

> It's also useful in cases where JSON is being used for configuration files. At my company we have quite a few places where JSON files checked-in to a git repo are our source-of-truth which then get POST'ed to an API. We can enforce the schema of those files using pre-commit hooks so no one even wastes time opening a PR that will fail to POST to the API. The same JSON schema is also used by the API to ensure the POST'ed data is correct.

You can easily do that with serdes and a type library as well.
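Something like the following, as a sketch only. The `Config` type, its fields, and the `check_file` and `main` helpers are made up for illustration; the assumption is a hook that passes the changed file paths as arguments and treats a non-zero exit status as failure.

```python
import json
import sys
from dataclasses import dataclass


@dataclass
class Config:  # hypothetical config type; the real fields would differ
    name: str
    port: int


def check_file(path: str) -> list[str]:
    """Return error messages for one JSON config file (empty list if it is valid)."""
    try:
        with open(path, encoding="utf-8") as fh:
            cfg = Config(**json.load(fh))
    except (OSError, ValueError, TypeError) as exc:
        # Unreadable file, malformed JSON, or wrong/missing keys.
        return [f"{path}: {exc}"]
    errors = []
    if not isinstance(cfg.name, str):
        errors.append(f"{path}: name must be a string")
    if not isinstance(cfg.port, int) or isinstance(cfg.port, bool):
        errors.append(f"{path}: port must be an integer")
    return errors


def main(paths: list[str]) -> int:
    """Exit status for a hook: 0 if every file is valid, 1 otherwise."""
    errors = [msg for path in paths for msg in check_file(path)]
    for msg in errors:
        print(msg, file=sys.stderr)
    return 1 if errors else 0
```

A hook script would end with `sys.exit(main(sys.argv[1:]))`, and the API side could reuse `Config` so both ends enforce the same shape.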

---

I guess schemas may be useful for crossing language boundaries, but you're going to need language specific types/objects at some point so why use schemas directly even then? (I think gRPC may have code gen tools for this purpose).