by kortex 1400 days ago
Use pydantic for serde/IO validation, and mypy --strict. And minimize cheating, e.g. Any. I think you'll find it extremely stable and a totally different beast from the usual Python.

I wrote typedload before pydantic was a thing. It's faster and actually works well with mypy and python's type system.

Pydantic requires inheritance and requires a modified mypy because it uses types its own way rather than the python way.

Oh, typedload is faster than pydantic, despite being pure python instead of an .so file.

I would be very interested in seeing some benchmarks!

I'm pretty sure that serde, the rust library that Pydantic uses, is faster than the JSON parser written in C in the python standard lib, so I'd be very surprised if your pure python validation library beat pydantic.

https://ltworf.github.io/typedload/performance.html

At the bottom there are instructions on how to run the benchmark locally.

I read more about your repo, and it looks like typedload only does Python lists/dicts -> typed Python objects (like named tuples, data classes, attrs classes, etc.). I.e., the benchmarks don't include the time required to parse the JSON.
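To make the two stages concrete, here is a minimal stdlib-only sketch. `load_flat` is a hypothetical stand-in for the "dicts -> typed objects" step; it is not typedload's API or implementation, and it only handles flat dataclasses whose fields are plain classes like `str` or `int`.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class User:
    name: str
    age: int


def load_flat(data: dict, cls):
    """Hypothetical helper: build a flat dataclass from a dict, checking field types."""
    kwargs = {}
    for f in fields(cls):
        if f.name not in data:
            raise ValueError(f"missing field: {f.name}")
        value = data[f.name]
        # f.type is the annotated class here (no postponed annotations in this file)
        if not isinstance(value, f.type):
            raise TypeError(f"{f.name}: expected {f.type.__name__}")
        kwargs[f.name] = value
    return cls(**kwargs)


# Stage 1: JSON text -> plain dicts/lists (the part the benchmarks leave out)
raw = json.loads('{"name": "kortex", "age": 30}')

# Stage 2: plain dicts/lists -> typed object (the part being benchmarked)
user = load_flat(raw, User)
```

A benchmark that times only stage 2 says nothing about stage 1, which is why the JSON parsing cost has to be measured separately.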

So I modified the benchmarking code to include loading JSON[0], and your library still came out on top!

But, it turns out I was wrong about pydantic using serde under the hood. Pydantic version 2 will. And the maintainer aims for it to be about 10x faster [1] than version 1.

Nonetheless, this was definitely a surprise for me, and if I ever go back to using Python, I'll definitely check your library out!

I'm curious, why do you think pydantic took off and your library didn't? It looks like your library is both faster and easier to use to me.

[0] https://github.com/davidatsurge/typedload/pull/1/files?diff=...

[1] https://github.com/pydantic/pydantic-core#pydantic-core

But you could use whatever module to load json and then use typedload.

Having them integrated could be advantageous to save memory and avoid loading the full json first. Like loading the objects directly to their final destination as the json gets parsed. But that would be kinda complicated.
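A rough stdlib approximation of that idea, as a sketch only: `json.loads` accepts an `object_hook` that is called for each JSON object as it is decoded, so typed objects can be built during parsing. The `Point` type and `as_point` hook are made up for illustration. Note that the decoder still creates a temporary dict per object and needs the whole document in memory, so this is not the truly integrated loader described above.

```python
import json
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


def as_point(obj: dict):
    # Called by the decoder for every JSON object, innermost objects first.
    if set(obj) == {"x", "y"}:
        return Point(**obj)
    return obj  # leave anything that doesn't look like a Point untouched


points = json.loads('[{"x": 1, "y": 2}, {"x": 3, "y": 4}]', object_hook=as_point)
```

No type validation happens here; the hook only picks the target type by key names, which is part of why a real integrated loader would be more complicated.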

I have absolutely no idea why mine isn't used as much, but my model of GPL license + pay me to get an LGPL license probably doesn't help. But that's not a factor if it's used for internal stuff.

apischema is my second favourite one and it also doesn't have many users. However it came later so that's an easier explanation.

I recently tried a similar library called jsons. I wanted to benchmark it but it was too buggy to do a meaningful comparison. Of course it has 8x more downloads :D I guess the users either have very basic use cases or prefer working around the issues rather than trying a different library.

Finding a decent library is not easy. I'm trying to convince my coworkers to just drop and rewrite a bad golang library they are using and working around a lot.

Ahh. It would be the license if I were to make the decision. Only because, if I were using Python, performance must not be a big concern, so I would go for an alternative (albeit slower) library (of which there is no shortage) instead of paying for the non-GPL license.

I think something else that might help is making the licensing message more prominent (if it's visible at all right now; I can't find it): at the least, "contact me to arrange a different license", or at best, a pricing model.