by ogrisel 403 days ago
You cannot share arbitrarily structured objects in the `ShareableList`, only atomic scalars and bytes / strings.
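
A minimal sketch of that restriction, assuming CPython's `multiprocessing.shared_memory.ShareableList`. The helper name `can_share` is hypothetical, and the exact exception raised for unsupported types is treated as an implementation detail, so it is caught broadly:

```python
from multiprocessing.shared_memory import ShareableList


def can_share(value):
    """Return True if a one-item ShareableList can hold `value` (hypothetical helper)."""
    try:
        sl = ShareableList([value])
    except Exception:
        # Unsupported item types are rejected at construction time; the exact
        # exception class is an implementation detail, so catch broadly here.
        return False
    # Release the shared memory block that was just created.
    sl.shm.close()
    sl.shm.unlink()
    return True


if __name__ == "__main__":
    for value in (42, 3.14, True, None, "text", b"raw", [1, 2], {"a": 1}):
        print(type(value).__name__, can_share(value))
```

Scalars, strings and bytes should be accepted, while the nested list and the dict should be refused.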

If you want to share structured Python objects between instances, you have to pay the cost of `pickle.dump/pickle.load` (CPU overhead for interprocess communication) + the memory cost of replicated objects in the processes.
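
To make the two costs concrete, here is a small sketch that runs the round trip inside a single process. The helper name `round_trip` and the sample `state` dict are made up for illustration; in a real setup the bytes would travel over a pipe or queue between processes:

```python
import pickle


def round_trip(obj):
    """Serialize and deserialize `obj`, as a pipe or queue between processes would."""
    payload = pickle.dumps(obj)      # CPU cost on the sending side
    replica = pickle.loads(payload)  # CPU cost again on the receiving side
    return payload, replica


state = {"weights": [0.1, 0.2, 0.3], "meta": {"epoch": 1}}
payload, replica = round_trip(state)

# The replica is an independent copy, so it takes its own memory and
# a change made to it is invisible to the holder of the original.
replica["meta"]["epoch"] = 2
```

After the last line, `state["meta"]["epoch"]` is still 1: nothing is shared, only copied.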

3 comments

We need a dataclass-like interface on top of a ShareableList.
Actually, ShareableList feels like a tuple really (as it’s impossible to change its length). If we could mix ShareableList and collections.namedtuple together, it would get us 90% there (99.9% if we use typing.NamedTuple). Unfortunately, I can’t decipher either one [1, 2] at first glance – maybe if I get some more sleep?

[1]: https://github.com/python/cpython/blob/3.13/Lib/collections/...

[2]: https://github.com/python/cpython/blob/3.13/Lib/typing.py#L2...
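
A rough sketch of what such a mix could look like. It does not reuse the `namedtuple` or `NamedTuple` machinery; it just generates properties that index into a `ShareableList`. The factory name `shared_record`, the `shl` attribute and the `Point` example are all hypothetical:

```python
from multiprocessing.shared_memory import ShareableList


def shared_record(typename, field_names):
    """Create a class whose named fields live in a ShareableList (hypothetical helper)."""
    fields = tuple(field_names)

    def field_property(i):
        def getter(self):
            return self.shl[i]

        def setter(self, value):
            self.shl[i] = value

        return property(getter, setter)

    def __init__(self, *values, name=None):
        if name is not None:
            # Attach to a block that another process already created.
            self.shl = ShareableList(name=name)
        elif len(values) != len(fields):
            raise TypeError(f"expected {len(fields)} values, got {len(values)}")
        else:
            self.shl = ShareableList(values)

    def __repr__(self):
        body = ", ".join(f"{f}={v!r}" for f, v in zip(fields, self.shl))
        return f"{typename}({body})"

    namespace = {"__init__": __init__, "__repr__": __repr__, "_fields": fields}
    namespace.update((f, field_property(i)) for i, f in enumerate(fields))
    return type(typename, (), namespace)


if __name__ == "__main__":
    Point = shared_record("Point", ["x", "y"])
    owner = Point(1, 2.5)
    view = Point(name=owner.shl.shm.name)  # what a second process would do
    view.x = 10                            # written in place, the record is never pickled
    print(owner)
    view.shl.shm.close()
    owner.shl.shm.close()
    owner.shl.shm.unlink()
```

The fields are still limited to the scalar, string and bytes types that `ShareableList` accepts, and string fields cannot grow beyond the space allocated when the record was created.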

I can fit a lot of json into bytes/strings though?
If all your state is already json-serializable, yeah. But that's just as expensive as copying if not more, hence what cjbgkagh said about flatbuffers.
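
A small sketch of why the JSON route is still a copy, assuming the JSON text is placed in a single `ShareableList` slot. The helper names `publish_json` and `read_json` are made up for illustration:

```python
import json
from multiprocessing.shared_memory import ShareableList


def publish_json(state):
    """Serialize `state` to JSON and place the text in a one-slot ShareableList."""
    return ShareableList([json.dumps(state)])


def read_json(name):
    """Attach by name and parse the JSON text, which builds a fresh private copy."""
    view = ShareableList(name=name)
    try:
        return json.loads(view[0])
    finally:
        view.shm.close()


if __name__ == "__main__":
    shared = publish_json({"n": 1, "tags": ["a", "b"]})
    copy = read_json(shared.shm.name)
    copy["n"] = 2                      # only the reader's private copy changes
    print(json.loads(shared[0])["n"])  # the shared text is untouched
    shared.shm.close()
    shared.shm.unlink()
```

Only the encoded text lives in shared memory. Every reader pays for a full parse and ends up with its own object graph, and a writer has to re-encode the whole document to publish a change.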
oh nvm, that doesn't solve this either
Perhaps flatbuffers would be better?
I love learning from folks on HN -- thanks! Will check it out.
Take a look at https://capnproto.org/ as well, while at it.

Neither solves the copying problem, though.

Ah, I forgot capnproto doesn't let you edit a serialized proto in-memory, it's read-only. In theory this should be possible as long as you're not changing the length of anything, but I'm not surprised such trickery is unsupported.

So this doesn't seem like a versatile solution for sharing data structs between two Python processes. You're gonna have to reserialize the whole thing if one side wants to edit, which is basically copying.
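
For what the "not changing the length of anything" idea looks like in practice, here is a sketch using the standard library's `struct` module on a plain `bytearray`. This is not Cap'n Proto's wire format; the record layout (`id`, `score`, 16-byte `name`) is invented for the example:

```python
import struct

# Hypothetical fixed-width layout: int64 id, float64 score, 16-byte name field.
RECORD = struct.Struct("<qd16s")
SCORE_OFFSET = 8  # the score starts right after the 8-byte id

buf = bytearray(RECORD.size)
RECORD.pack_into(buf, 0, 7, 0.5, b"alice")

# Overwrite only the score; the rest of the buffer is not reserialized.
struct.pack_into("<d", buf, SCORE_OFFSET, 0.75)

record_id, score, raw_name = RECORD.unpack_from(buf, 0)
name = raw_name.rstrip(b"\x00")  # short names are padded with null bytes
```

The same `pack_into` and `unpack_from` calls should work on any writable buffer, including the `buf` of a `multiprocessing.shared_memory.SharedMemory` block. The trade-off is the one described above: it only works while every field keeps its fixed width.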

let me introduce you to quickle.
What’s the point? The whole idea is to share objects, not to serialize them, whether it’s json, pickle, or whatever.
I mean, the answer to this is pretty straightforward -- because we can, not because we should :)
That’s even worse than pickle.
pickle pickles to pickle binary, yeah? So you can stream that too with an io buffer :D
So don’t do that? Send data to workers as primitives, and have a separate process that reads the results and serializes them into whatever form you want.