Are the benchmarks actually fair? See: https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

It seems that if the object being serialized is not a "Fory" struct, it is forced through a to/from conversion as part of the measured serialization work: https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

That to/from work includes cloning Strings: https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

and reallocating growing arrays with collect: https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

I'd think the conversion to/from Fory types shouldn't be part of the tests. Also, when used in an actual system, tonic would be providing an 8KB buffer to write into, not just a Vec::default() that may need to be resized multiple times: https://github.com/hyperium/tonic/blob/147c94cd661c0015af2e5...
I can see the source of a 10x improvement on an Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz, but it drops to a 3x improvement when I remove the to/from conversion that clones or collects Vecs and always allocate an 8K Vec instead of a ::Default for the writable buffer.
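To make the buffer point concrete, here is a minimal std-only sketch (not taken from the Fory or tonic code) that counts how often a Vec has to regrow while a payload is appended in small writes. The function name `count_regrowths` and the 4096-byte payload and 64-byte chunk sizes are my own assumptions for illustration.

```rust
/// Append `total` bytes to `buf` in `chunk`-sized writes and count how many
/// times the Vec's capacity changed (each change implies a reallocation).
/// Hypothetical helper, for illustration only.
fn count_regrowths(mut buf: Vec<u8>, total: usize, chunk: usize) -> usize {
    assert!(chunk > 0, "chunk size must be non-zero");
    let block = vec![0u8; chunk];
    let mut regrowths = 0;
    let mut written = 0;
    while written < total {
        let n = chunk.min(total - written);
        let before = buf.capacity();
        buf.extend_from_slice(&block[..n]);
        if buf.capacity() != before {
            regrowths += 1;
        }
        written += n;
    }
    regrowths
}

fn main() {
    // Starts with zero capacity, so it must grow at least once.
    let cold = count_regrowths(Vec::default(), 4096, 64);
    // Preallocated 8 KiB, so a 4 KiB payload never forces a regrowth.
    let warm = count_regrowths(Vec::with_capacity(8 * 1024), 4096, 64);
    println!("Vec::default(): {cold} regrowths, 8 KiB preallocated: {warm}");
}
```

The exact number of regrowths for the default Vec depends on the standard library's growth strategy, so I would only rely on "more than zero" versus "zero" here.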
If anything, the benches should be updated in a tower service / codec generics style where other formats like protobuf do not use any Fory-related code at all.
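A rough sketch of what I mean by a codec generics style, using only std. The `BenchCodec` trait, the `LengthPrefixed` stand-in format, and `encode_all` are all hypothetical names I made up; they are not tonic's, tower's, or Fory's actual API. The idea is just that each format implements the trait for its own native message type, and the measured function takes prebuilt messages and a reused buffer.

```rust
/// Hypothetical codec abstraction: each format implements it for its own
/// native message type, so no format depends on another format's types.
trait BenchCodec {
    type Message;
    fn encode(&self, msg: &Self::Message, buf: &mut Vec<u8>);
}

/// Stand-in format for illustration only: a length-prefixed UTF-8 string.
struct LengthPrefixed;

impl BenchCodec for LengthPrefixed {
    type Message = String;

    fn encode(&self, msg: &String, buf: &mut Vec<u8>) {
        // 4-byte little-endian length, then the raw bytes.
        buf.extend_from_slice(&(msg.len() as u32).to_le_bytes());
        buf.extend_from_slice(msg.as_bytes());
    }
}

/// The only work a timer should wrap: encode prebuilt messages into a
/// caller-provided buffer. Returns the number of bytes written.
fn encode_all<C: BenchCodec>(codec: &C, msgs: &[C::Message], buf: &mut Vec<u8>) -> usize {
    buf.clear(); // keeps the existing allocation
    for msg in msgs {
        codec.encode(msg, buf);
    }
    buf.len()
}

fn main() {
    // Setup (building native messages, allocating the buffer) stays
    // outside the measured call.
    let msgs: Vec<String> = vec!["alpha".to_string(), "beta".to_string()];
    let mut buf = Vec::with_capacity(8 * 1024);
    let written = encode_all(&LengthPrefixed, &msgs, &mut buf);
    println!("{written} bytes encoded");
}
```

With this shape, a Prost codec would take Prost-generated types and a Fory codec would take Fory structs, and any conversion between them happens in setup rather than inside the timed loop.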
Note also that Fory has a writer pool that is used during the tests:
https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...
Original bench selection for Fory:
Compared to the original bench for Protobuf/Prost:

However, after allocating 8K instead of ::Default and removing the to/from conversion, the updated Protobuf bench: