by stock_toaster 5148 days ago
Consider, for example, a message with a double in it. A double in msgpack is 9 bytes (according to their spec[1]). In json it is just however many bytes it takes to represent the "number" in ascii. So if the number happens to be a double with a short decimal form, such as 3.0, it may take just 3 bytes to store the double in json (depending on the encoder?), as opposed to 9 bytes for msgpack. Something similar could be said for large integer types too: '3' is only one byte in json, but would be 5 bytes in msgpack when encoded as an int32.

That something similar is happening to the messages in the talk is the only explanation I could think of, anyway. Looking at the thrift description, it does appear that there are int32s and doubles in the messages.

[1]: http://wiki.msgpack.org/display/MSGPACK/Format+specification...
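
To make the byte counts above concrete, here is a minimal sketch that builds the two msgpack encodings by hand with Python's `struct` module rather than a msgpack library. The helper names `msgpack_float64` and `msgpack_int32` are made up for illustration; the marker bytes (0xcb for float 64, 0xd2 for int 32) are taken from the msgpack format, and the json side shows what Python's own encoder emits, which other encoders may not match.

```python
import json
import struct


def msgpack_float64(x: float) -> bytes:
    # Hypothetical helper: marker byte 0xcb, then an 8-byte big-endian IEEE 754 double.
    return b"\xcb" + struct.pack(">d", x)


def msgpack_int32(n: int) -> bytes:
    # Hypothetical helper: marker byte 0xd2, then a 4-byte big-endian signed integer.
    return b"\xd2" + struct.pack(">i", n)


# Compare encoded sizes for the two values discussed in the comment.
print("double 3.0:", len(msgpack_float64(3.0)), "bytes msgpack vs",
      len(json.dumps(3.0)), "bytes json", repr(json.dumps(3.0)))
print("int32 3:   ", len(msgpack_int32(3)), "bytes msgpack vs",
      len(json.dumps(3)), "bytes json", repr(json.dumps(3)))
```

This forces the int32 type on purpose, to mirror the schema-driven case described in the comment; it is not what a size-minimizing encoder would pick for the value 3.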

2 comments

One thing that is important to point out is that JavaScript (and thereby JSON) does not separate integers from floating-point numbers; so if you have the double 3.0 it will be represented in JSON as "3", not "3.0". If MessagePack does not have a mechanism for dropping down to int32 for integral doubles, then you will actually see a 9:1 difference for this specific case (although if you are often storing integers in your doubles maybe you should be using int32 anyway, in which case 3.1 may have been a better general example than 3.0 ;P).
I think that depends on the library you are using. 3.0 (not truncated to simply 3) is certainly a valid json number. However, point taken that it is a meaningful distinction for javascript.
Oh, it is certainly valid; it is just that for purposes of comparing relative space you should be looking at the shortest representation. "3.0000000" is also valid, and is the same 9 bytes as MessagePack, but there is no reason you'd use that to encode this specific number ;P.
True.
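
A small sketch of the point in this sub-thread: several JSON spellings of the same number are all valid and compare equal once parsed, so a size comparison should use the shortest one. The list name `spellings` is just for illustration, and the parsing behavior shown is that of Python's `json` module (which returns an int for "3" and a float for the others).

```python
import json

# Three valid JSON texts for the same numeric value.
spellings = ["3", "3.0", "3.0000000"]

for text in spellings:
    value = json.loads(text)
    # Print the text, its size in bytes, and whether it equals the double 3.0.
    print(repr(text), len(text.encode("ascii")), "bytes, equals 3.0:", value == 3.0)
```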

The msgpack serializer was probably used in a suboptimal way. The integer 3 can be encoded by msgpack in one byte as well. I never use doubles, which is probably why I didn't think of this case.

Integers actually always take no more bytes in msgpack than in their JSON ascii representation.
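
A rough way to check that claim, assuming the encoder always picks the smallest msgpack integer type that fits (fixint, then 8/16/32/64-bit). The function `msgpack_int_size` is a hypothetical helper written for this sketch, not a call from any msgpack library, and the sample values are just the boundaries between integer types.

```python
def msgpack_int_size(n: int) -> int:
    """Bytes used by the smallest msgpack integer encoding that can hold n."""
    if -32 <= n <= 127:
        return 1  # positive or negative fixint: the value lives in the marker byte
    if n >= 0:
        # uint 8 / 16 / 32 / 64: one marker byte plus the payload
        for size, limit in ((2, 2**8), (3, 2**16), (5, 2**32), (9, 2**64)):
            if n < limit:
                return size
    else:
        # int 8 / 16 / 32 / 64: one marker byte plus the payload
        for size, limit in ((2, -2**7), (3, -2**15), (5, -2**31), (9, -2**63)):
            if n >= limit:
                return size
    raise ValueError("integer does not fit in 64 bits")


# Boundary values where the msgpack type (and so the size) changes.
samples = [0, 3, 127, 128, 255, 256, 65535, 65536, 2**32 - 1, 2**32, 2**64 - 1,
           -1, -32, -33, -128, -129, -32768, -32769, -2**31, -2**31 - 1, -2**63]

for n in samples:
    # str(n) is the same text a JSON encoder would write for an integer.
    print(n, msgpack_int_size(n), "bytes msgpack,", len(str(n)), "bytes json")
```

The two sizes tie at a few boundaries (for example 256 and 65536), which is why the claim is "no more than" rather than "fewer than".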

Anyway - benchmarking is hard.