by BlueZeniX 5143 days ago
I seriously wonder how it's possible that in his tests JSON got smaller than MsgPack.

MsgPack is very close to JSON in structure but far more compact (a 1-byte type header, and for small numbers the type and payload are combined in a single byte), so it doesn't make any sense.


Consider, for example, a message with a double in it. A double in msgpack is 9 bytes (according to their spec[1]). In JSON it is just however many bytes it takes to represent the number in ASCII. So if the double happens to have a short decimal form, such as 3.0, it may take just 3 bytes to store in JSON (depending on the encoder?), as opposed to 9 bytes for msgpack. Something similar could be said for integers stored in wide types too: '3' is only one byte in JSON, but would be 5 bytes in msgpack when encoded as an int32.
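To make those byte counts concrete, here is a rough sketch that builds the two msgpack encodings by hand and compares them with what Python's json module emits. The helper names are made up for illustration, and the type bytes (0xcb for float 64, 0xd2 for int 32) are taken from the msgpack format spec; other JSON encoders may print 3.0 differently.

```python
import json
import struct

def msgpack_float64(value: float) -> bytes:
    # Hypothetical helper: 0xcb type byte + 8-byte big-endian IEEE 754 double.
    return b"\xcb" + struct.pack(">d", value)

def msgpack_int32(value: int) -> bytes:
    # Hypothetical helper: 0xd2 type byte + 4-byte big-endian signed integer.
    return b"\xd2" + struct.pack(">i", value)

# Python's json module writes the double 3.0 as "3.0" and the int 3 as "3".
double_sizes = (len(msgpack_float64(3.0)), len(json.dumps(3.0)))
int_sizes = (len(msgpack_int32(3)), len(json.dumps(3)))

print("double 3.0: msgpack %d bytes, JSON %d bytes" % double_sizes)
print("int32 3:    msgpack %d bytes, JSON %d bytes" % int_sizes)
```

This only covers the case where the serializer is forced to use the fixed-width types, which is what a schema with int32 and double fields might lead to.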

The only explanation I could think of, anyway, is that something similar is happening to the messages in the talk. Looking at the thrift description, it does appear that there are int32s and doubles in the messages.

[1]: http://wiki.msgpack.org/display/MSGPACK/Format+specification...

One thing that is important to point out is that JavaScript (and thereby JSON) does not separate integers from floating-point numbers, so if you have the double 3.0 it will be represented in JSON as "3", not "3.0". If MessagePack does not have a mechanism for dropping down to an integer type for integral doubles, then you will actually see a 9:1 difference for this specific case (although if you are often storing integers in your doubles maybe you should be using int32 anyway, in which case 3.1 may have been a better general example than 3.0 ;P).
I think that depends on the library you are using. 3.0 (not truncated to simply 3) is certainly a valid json number. However, point taken that it is a meaningful distinction for javascript.
Oh, it is certainly valid; it is just that for purposes of comparing relative space you should be looking at the shortest representation. "3.0000000" is also valid, and is the same 9 bytes as MessagePack, but there is no reason you'd use that to encode this specific number ;P.
True.
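
For what it's worth, the shortest-representation point can be sketched like this. Both functions are hypothetical: js_style_number only imitates how JavaScript prints integral doubles (it is not a port of JSON.stringify), and pack_number shows what an optional "drop integral doubles to an integer" step might look like for the one-byte positive fixint range only.

```python
import struct

def js_style_number(value: float) -> str:
    # Assumption: imitate JavaScript, which prints the double 3.0 as "3".
    # Only integral values below 2**53 are shortened; everything else uses repr().
    if value.is_integer() and abs(value) < 2**53:
        return str(int(value))
    return repr(value)

def pack_number(value: float, downcast: bool) -> bytes:
    # With downcast enabled, integral doubles in 0..127 become a one-byte
    # msgpack positive fixint; otherwise emit a 9-byte float 64 (0xcb + 8 bytes).
    if downcast and value.is_integer() and 0 <= value <= 127:
        return bytes([int(value)])
    return b"\xcb" + struct.pack(">d", value)

for number in (3.0, 3.1):
    print(
        number,
        "JSON:", len(js_style_number(number)),
        "msgpack:", len(pack_number(number, downcast=False)),
        "msgpack with downcast:", len(pack_number(number, downcast=True)),
    )
```

Without the downcast, 3.0 is 9 bytes against 1; with it, both are 1 byte. For 3.1 the downcast cannot help, so it stays 9 bytes against 3.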

The msgpack serializer was probably used in a suboptimal way. The integer 3 can be encoded by msgpack in one byte as well. I never use doubles which is probably why I didn't think of this case.

Integers actually never take more bytes in msgpack than in their JSON ASCII representation, as long as the encoder picks the smallest integer type that fits.
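
A quick way to check the integer claim is to compute the smallest msgpack integer encoding for a value and compare it with the length of the decimal string. This is a sketch under the assumption that the encoder always picks the smallest type; msgpack_int_size is a made-up helper whose size classes follow the msgpack format spec (fixint, 8, 16, 32 and 64 bit).

```python
import json

def msgpack_int_size(n: int) -> int:
    # Bytes used by the smallest msgpack integer encoding that can hold n.
    if -32 <= n <= 127:
        return 1  # positive or negative fixint
    if -2**7 <= n < 2**8:
        return 2  # int 8 / uint 8
    if -2**15 <= n < 2**16:
        return 3  # int 16 / uint 16
    if -2**31 <= n < 2**32:
        return 5  # int 32 / uint 32
    if -2**63 <= n < 2**64:
        return 9  # int 64 / uint 64
    raise ValueError("integer does not fit in a msgpack integer type")

def json_int_size(n: int) -> int:
    # Length of the decimal ASCII form, including a leading minus sign.
    return len(json.dumps(n))

# Values on either side of every size boundary, plus a few small ones.
boundaries = [0, 9, 10, 127, 128, 255, 256, 65535, 65536,
              2**32 - 1, 2**32, 2**64 - 1,
              -1, -32, -33, -128, -129, -2**15, -2**15 - 1,
              -2**31, -2**31 - 1, -2**63]

for n in boundaries:
    print(n, "msgpack:", msgpack_int_size(n), "JSON:", json_int_size(n))
```

The sizes tie for single digits and at 256 and 65536; everywhere else in this list msgpack is smaller.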

Anyway - benchmarking is hard.