|
|
|
|
|
by autocracy
3393 days ago
|
|
A side note: at least in python, I benchmarked MsgPack as 3 times faster than JSON, and a whopping 850 times faster to read than YAML. It seems unlikely that people will ever be editing this file by hand, so for unstructured data, especially where playlists can get very large, I suggest MsgPack over YAML. BUT... you have a defined schema, so it's probably to your advantage to use a storage format with a defined schema: ProtoBuf or Thrift. That would mean somebody trying to use your code would already have generated objects in their language. As for the hashing algorithm, this is a good use for MD5 -- cheap and fast. You're unlikely to be concerned about somebody actively trying to generate the same checksum for two music files. For non-security (integrity verification) purposes, MD5 is still very appropriate. |
|
About MD5, I was worried about a case where a service that serves user-submitted files would be exploited by MD5 collisions, leading users to open files that might exploit decoder bugs to execute code. Far-fetched, I know, but the tradeoff didn't seem worth it. I'm not married to that decision, though.