by vendakka 4155 days ago
Transport Layer

I'd use HTTP over SSL on port 443. Ports 443 and 80 are whitelisted by many corporate firewalls, and I've run into problems in the past caused by blocked ports. In addition, a large amount of existing software already understands HTTP and handles things such as load balancing and caching.

Encoding

I would recommend starting with JSON. It is extremely flexible and, coupled with the compression usually built into HTTP systems, it is also compact on the wire. I've used Protobufs (at Google) and Thrift (at my previous job), but found that the primary use case ended up being code generation and serialization using their respective description languages. This might seem absurd, but in my experience, since neither lets you embed the schema in the message, their main advantage was reduced to a very compact representation with some automatic schema evolution capabilities thrown in.

JSON might lead to some additional initial typing but performs very well in most languages. Stored JSON messages can also be very compact if you store the compressed form. In addition, most JSON libraries let you stay backwards compatible when decoding messages (e.g., when new fields are added).
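
As a rough illustration of the backwards-compatibility point, here is a minimal Python sketch using only the standard `json` module. The message shapes, the field names (`id`, `body`, `priority`, `tags`) and the `decode_message` helper are all made up for the example; the idea is just that a reader supplies defaults for fields that may be missing and ignores fields it does not know.

```python
import json

# Hypothetical message written by an older client (no "priority" field).
OLD_MESSAGE = '{"id": 7, "body": "hello"}'
# Hypothetical message from a newer client that added "priority" and "tags".
NEW_MESSAGE = '{"id": 8, "body": "hi", "priority": 2, "tags": ["a"]}'


def decode_message(raw: str) -> dict:
    """Decode a message, tolerating both missing and unknown fields."""
    data = json.loads(raw)
    return {
        "id": data["id"],                     # required in every version
        "body": data["body"],                 # required in every version
        "priority": data.get("priority", 0),  # added later; default when absent
        # Fields this reader does not know about (e.g. "tags") are ignored.
    }


print(decode_message(OLD_MESSAGE))
print(decode_message(NEW_MESSAGE))
```

The same reader handles both message versions, which is usually all the "schema evolution" a small system needs early on.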

When compressing, gzip can be expensive in terms of CPU cycles, so consider using Snappy[1] from Google.
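
To make the size versus CPU trade-off concrete, here is a small Python sketch. Snappy is not in the Python standard library (it needs a third-party binding), so this example stands in with `zlib`, the DEFLATE implementation behind gzip, at its cheapest and most expensive levels. The sample messages are invented for the example, and actual ratios depend heavily on your data.

```python
import json
import zlib

# Hypothetical batch of repetitive messages, typical of logged JSON.
messages = [
    {"id": i, "type": "event", "body": "user logged in"} for i in range(1000)
]
raw = json.dumps(messages).encode("utf-8")

fast = zlib.compress(raw, 1)   # cheapest CPU setting
small = zlib.compress(raw, 9)  # most CPU, usually the smallest output

# Sizes in bytes: uncompressed, level 1, level 9.
print(len(raw), len(fast), len(small))

# Either compressed form decompresses back to the identical JSON.
restored = json.loads(zlib.decompress(fast).decode("utf-8"))
```

If even the lowest zlib level costs too much CPU for your workload, that is the situation where a faster codec such as Snappy is worth trying.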

I've also heard both good and bad things about Avro, but I haven't used it personally.

Most people claim JSON is schema-less, but this misses the point a little. The JSON schema is the code you write detailing the structs/objects that the JSON objects map to. The lack of a dedicated language for laying out the schema does not mean you cannot have one. There just isn't a standard, so use the library that works best for your language.
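
One way to read "the schema is the code" is sketched below in Python, assuming a hypothetical `User` record and a made-up `user_from_json` helper. The dataclass declares the fields, and the checks in `__post_init__` enforce the types that `json.loads` alone would not.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class User:
    """Hypothetical record; this class acts as the schema for the JSON object."""
    name: str
    age: int
    email: str = ""  # optional field with a default

    def __post_init__(self) -> None:
        # json.loads does no type checking, so enforce the declared types here.
        if not isinstance(self.name, str):
            raise TypeError("name must be a string")
        if not isinstance(self.age, int):
            raise TypeError("age must be an integer")
        if not isinstance(self.email, str):
            raise TypeError("email must be a string")


def user_from_json(raw: str) -> User:
    """Map a JSON object onto User, ignoring keys the class does not declare."""
    data = json.loads(raw)
    known = {f.name for f in fields(User)}
    return User(**{k: v for k, v in data.items() if k in known})


user = user_from_json('{"name": "Ada", "age": 36, "nickname": "countess"}')
print(user)
```

A missing required field or a wrongly typed value raises `TypeError`, so malformed messages fail at the decoding boundary rather than deep inside the application.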

[1] https://code.google.com/p/snappy/

1 comment

right, forgot about Avro.

s/“^/or avro”/