|
To be fair, xml-rpc truly is terrible. To start with, XML is extremely verbose and you add a lot of network overhead for that, plus a lot of parsing overhead, which makes rpcs significantly more expensive than they should be. Use https://capnproto.org/ instead. (Based on protobuff, which is an open sourced version of what Google uses internally.) As far as I'm concerned, the only valid reason to use xml-rpc is because you're interfacing with someone else's system and they have chosen to use it. Moving on, here is an excellent example of an important architecture system that almost everyone gets wrong. In any distributed system you should transparently support having rpc calls carry an optional tracing_id that causes them to be traced. Which means that you log the details of the call with a tracing id, and cause all rpcs that get emitted from that one to carry the same flag. You then have a small fraction of your starting requests set that flag, and collect the logs up afterwards in another system so that you can, live, see everything that happened in a traced rpcs. To make this easy for the programmer, you build it in to the rpc library so that programmers don't even have to think about it. You then flag a small random small fraction of rpcs at the source for tracing. This minimizes the overhead of the system. But now when there is a production problem that affects only a small percentage of RPCs you just look to see if you have a recent traced RPC that shows the issue, look at the trace, and problems 3 layers deep are instantly findable. Very few distributed systems do this. But those that do quickly discover that it is a critical piece of functionality. This is part of the secret sauce that lets Google debug production problems at scale. But basically nobody else has gotten the memo, and no standard library will do this. Now I don't know why they reinvented xml-rpc for themselves. But if they had that specific feature in it, I am going to say that it wasn't a ridiculous thing to do. And the reason why not becomes obvious the first time you try to debug an intermittent problem in your service that happens because in some other service a few calls away there is an issue that happens 1% of the time based on some detail of the requests that your service is making. |