|
The big secret is that sidecars can only help so much. If you want distributed tracing, the service mesh can't propagate traces into your application (so if service A calls service B which calls service C, you'll never see that end to end with a mesh of sidecars). mTLS is similar; it's great to encrypt your internal traffic on the wire, but that needs to get propagated up to the application to make internal authorization decisions. (I suppose in some sense I like to make sure that "kubectl port-forward" doesn't have magical enhanced privileges, which it does if your app is oblivious to the mTLS going on in the background. You could disable that specifically in your k8s setup, but generally security through remembering to disable default features seems like a losing battle to me. Easier to have the app say "yeah you need a key". Just make sure you build the feature to let oncall get a key, or they will be very sad.) For that reason, I really do think that this is a temporary hack while client libraries are brought up to speed in popular languages. It is really easy to sell stuff with "just add another component to your house of cards to get feature X", but eventually it's all too much and you'll have to just edit your code. I personally don't use service meshes. I have played with Istio but the code is legitimately awful, so the anecdotes of "I've never seen it work" make perfect sense to me. I have, in fact, never seen it work. (Read the xDS spec, then read Istio's implementation. Errors? Just throw them away! That's the core goal of the project, it seems. I wrote my own xDS implementation that ... handles errors and NACKs correctly. Wow, such an engineering marvel and so difficult...) I do stick Envoy in front of things when it seems appropriate. For example, I'll put Envoy in front of a split frontend/backend application to provide one endpoint that serves both the frontend and the backend. 
That way production is identical to your local development environment, avoiding surprises at the worst possible time. I also put it in front of applications that I don't feel like editing and rebuilding to get metrics and traces. The one feature that I've been missing from service meshes, Kubernetes networking plugins, etc. is the ability to make all traffic leave the cluster through a single set of services that can see the cleartext of TLS transactions. (I looked at Istio specifically, because it does have EgressGateways, but they're implemented at the TCP level and not the HTTP level. So you don't see outgoing URLs, just outgoing IP addresses. And if someone is exfiltrating data, you can't log that.) My biggest concern with running things in production is not so much internal security, though that is a big concern, but rather "is my cluster abusing someone else". That's the sort of thing that gets your cloud account shut down without appeal, and I feel like I don't have good tooling to stop that right now. |
Why not? AFAIK traces are sent from the instrumented app to some tracing backend, and a trace-id is carried over via an HTTP header from the entry point of the request until the last service that takes part in that request. Why would a sidecar/mesh break this?
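
For what it's worth, the sidecar doesn't break the header mechanism described here. The catch is that a proxy sees the inbound request and the outbound calls as unrelated connections, so something inside the app still has to copy the trace headers from one to the other. A minimal sketch of that step, assuming the W3C Trace Context and Zipkin B3 header names plus Envoy's `x-request-id`; the helper name `propagate_trace_headers` is made up for illustration and is not part of any mesh or tracing library:

```python
# Header names from W3C Trace Context, Zipkin B3, and Envoy's request ID.
# Which ones matter depends on the tracer in use (an assumption here).
TRACE_HEADERS = (
    "traceparent",
    "tracestate",
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
)


def propagate_trace_headers(inbound_headers, outbound_headers=None):
    """Copy trace-context headers from an inbound request onto an outbound one.

    Hypothetical helper: the sidecar cannot do this, because only the
    application knows which outbound call belongs to which inbound request.
    """
    # HTTP header names are case-insensitive, so normalise before matching.
    inbound = {name.lower(): value for name, value in inbound_headers.items()}
    outbound = dict(outbound_headers or {})
    for name in TRACE_HEADERS:
        if name in inbound:
            outbound[name] = inbound[name]
    return outbound


# Service B handling a request from A, about to call C.
incoming = {
    "X-B3-TraceId": "463ac35c9f6413ad48485a3953bb6124",
    "X-B3-SpanId": "a2fb4a1d1a96d312",
    "Accept": "application/json",
}
headers_for_c = propagate_trace_headers(incoming, {"content-type": "application/json"})
```

If the app skips this step, each hop shows up as its own disconnected trace, which is the "you'll never see that end to end" problem from the parent comment.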