by alexk 1500 days ago
Sasha, CTO@Teleport here. Congrats on the launch!

RE: Teleport design

Teleport does not require a centralized proxy, because it is based on certificate authorities. You can issue a certificate with or without Teleport proxy and access any cluster that trusts that certificate directly.

Because of this design you can have a completely decentralized system, with cold storage for your CA, HSM or any parallel system issuing certificates. There is also no need to revoke your credentials, because your certs are short-lived and bound to the device and cluster, so there is less opportunity for pivot attacks.
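
To make the short-lived certificate idea concrete, here is a minimal Go sketch using only the standard library. It is not Teleport's implementation: the helper names (`newCA`, `issueClientCert`, `validAt`) and the subjects `example-ca` and `alice` are made up for illustration. It shows a CA issuing a one-hour client certificate that a verifier accepts now and rejects two hours later, without any revocation step.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newCA returns a self-signed CA certificate and its private key.
// In a real deployment the key could sit in cold storage or an HSM.
func newCA(name string) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: name},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// issueClientCert signs a client-auth certificate for user that expires after ttl.
// The client's private key is discarded here because the sketch only inspects
// the certificate; a real client would keep it.
func issueClientCert(ca *x509.Certificate, caKey *ecdsa.PrivateKey, user string, ttl time.Duration) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 120))
	if err != nil {
		return nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber: serial,
		Subject:      pkix.Name{CommonName: user},
		NotBefore:    now.Add(-time.Minute),
		NotAfter:     now.Add(ttl),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, ca, &key.PublicKey, caKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// validAt reports whether cert chains to ca and is inside its validity window at t.
func validAt(cert, ca *x509.Certificate, t time.Time) bool {
	roots := x509.NewCertPool()
	roots.AddCert(ca)
	_, err := cert.Verify(x509.VerifyOptions{
		Roots:       roots,
		CurrentTime: t,
		KeyUsages:   []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	})
	return err == nil
}

func main() {
	ca, caKey, err := newCA("example-ca")
	if err != nil {
		panic(err)
	}
	cert, err := issueClientCert(ca, caKey, "alice", time.Hour)
	if err != nil {
		panic(err)
	}
	fmt.Println("valid now:", validAt(cert, ca, time.Now()))
	fmt.Println("valid in 2h:", validAt(cert, ca, time.Now().Add(2*time.Hour)))
}
```

The verifier only needs the CA certificate, not the issuer itself, which is why the issuing side can live anywhere.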

RE: GRPC

The first version of Teleport also had an HTTP/JSON REST API, but we migrated to GRPC to support event streaming and to have one type system across multiple languages and service boundaries.

Re: Managed clusters

Teleport supports all CNCF-compatible clusters, including AKS, EKS and GKE out of the box.


Great point on GRPC having better support for event streaming! We originally built Infra to have a GRPC API, but many users we spoke to didn't yet have load balancers or ingress controllers that supported the GRPC protocol (e.g. one user had to consider upgrading their AWS Load Balancer controller to put Infra behind it).

We wanted to remove as many hurdles as possible for teams to deploy Infra in their environments. Event streaming will invariably become an important part of the API (e.g. for features like audit logs), and we'll consider GRPC again for internal components of Infra.

RE using Teleport without the proxy, how would a target cluster's Kubernetes API server (e.g. an EKS cluster) verify certificates without Teleport's proxy?

> one user had to consider upgrading their AWS Load Balancer controller to put Infra behind it

Huh?

The AWS load balancer for which gRPC is relevant is their Application Load Balancer (ALB), which would require you to terminate TLS at the ALB and does not support mutual TLS (which is how short-lifetime client certificates work in this case). To the best of my knowledge, you can't pass a client-key-encrypted gRPC session through an ALB (maybe I'm wrong?).

Typically this requires an NLB, which will treat all TCP traffic (REST and gRPC) the same, so gRPC wouldn't require an upgrade?

Re: GRPC

My bet is that you'd migrate to GRPC eventually as you scale :) I like the simplicity of an HTTPS/JSON API as well, but it just broke down for us at a certain scale.

Re: Teleport with EKS

True, CNCF clusters support mTLS out of the box, but EKS hides the endpoint and does not let you provision a CA to trust. You will have to run a Teleport proxy inside the EKS cluster to translate mTLS to EKS IAM auth. However, you don't have to have a centralized proxy; you can just deploy a Teleport proxy agent in each cluster and hide your K8s endpoint.

You also don't have to have a single Teleport proxy to do that.

Thanks! Curious, where did HTTP+JSON break down for you? Was it specifically around audit/event streaming? This would be helpful as we consider building out future updates to Infra, especially considering tools like Kubernetes have put HTTP+JSON APIs to the test (at least in their user-facing APIs).

Indeed! EKS + others don't allow custom authentication methods or allow you to use an external CA for the cluster. Running a proxy agent in each cluster makes sense and is similar to how Infra approaches it: I hadn't seen that configuration in your architecture pages!

Have you considered distributing certificates signed by the cluster CA itself (to avoid proxies altogether)? In 1.22 onwards there's a new ExpirationSeconds field when creating a certificate signing request: https://github.com/kubernetes/enhancements/issues/2784 . I imagine this will be supported by all the hosted Kubernetes services - we've been watching this closely.
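
For illustration, a request carrying a lifetime could be built roughly like this. It is a sketch using only the Go standard library: `csrManifest` and friends are local stand-ins that mirror the `certificates.k8s.io/v1` `CertificateSigningRequest` fields, not the client-go types, and the user `alice`, group `dev` and object name are hypothetical. The API requires `expirationSeconds` to be at least 600, and the signer may still issue a certificate with a different lifetime, so treat the value as a request rather than a guarantee.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/json"
	"encoding/pem"
	"fmt"
)

// Local stand-ins for the certificates.k8s.io/v1 CertificateSigningRequest object.
type csrMetadata struct {
	Name string `json:"name"`
}

type csrSpec struct {
	Request           []byte   `json:"request"` // PEM CSR; encoding/json base64-encodes []byte
	SignerName        string   `json:"signerName"`
	ExpirationSeconds int32    `json:"expirationSeconds"`
	Usages            []string `json:"usages"`
}

type csrManifest struct {
	APIVersion string      `json:"apiVersion"`
	Kind       string      `json:"kind"`
	Metadata   csrMetadata `json:"metadata"`
	Spec       csrSpec     `json:"spec"`
}

// buildCSR generates a key pair and a manifest asking the cluster CA for a
// client certificate for user (CN) in group (O) with the requested lifetime.
func buildCSR(user, group string, ttlSeconds int32) (csrManifest, *ecdsa.PrivateKey, error) {
	if ttlSeconds < 600 {
		return csrManifest{}, nil, fmt.Errorf("expirationSeconds must be >= 600, got %d", ttlSeconds)
	}
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return csrManifest{}, nil, err
	}
	der, err := x509.CreateCertificateRequest(rand.Reader, &x509.CertificateRequest{
		Subject: pkix.Name{CommonName: user, Organization: []string{group}},
	}, key)
	if err != nil {
		return csrManifest{}, nil, err
	}
	m := csrManifest{
		APIVersion: "certificates.k8s.io/v1",
		Kind:       "CertificateSigningRequest",
		Metadata:   csrMetadata{Name: user + "-short-lived"},
		Spec: csrSpec{
			Request:           pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE REQUEST", Bytes: der}),
			SignerName:        "kubernetes.io/kube-apiserver-client",
			ExpirationSeconds: ttlSeconds,
			Usages:            []string{"client auth"},
		},
	}
	return m, key, nil
}

func main() {
	m, _, err := buildCSR("alice", "dev", 3600)
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(m, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // JSON that could be submitted to the API server
}
```

The request still has to be approved before the cluster signs it, and the caller keeps the private key returned by `buildCSR`.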

this looks like a centralized proxy to me: https://goteleport.com/docs/architecture/proxy/

are you saying that because you can have multiple proxies, they aren't centralized? or that at least this is one mode you can use, but the standard one is using a proxy?

Teleport consists of a couple of components:

* Proxy is used to handle SSO, Web UI and intercept traffic for session capture. You can have one proxy for your organization, multiple proxies or, if you don't want to intercept traffic, no proxies at all.

* Auth server is used to issue certificates and send audit logs and session recordings to external systems.

* Nodes (end system agents) are sometimes helpful, but not required. For example, if you want to capture system calls in your SSH session, you can deploy a node. Or you can use OpenSSH with Teleport if you wish.

Because Teleport is based on certificate authorities, the following deployments are possible:

* One, "centralized" HA pair of proxies intercepting all your traffic (K8s, databases, web, etc). This is actually helpful for many cases, as you have just one entry point in your system to protect, vs many.

* Multiple, "decentralized" proxies in multiple datacenters. This is helpful for large organizations with many datacenters all over the world.

* No proxies at all. You can issue certificates with or without Teleport and reach your target clusters directly, as long as they trust the CA. It's a bit harder for managed K8s, but easy to do with self-hosted K8s, SSH, databases, etc. that support mTLS cert auth. This is super helpful for integrations with the larger ecosystem: any system that supports cert auth should work with Teleport out of the box.

* You can have one auth server HA pair managing a single certificate authority.

* You can have multiple, independent auth servers (teleport clusters) with certificate authorities and trust established between them.

* You can use your own CA tooling with Teleport.
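
The "no proxies" and "multiple independent CAs" options above both come down to what the target trusts. A rough Go sketch of that check, using only the standard library and not Teleport's code: `newCert` and `trustedBy` are hypothetical helpers, and `ca-a`, `ca-b` and `ca-c` are made-up authorities. A target that trusts A and B accepts client certificates from either and rejects one signed by C.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newCert creates a one-hour certificate for cn. With parent == nil it is a
// self-signed CA; otherwise it is a client certificate signed by parent.
func newCert(cn string, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 120))
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: serial,
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
	}
	signer, signerKey := parent, parentKey
	if parent == nil {
		tmpl.IsCA = true
		tmpl.BasicConstraintsValid = true
		tmpl.KeyUsage = x509.KeyUsageCertSign
		signer, signerKey = tmpl, key
	} else {
		tmpl.KeyUsage = x509.KeyUsageDigitalSignature
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth}
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// trustedBy reports whether cert chains to any of the given CAs right now.
func trustedBy(cert *x509.Certificate, cas ...*x509.Certificate) bool {
	roots := x509.NewCertPool()
	for _, ca := range cas {
		roots.AddCert(ca)
	}
	_, err := cert.Verify(x509.VerifyOptions{
		Roots:     roots,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	})
	return err == nil
}

func must(c *x509.Certificate, k *ecdsa.PrivateKey, err error) (*x509.Certificate, *ecdsa.PrivateKey) {
	if err != nil {
		panic(err)
	}
	return c, k
}

func main() {
	caA, keyA := must(newCert("ca-a", nil, nil))
	caB, keyB := must(newCert("ca-b", nil, nil))
	caC, keyC := must(newCert("ca-c", nil, nil)) // not trusted by the target

	alice, _ := must(newCert("alice", caA, keyA))
	bob, _ := must(newCert("bob", caB, keyB))
	mallory, _ := must(newCert("mallory", caC, keyC))

	fmt.Println("alice accepted:", trustedBy(alice, caA, caB))
	fmt.Println("bob accepted:", trustedBy(bob, caA, caB))
	fmt.Println("mallory accepted:", trustedBy(mallory, caA, caB))
}
```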

The way we think about Teleport is that it's a combination of a certificate authority management system, proxies (intercepting traffic and recording sessions) and nodes (for some services, like SSH, providing advanced auditing capabilities with BPF).

You can combine those components, or replace them with whatever makes sense.

does the standard deployment use a centralized proxy?

like it's your Basic Architecture in the diagram in your docs. so i feel like i'm being put on.

Sorry you feel that way!

We haven't counted, but my bet is that most smaller deployments just use the single proxy.

I also know that most larger deployments use multi-DC and multi-cluster design with independent CAs for availability and latency.

that answers my question perfectly. thank you!