by xmodem 635 days ago
> No reverse proxies required!

This is one that has always baffled me. If there's no specific reason that a reverse proxy is helpful, I will often hang an app with an embedded Jetty out on the internet without one. This has never led to any problems.

Infra or security people will see this and ask why I don't have an nginx instance in front of it. When I ask why I need one, the answers are all hand-wavy security or performance, lacking any specifics. The most specific answer I received once was slow loris, which hasn't been an issue for years.

Is reverse proxying something we've collectively decided to cargo cult, or is there some reason why it's a good idea that applies in the general case that I'm missing?


For me, a reverse proxy keeps my origin server dedicated to one purpose: serving the application. Everything else I can handle at the reverse proxy, including TLS termination, load balancing, URL rewrites, and security (WAF etc.) if needed. Separation of duties for me.

Overall, the benefit is that you can keep your origin server protected and only serve relevant traffic. Also, let's say you offer custom domains to your own customers. In that case, you could always swap out the origin server (if needed) without worrying about DNS changes for your customers, as they are pointing to the reverse proxy and not your origin server directly.

TLS should be done with proxies, yes. The Stunnel approach is Gospel.

Similarly if you start load balancing, you can put some server in the middle yes. But the ideal solution is at the DNS level I think, unless there's some serious compute going on (which a website loading a page from disk is not).

URL rewrites should not be a thing unless you have a clusterfuck, and Security is best accomplished in my experience by removing, rather than by adding.

I've worked at a place where even internal traffic that crosses machines needs to be encrypted.

So Ingress -TLS-> Container (pod).

We implemented LinkerD for this, which runs as a sidecar in the pod. Since the sidecar and the main container communicate on the same machine, this is OK.

I run many server programs on my homelab.

Each is running on a different port, but I want them all accessible publicly from different URLs and I only want to expose port 443 to the internet.

I also want to have TLS autorefresh for each domain.

I need a reverse proxy for the former and caddy does both.

If you’re running a single server and that server does TLS termination then you don’t really need a reverse proxy.
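
The "many programs on different ports, one public port 443" setup above boils down to a lookup on the Host header. A minimal sketch of that lookup, not how caddy or nginx implement it; the hostnames, ports, and the `pick_backend` name are all made up for illustration, and it assumes no IPv6 literal in the Host header:

```python
# Hypothetical routing table: hostnames and ports are illustrative only.
ROUTES = {
    "wiki.example.org": ("127.0.0.1", 8001),
    "git.example.org": ("127.0.0.1", 8002),
}


def pick_backend(host_header, routes=ROUTES):
    """Return the (address, port) to forward to, or None for unknown hosts."""
    # The Host header may include a port ("wiki.example.org:443") and any case.
    # Assumption: no IPv6 literal such as "[::1]:443".
    hostname = host_header.strip().lower().split(":", 1)[0]
    return routes.get(hostname)
```

A real proxy would then open a connection to the returned address and relay the request; unknown hosts would typically get an error response.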

Every page off of my (static HTML file!) home page[1] is actually a distinct microservice sitting behind a reverse proxy. I can throw some new experiment together, build it with whatever tooling I want, give it a port number, and let nginx route to it.

It removes a lot of friction from "I wonder if making this service is a good idea?" and because I am self hosting I am not tying myself down to any of the "all in one" hosting platforms.

[1] https://www.generativestorytelling.ai/

Microservice maximalism.
e.g. Virtual hosting as we called it in the Apache days
Virtual hosting is only similar in that it allows you to serve content based on the requested FQDN (or, indeed, destination port of the request).
You forgot the original need: share a single IPv4 among different services.

If going IPv6-only, the need for a reverse proxy is seriously lowered. You could spin multiple servers up (even on different machines), listening to 443. Have each service handle its certificate renewal, etc.

> You forgot the original need: share a single IPv4 among different services.

That "original need" is exactly what GP is talking about.

Right, indirectly (single port). I was spelling it out.
For most of my deployments, the performance impact of a reverse proxy is negligible, I have the configs pre-prepared and it allows me to add TLS termination, URL rewrites or other shenanigans without much effort in the future. So for me, it's mostly a habit that has paid out so far.
IME, using an Nginx or WAF layer lets the "ops people" make changes to the things you mention (TLS config, URL rewrites, etc.) without getting the "app people" involved. There's a bit of "Conway's Law" going on here, depending on the reporting structure and political makeup of the organization.
My answer applies to a number of types of servers that sit in front of web applications. You asked about security and performance. I’ll give you a few ways that an extra box can help in those areas.

For security, you want a strong OS with as little code as possible in your overall system. Proxy-style apps can be very simple compared to web and application servers. They can filter incoming traffic, validate the input, or even change it to something safer (or faster) to parse. They can also run on OS’s that are harder to attack: OpenBSD; GenodeOS; INTEGRITY-178B. On availability, putting load-balancing, monitoring, and recovery in these systems is often safer since app servers are more likely to crash.

On performance, the first benefit is that the simple, focused app can have a highly optimized implementation. From there, one can use hardware accelerators (CPU or PCI) to speed up compression or encryption, also called offloading. The most cost-effective setup has many commodity servers benefiting from a few high-cost servers capable of offloading. Some use load balancing to route incoming traffic to the servers best able to handle it, minimizing use of costly resources.

So, there’s a few ways that proxy-type servers can help in security and performance.

I don’t really think there is a general case for all servers.

For the minimal case you don’t need it, but in production (with a single host) it allows for rolling releases, compression, TLS, fast static file serving, potentially A/B testing capabilities.

The layer of indirection between the request and your server can be very useful.

> but in production (with a single host) it allows for rolling releases

I mean for me this is pretty much already enough of a reason to always put an rp ahead of my apps. It requires minimal setup, and most of the tools are fire and forget, so I see no real downsides. But having the ability to just point it somewhere else, or to split traffic across app replicas, is more than enough.
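
Splitting traffic across app replicas, as mentioned above, can be as simple as round-robin selection. A minimal sketch; the `RoundRobin` class and the backend addresses are hypothetical, and real proxies add health checks and weighting on top:

```python
import itertools


class RoundRobin:
    """Hand out backends in turn, wrapping around at the end of the list."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        # Each call returns the next replica in the original order.
        return next(self._cycle)
```

Usage would look like `RoundRobin(["127.0.0.1:8001", "127.0.0.1:8002"]).next_backend()` once per incoming request.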

caching -- google changed the expectations of millions
I think people do it out of habit at this time. In many cases it makes sense to handle TLS termination and compression, but in other instances it really is there for no reason.

Proxying is always less-performing than serving directly since you add another layer in between, right? Or am I missing something?

Jetty implements both TLS and compression, though in environments where I don't already have automated certificate issuance infrastructure in place I have occasionally deployed caddy as a reverse proxy just for the TLS termination.
Most web applications are not written in Java. NGINX also allows static assets to be served directly while side-stepping the application server. This is a boon for interpreted languages.
And that is a perfectly valid performance reason for adding an nginx layer in front. It does not IMO justify it in the general case however.
I agree with fny's comment, and add that most "application servers" don't bother with things like supporting sendfile(2); e.g. when hosting a Python application, you need to add something like Whitenoise, and integrate it with your application somehow; that's extra development work that is sometimes easier to throw over the fence at the sysadmin (especially since the sysadmin will usually already have that part of their job automated).

I'd also say that there is no such thing as a "general case"; I've launched and/or supported countless (must be hundreds?) of web projects and even the "simple" ones were each a bit of a snowflake.

https://man7.org/linux/man-pages/man2/sendfile.2.html

https://whitenoise.readthedocs.io/
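
For anyone who hasn't used it, the sendfile(2) path mentioned above is reachable from Python's standard library. A minimal sketch, assuming a blocking stream socket; `send_static_file` is a made-up helper name, and this is not how Whitenoise or nginx do it internally:

```python
import socket


def send_static_file(conn, path):
    """Send a file over a connected, blocking stream socket; return bytes sent."""
    with open(path, "rb") as f:
        # socket.sendfile() uses os.sendfile() where the platform supports it,
        # so the kernel copies file data to the socket without a userspace
        # buffer. It falls back to plain send() otherwise.
        return conn.sendfile(f)
```

An app server still has to build the response headers, handle range requests, and so on, which is the extra work that often gets handed to the reverse proxy instead.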

But that is the general case. Most web apps are written in interpreted languages like JavaScript which benefit from a reverse proxy. If I remember correctly, NGINX became popular because of Rails.

Maybe in Java-land it’s overused, but everywhere else it makes sense.

Something like nginx will likely perform far better at serving static content and other cacheable requests. Also allows you to run two binaries at once for a rolling update.
> likely perform far better at serving static content and other cacheable requests.

But at the cost of having a separate build step that deploys your static assets somewhere. Jetty is actually pretty fast - I've built some fairly high-volume internal apps this way.

> Also allows you to run two binaries at once for a rolling update.

You don't necessarily need an extra reverse proxy layer for this, though I will concede in some environments it's probably the easiest way to achieve it.

You don't necessarily need to deploy your static content anywhere, you can just set nginx to cache your content.

Also, most other rolling update solutions will end up being more complex than having a reverse proxy. What do you have in mind that would be simpler? NixOS?

You're missing vhosts, TLS, caching, logging, and log analysis, access control, rate limiting, custom error messages, metrics, etc.
At one job, Nginx facilitated blue-green deployments. I would spin up a 2nd app server and have Nginx cut-over to it with <1 second of downtime. If anything went wrong, the rollback plan was to only roll back the Nginx config.

I automated all that with a few scripts that included sanity checks with `nginx -t`. After the update looked good I would shut down the old app server without any time crunch. Only the Nginx config was time-sensitive.

I'm not sure if you can do that without some kind of reverse proxy as an abstraction layer. At least a TCP-level proxy.

And as everyone said, virtual hosting.
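
The cutover-with-rollback logic described above can be modeled in a few lines. This is a sketch of the idea only, not the actual scripts: `BlueGreenSwitch` and the `is_healthy` callback are hypothetical names, and the callback stands in for `nginx -t` plus whatever other sanity checks were run:

```python
class BlueGreenSwitch:
    """Track which backend the proxy points at, with a one-step rollback."""

    def __init__(self, active):
        self.active = active
        self.previous = None

    def cut_over(self, candidate, is_healthy):
        """Switch to candidate only if the sanity check passes."""
        if not is_healthy(candidate):
            return False  # keep serving from the current backend
        self.previous, self.active = self.active, candidate
        return True

    def roll_back(self):
        """Point back at the backend that was active before the last cutover."""
        if self.previous is None:
            raise RuntimeError("nothing to roll back to")
        self.active, self.previous = self.previous, None
```

The old app server stays up until someone decides to shut it down, so only the switch itself is time-sensitive.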

In theory, you can do even better with no reverse proxy: hand down the open sockets to the new version of your application, zero downtime at all. (Nothing prevents you from having a reverse proxy in front while doing that).
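
A minimal sketch of the socket hand-down idea, assuming a POSIX system: the old process keeps the listening descriptor open across a spawn, and the new process adopts it instead of binding its own. `hand_off` and `CHILD_CODE` are hypothetical names, and the child here only reports the port where a real one would start accepting connections:

```python
import socket
import subprocess
import sys

# Code run by the "new version": it adopts the inherited listening socket
# instead of binding its own, then prints the port it is now serving.
CHILD_CODE = """
import socket, sys
listener = socket.socket(fileno=int(sys.argv[1]))
print(listener.getsockname()[1])
"""


def hand_off(listener):
    """Start a new process that inherits listener; return the port it reports."""
    fd = listener.fileno()
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(fd)],
        pass_fds=[fd],  # keep this descriptor open in the child (POSIX only)
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

Because the listening socket never closes, connections arriving during the swap queue in the kernel backlog rather than being refused.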
> Is reverse proxying something we've collectively decided to cargo cult, or is there some reason why it's a good idea that applies in the general case that I'm missing?

It's a matter of risk management. On the one hand is your service that speaks http. Maybe it uses a good library for it, maybe not - but even if the library is good are we sure you used it correctly? Even if you used it correctly, has it been as thoroughly tested and proven as nginx?

On the other hand you have nginx - a deeply understood technology that has served trillions and trillions of web requests, has proven itself resilient against attacks again and again, and has been reviewed with a fine-toothed comb by security engineers for years.

So just from the starting point, your software is riskier. Even if you're the best software engineer who's ever lived, it's a higher risk profile to deploy new unproven software than the one that's been battle tested for decades.

It's also a matter of mitigation - if your software does have a vuln, are you going to notice it? Even if you do notice it, how long til you understand the problem and fix it? What to do in the time between discovery and deploying the fix? On the other hand if there's an nginx vuln, there are almost certainly juicier targets than your software to exploit first, and the bug and the fix are far more likely to be found and deployed long before someone even tries it for your site.

It's a lot easier to isolate and de-privilege your reverse proxy that needs to do nothing more than speak http/https with the outside world and some local listeners.

The url-specific web servers you're proxying tend to need a whole lot more, at least filesystem access to serve html content, at most program execution like CGIs and interpreters.

Separating these concerns makes a lot of sense, and brings little to no overhead by modern standards.

Reverse proxy allows some operational flexibility:

1) You can share multiple apps or sites with one server listening on port 443/80. 2) You can redirect to another backend on your infrastructure. 3) You can enforce certain login/SSO/restrictions. 4) You can configure all these things in one place.

Of course, if you don't need all that, then it's somewhat moot.

Amusingly, slowloris is still an issue for some Rust (hyper) based servers. There’s been some movement on it lately - and I’m typing this in a free moment, so maybe it’s finally fixed and someone can correct me - but it’s kind of lurking there, and throwing Nginx in front of e.g. an Axum deploy is still somewhat necessary.
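
For context, slowloris works by trickling request headers so each connection stays open almost indefinitely. The usual mitigation is a total time budget for receiving the headers (nginx has a setting in this spirit, `client_header_timeout`). A minimal sketch of that idea, not what hyper or nginx actually do internally; `read_headers` and its default limits are assumptions for illustration:

```python
import socket
import time


def read_headers(conn, deadline_seconds=10.0, max_bytes=65536):
    """Read until the blank line ending HTTP headers, under one total deadline."""
    deadline = time.monotonic() + deadline_seconds
    buf = b""
    while b"\r\n\r\n" not in buf:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("client took too long to send headers")
        if len(buf) >= max_bytes:
            raise ValueError("headers too large")
        # A fixed per-recv timeout is not enough: a slow client can send one
        # byte just before each timeout expires. Using the time remaining
        # enforces a budget for the whole header block.
        conn.settimeout(remaining)
        try:
            chunk = conn.recv(4096)
        except socket.timeout:
            raise TimeoutError("client took too long to send headers") from None
        if not chunk:
            raise ConnectionError("client closed before finishing headers")
        buf += chunk
    return buf
```

A client that sends complete headers promptly is unaffected; one that stalls partway is dropped once the budget runs out, which frees the connection slot.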
> I will often hang an app with an embedded Jetty out on the internet

So you are using a proxy server, just an embedded one. Most simply prefer not to bundle their application with one.

Reverse proxy is the OG sidecar. You get any number of useful features that don't need to live in your primary app, for example: TLS cert handling.
> Is reverse proxying something we've collectively decided to cargo cult

Yeah, that’s ridiculous. “Cargo culting” is when people imitate processes without understanding the underlying purpose, but reverse proxying is widely used for valid reasons, like security, load balancing, caching, SSL termination, etc. It’s not just mindless mimicry. Dismissing a best practice as “cargo culting” because you don’t understand it is lazy. Just because it’s common doesn’t mean it’s done without purpose. Worst case? You get people following a pretty good practice.

> slow loris,

Really? I am curious.

You are not talking of monkeys?